Reality check: making indoor smartphone-based augmented reality work

Through extensive real-world experiments, researchers from Osaka University have identified the main barriers hindering smartphone-based augmented reality in indoor settings and propose ways to overcome them

Nov 22, 2024 | Engineering

Smartphone-based augmented reality (AR), in which virtual elements are overlaid on the live image from a smartphone camera, powers some extremely popular apps. These apps let users preview how furniture would look in their house, navigate maps more easily, or play interactive games. The global phenomenon Pokémon GO, which encourages players to catch digital creatures through their phone, is a well-known example.

However, if you want to use augmented reality apps inside a building, prepare to lower your expectations. Current AR technologies struggle indoors, where they cannot access a clear GPS signal. But after a series of extensive and careful experiments with smartphones and users, researchers from Osaka University have determined the reasons for these problems in detail and identified a potential solution. The work was recently presented at the 30th Annual International Conference on Mobile Computing and Networking.

"To augment reality, the smartphone needs to know two things," says Shunpei Yamaguchi, the lead author of the study. "Namely, where it is, which is called localization, and how it is moving, which is called tracking."

To do this, the smartphone uses two main systems: visual sensors (the camera and LiDAR) to find landmarks such as QR codes or AprilTags in the environment, and its inertial measurement unit (IMU), a small sensor inside the phone that measures movement.
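Visual landmarks work well only when they fill enough of the camera frame. A minimal sketch (not the authors' code) of why distance matters: under the pinhole camera model, a tag's apparent size in pixels shrinks inversely with range, so a far-away QR code or AprilTag may simply be too small to detect. The 1500-pixel focal length below is an assumed, ballpark value for a smartphone camera.

```python
# Illustrative sketch: apparent pixel size of a flat landmark (QR code,
# AprilTag) viewed head-on by an idealized pinhole camera.

def apparent_size_px(tag_size_m, distance_m, focal_px):
    """Pixel width of a tag of physical width tag_size_m at distance_m,
    for a camera with focal length focal_px (in pixels)."""
    return focal_px * tag_size_m / distance_m

# A 10 cm tag with an assumed 1500 px focal length:
for d in (1, 3, 6, 10):
    print(f"{d:2d} m -> {apparent_size_px(0.10, d, 1500.0):6.1f} px wide")
```

At 10 m the tag spans only about 15 pixels, which is why detection degrades quickly with distance, and even faster at oblique viewing angles or in low light.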

To understand exactly how these systems perform, the research team set up case studies such as a virtual classroom in an empty lecture hall, asking participants to arrange virtual desks and chairs in an optimal way. In total, the team performed 113 hours of experiments and case studies across 316 patterns in a real-world environment. The aim was to isolate and examine the failure modes of AR by disabling individual sensors and changing the environment and lighting.

"We found that the virtual elements tend to 'drift' in the scene, which can lead to motion sickness and reduce the sense of reality," explains Shunsuke Saruwatari, the senior author of the study. The findings highlighted that visual landmarks can be difficult to find from far away, at extreme angles, or in dark rooms; that LiDAR doesn’t always work well; and that the IMU has errors at high and low speeds that add up over time.
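The accumulating IMU error can be illustrated with a short sketch (not the authors' code). Because position comes from double-integrating acceleration, even a small constant sensor bias grows roughly quadratically into position error, which is the mechanism behind the observed drift; the 0.05 m/s² bias below is an assumed, consumer-grade figure.

```python
# Illustrative sketch: a tiny constant accelerometer bias, double-
# integrated over time, becomes a large position error ("drift").

def dead_reckon(bias_mps2, dt, steps):
    """Integrate a constant acceleration bias twice to get position error."""
    v = 0.0  # accumulated velocity error (m/s)
    x = 0.0  # accumulated position error (m)
    for _ in range(steps):
        v += bias_mps2 * dt
        x += v * dt
    return x

# An assumed 0.05 m/s^2 bias, sampled at 100 Hz:
for seconds in (1, 5, 10):
    err = dead_reckon(0.05, 0.01, seconds * 100)
    print(f"after {seconds:2d} s: {err:.2f} m of drift")
```

After only 10 seconds the error reaches a couple of meters, which is why the IMU alone cannot anchor virtual objects and must be corrected by visual landmarks, and why objects drift when those landmarks fail.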

To address these issues, the team recommends radio-frequency localization, such as ultra-wideband (UWB) sensing, as a potential solution. UWB works similarly to WiFi or Bluetooth, and its best-known applications are the Apple AirTag and Galaxy SmartTag+. Radio-frequency localization is largely unaffected by lighting, distance, or line of sight, avoiding the difficulties of vision-based QR code or AprilTag landmarks. In the future, the researchers believe that UWB or alternative sensing modalities such as ultrasound, WiFi, BLE, or RFID could be integrated with vision-based techniques, leading to vastly improved augmented reality applications.
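The idea behind UWB localization can be sketched in a few lines (this is an illustration, not the authors' system): a UWB chip measures its distance to fixed anchors via two-way time-of-flight, and with three anchors at known positions the phone's 2D position follows from trilateration. The room dimensions and anchor layout below are assumptions for the example.

```python
# Minimal sketch: 2D trilateration from UWB-style range measurements
# to three fixed anchors with known positions.

import math

def trilaterate(anchors, ranges):
    """Solve for (x, y) by linearizing the three circle equations
    (subtracting the first from the other two gives a 2x2 linear system)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21  # nonzero if anchors are not collinear
    return ((b1 * a22 - b2 * a12) / det,
            (a11 * b2 - a21 * b1) / det)

# Anchors in the corners of an assumed 10 m x 8 m room; true position (4, 3).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 8.0)]
true_pos = (4.0, 3.0)
ranges = [math.dist(a, true_pos) for a in anchors]
print(trilaterate(anchors, ranges))  # recovers (4.0, 3.0)
```

Unlike a camera, this computation needs no light and no line of sight to a visual marker, which is exactly the property that makes radio-frequency sensing attractive indoors; real deployments must additionally handle range noise and multipath.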


Fig. 1
Instances of AR failure: (a) Over a period of a few frames, virtual objects drift in the virtual world due to tracking failure. (b, c) Virtual objects (tables and chairs) shift globally due to localization failure. In (b), the table is misaligned with the tape, and in (c), the highlighted chair is misaligned with the table.

Credit: 2024 Yamaguchi et al., Experience: Practical Challenges for Indoor AR Applications, ACM MobiCom '24


Fig. 2
End-to-end experiment: (a) The task is to set up a new classroom in an empty room. Virtual tables and chairs (red dotted boxes) are placed via AR using an iPhone. The space is visualized from different angles to ensure fire safety, accessibility, and visibility of the front for all students. (b) The responses of 17 subjects who arranged virtual furniture in the bright and dark rooms. Each red line and square shows the mean and interquartile range of the scores. Participants reported lower task completion, more drift, and more motion sickness in the dark room.

Credit: 2024 Yamaguchi et al., Experience: Practical Challenges for Indoor AR Applications, ACM MobiCom '24


Fig. 3
Testing conditions: (a) The LiDAR and monocular cameras are blocked to ensure only specific sensors are utilized. Note that the IMU is always utilized. (b) There are three patterns for the complexity of the environment. Wall: a white wall with few visual features; Shelf corner: a bookshelf with books, providing several visual features, as the red dots show; Crowded: miscellaneous clutter that further increases the visual features. (c) Different brightness levels. 7 lux is the expected brightness in a movie theater, whereas 200 lux is a well-lit office space. (d) Different movement types. The XY-stage is specially designed to conduct repeatable experiments on the floor plane.

Credit: 2024 Yamaguchi et al., Experience: Practical Challenges for Indoor AR Applications, ACM MobiCom '24

The article, “Experience: Practical Challenges for Indoor AR Applications,” was presented at the 30th Annual International Conference on Mobile Computing and Networking (ACM MobiCom '24) and is available at DOI: https://doi.org/10.1145/3636534.3690676.


Related Links

SDGs

  • 09 Industry, Innovation and Infrastructure