Diagram showing the components of the researchers’ mission planning framework. Credit: Delmerico et al.

Over the past few decades, engineers have created devices with increasingly advanced functions and capabilities. One capability that has improved substantially in recent years is known as "spatial computing."

The term spatial computing essentially refers to the ability of computers, robots, and other devices to be "aware" of their surrounding environment and to create digital representations of it. Cutting-edge technologies, such as sensors and mixed reality (MR), can significantly enhance spatial computing, enabling the creation of sophisticated sensing and mapping systems.

Researchers at the Microsoft Mixed Reality and AI Lab and ETH Zurich have recently developed a new framework that combines MR and robotics to enhance spatial computing applications. They implemented and tested this framework, introduced in a paper pre-published on arXiv, on a series of systems for human-robot interaction.

"The combination of spatial computing and egocentric sensing on mixed reality devices enables them to capture and understand human actions and translate these to actions with spatial meaning, which offers exciting new possibilities for collaboration between humans and robots," the researchers wrote in their paper. "This paper presents several human-robot systems that utilize these capabilities to enable novel robot use cases: mission planning for inspection, gesture-based control, and immersive teleoperation."

A user’s view of a spatial mesh, captured using HoloLens and overlaid on the real world. Credit: Delmerico et al.

The MR and robotics-based framework devised by this team of researchers was implemented in three systems, each with a different function. Notably, all of these systems rely on a HoloLens MR headset.

The first system is designed to plan robot missions that entail inspecting a given environment. Essentially, a human user wearing a HoloLens headset moves through the environment they wish to inspect, placing holograms shaped like waypoints that define the robot's trajectory. In addition, the user can highlight specific areas where they want the robot to collect images or data. This information is processed and translated so that it can subsequently guide the robot's movements and actions as it inspects the environment.
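Conceptually, the headset-placed waypoints and inspection flags boil down to a small, ordered data structure expressed in the shared map frame. The sketch below is a minimal, assumed representation of such a mission plan; the `Waypoint` and `MissionPlan` classes and their fields are illustrative, not the authors' actual interfaces.

```python
from dataclasses import dataclass, field
from typing import List
import json

@dataclass
class Waypoint:
    """A hologram placed by the user, expressed in the shared map frame."""
    x: float
    y: float
    z: float
    capture_images: bool = False  # the user flagged this spot for data collection

@dataclass
class MissionPlan:
    """Ordered waypoints the robot visits during an inspection run."""
    frame_id: str
    waypoints: List[Waypoint] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the plan so it can be handed off to the robot's planner."""
        return json.dumps({
            "frame_id": self.frame_id,
            "waypoints": [vars(w) for w in self.waypoints],
        })

# Example: two waypoints placed while walking through the site,
# the second one tagged for image capture.
plan = MissionPlan(frame_id="shared_map")
plan.waypoints.append(Waypoint(1.0, 0.5, 0.0))
plan.waypoints.append(Waypoint(3.2, 1.1, 0.0, capture_images=True))
print(plan.to_json())
```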

The second system proposed by the researchers is an interface that allows human users to interact with a robot more effectively, for instance by controlling its movements using hand gestures. In addition, this system enables the colocalization of different devices, including MR headsets, smartphones, and robots.

"Colocalization of devices requires that they are each able to localize themselves to a common reference coordinate system," the researchers wrote. "Through their individual poses with respect to this common coordinate frame, the relative transformation between localized devices can be computed, and subsequently used to enable new behaviors and collaboration between devices."

The first system created by the team converts the HoloLens map (above) into a 2D occupancy grid representation, with a coordinate frame aligned with that of the mesh, to enable robot localization with LiDAR. Credit: Delmerico et al.
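One simple way to perform the mesh-to-grid conversion described in the caption above is to project mesh vertices within a height band onto the ground plane and mark the corresponding grid cells as occupied. The sketch below illustrates that idea with NumPy; the resolution and height-band parameters are assumptions, and the authors' actual pipeline may differ.

```python
import numpy as np

def mesh_to_occupancy_grid(vertices: np.ndarray,
                           resolution: float = 0.05,
                           z_min: float = 0.1,
                           z_max: float = 1.8) -> np.ndarray:
    """Project mesh vertices (N x 3, in the mesh/map frame) onto the ground
    plane and mark occupied cells. Cells are `resolution` metres wide; only
    vertices inside the [z_min, z_max] height band count as obstacles."""
    band = vertices[(vertices[:, 2] >= z_min) & (vertices[:, 2] <= z_max)]
    if band.size == 0:
        return np.zeros((1, 1), dtype=np.uint8)
    origin = band[:, :2].min(axis=0)              # grid origin in map coordinates
    cells = np.floor((band[:, :2] - origin) / resolution).astype(int)
    grid = np.zeros(tuple(cells.max(axis=0) + 1), dtype=np.uint8)
    grid[cells[:, 0], cells[:, 1]] = 1            # 1 = occupied
    return grid

# Example: a handful of synthetic vertices standing in for the HoloLens mesh.
verts = np.array([[0.0, 0.0, 0.5], [0.3, 0.1, 1.0], [1.0, 1.0, 2.5]])
print(mesh_to_occupancy_grid(verts))
```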

To colocalize devices, the team introduced a framework that ensures that all devices in their systems share their positions relative to one another and to a common reference map. In addition, users can give robots navigation instructions through the HoloLens headset simply by performing a series of intuitive hand gestures.
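As an illustration of how a hand gesture can carry spatial meaning, a "go there" pointing gesture can be turned into a navigation goal by intersecting the pointing ray, expressed in the shared map frame, with the ground plane. This is an assumed interpretation for illustration, not necessarily how the authors' system resolves gestures.

```python
import numpy as np

def pointing_ray_to_goal(hand_pos: np.ndarray, hand_dir: np.ndarray) -> np.ndarray:
    """Intersect a pointing ray (origin + direction, in the shared map frame)
    with the ground plane z = 0 to obtain a navigation goal for the robot."""
    if abs(hand_dir[2]) < 1e-6:
        raise ValueError("Ray is parallel to the ground plane")
    t = -hand_pos[2] / hand_dir[2]   # ray parameter where it reaches z = 0
    if t < 0:
        raise ValueError("User is pointing away from the ground")
    return hand_pos + t * hand_dir

# Example: hand at chest height, pointing forward and slightly downward.
goal = pointing_ray_to_goal(np.array([0.0, 0.0, 1.4]),
                            np.array([1.0, 0.0, -0.5]))
print(goal)  # navigation goal on the floor, in the shared map frame
```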

Finally, the third system enables immersive teleoperation, meaning that a user can remotely control a robot while viewing its surrounding environment. This system could be particularly valuable in instances where a robot needs to navigate an environment that is inaccessible to humans.

"We explore the projection of a user's actions to a remote robot and the robot's sense of space back to the user," the researchers explained. "We consider several levels of immersion, based on touching and manipulating a model of the robot to control it, and the higher-level immersion of becoming the robot and mapping the user's motion directly to the ."

In initial tests, the three systems proposed by Jeffrey Delmerico and his colleagues at Microsoft achieved highly promising results, highlighting the potential of using MR to enhance both spatial computing and human-robot interaction. In the future, these systems could be introduced in many different settings, allowing humans to collaborate closely with robots to efficiently solve a wider range of complex real-world problems.

More information: Spatial computing and intuitive interaction: bringing mixed reality and robotics together. arXiv:2202.01493 [cs.RO]. arxiv.org/abs/2202.01493