October 27, 2023 feature

Using large language models to enable open-world, interactive and personalized robot navigation

by Ingrid Fadelli, Tech Xplore

Robots should ideally interact with users and objects in their surroundings in flexible ways, rather than always sticking to the same sets of responses and actions. One robotics approach aimed at this goal that has recently gained significant research attention is zero-shot object navigation (ZSON).

ZSON entails the development of advanced computational techniques that allow robotic agents to navigate unknown environments, interacting with previously unseen objects and responding to a wide range of prompts. While some of these techniques have achieved promising results, they often only allow robots to locate generic classes of objects, rather than using natural language processing to understand a user's prompt and locate specific objects.

A team of researchers at the University of Michigan recently set out to develop a new approach that would enhance the ability of robots to explore open-world environments and navigate them in personalized ways. Their proposed framework, introduced in a paper published on the arXiv preprint server, uses large language models (LLMs) to allow robots to better respond to requests made by users, for instance locating specific nearby objects.

"The existing works of ZSON mainly focus on following individual instructions to find generic object classes, neglecting the utilization of natural language interaction and the complexities of identifying user-specific objects," Yinpei Dai, Run Peng and their colleagues wrote in their paper. "To address these limitations, we introduce Zero-shot Interactive Personalized Object Navigation (ZIPON), where robots need to navigate to personalized goal objects while engaging in conversations with users."

In their paper, Dai, Peng and their collaborators first introduce a new task, which they dub ZIPON. This task is a generalized form of ZSON that entails accurately responding to personalized prompts and locating specific target objects.

If traditional ZSON entails locating a nearby bed or chair, ZIPON takes this one step further, asking a robot to identify a specific person's bed, a chair bought from Amazon, and so on. The researchers subsequently tried to develop a computational framework that would effectively solve this task.

"To solve ZIPON, we propose a new framework termed Open-woRld Interactive persOnalized Navigation (ORION), which uses Large Language Models (LLMs) to make sequential decisions to manipulate different modules for perception, navigation and communication," Dai, Peng and their colleagues wrote in their paper.

The new framework developed by this team of researchers has six key modules: a control, a semantic map, an open-vocabulary detection, an exploration, a memory, and an interaction module. The control module allows the robot to move around in its surroundings, the semantic map module indexes natural language, and the open-vocabulary detection module allows the robot to detect objects based on language-based descriptions.

Robots then search for objects in their surrounding environment using the exploration module, while storing important information and feedback received from users in the memory module. Finally, the interaction module allows robots to speak with users, verbally responding to their requests.
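
To make the division of labor among these modules more concrete, the following is a minimal Python sketch of a controller loop in which a decision-maker picks one module per step. It is an illustration only, not the researchers' implementation: every name in it (Memory, stub_policy, run_episode) is hypothetical, a hand-written rule stands in for the LLM, and only the interaction, detection, control and memory modules are represented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names here are hypothetical; the article does not describe ORION's API.


@dataclass
class Memory:
    """Memory module: keeps user feedback and observations as text notes."""
    notes: List[str] = field(default_factory=list)

    def store(self, note: str) -> None:
        self.notes.append(note)


def stub_policy(goal: str, memory: Memory) -> Tuple[str, str]:
    """Stand-in for the LLM: choose the next module and its argument.

    A real system would prompt a language model with the goal and the
    memory contents instead of using these fixed rules.
    """
    if not memory.notes:
        # Nothing known yet, so ask the user about the goal object.
        return "interact", f"Where is {goal}?"
    if not any(note.startswith("detected") for note in memory.notes):
        # User feedback is available; try to detect the described object.
        return "detect", goal
    # The object has been detected; move the robot toward it.
    return "control", goal


def run_episode(goal: str, user_reply: str, max_steps: int = 5) -> List[str]:
    """Run the decision loop and return the sequence of modules called."""
    memory = Memory()
    modules: Dict[str, Callable[[str], None]] = {
        "interact": lambda question: memory.store(f"user said: {user_reply}"),
        "detect": lambda description: memory.store(f"detected: {description}"),
        "control": lambda target: memory.store(f"moved to: {target}"),
    }
    trace: List[str] = []
    for _ in range(max_steps):
        action, argument = stub_policy(goal, memory)
        modules[action](argument)
        trace.append(action)
        if action == "control":
            break  # The episode ends once the robot moves to the goal.
    return trace


if __name__ == "__main__":
    print(run_episode("Alice's bed", "It is in the bedroom"))
```

In this toy version, the loop asks the user once, records the reply, runs a simulated detection and then moves. The exploration and semantic map modules are left out for brevity.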

Dai, Peng and their colleagues evaluated their proposed framework both in simulations and real-world experiments, using TIAGo, a mobile wheeled robot with two arms. Their findings were promising, as their framework successfully improved the ability of the robot to utilize user feedback when trying to locate specific nearby objects.

"Experimental results show that the performance of interactive agents that can leverage user feedback exhibits significant improvement," Dai, Peng and their colleagues explained. "However, obtaining a good balance between task completion and the efficiency of navigation and interaction remains challenging for all methods. We further provide more findings on the impact of diverse user feedback forms on the agents' performance."

While the ORION framework shows potential for improving personalized robot navigation of unknown environments, the team found it extremely challenging to simultaneously ensure that robots complete missions, smoothly navigate unknown environments and interact well with users. In the future, this study could inform the development of new models for completing the ZIPON task, which could address some of the reported shortcomings of the team's proposed framework.

"This work is only our initial step in exploring LLMs in personalized navigation and has several limitations," Dai, Peng and their colleagues wrote in their paper. "For example, it does not handle broader goal types, such as image goals, or address multi-modal interactions with users in the real world. Our future efforts will expand on these dimensions to advance the adaptability and versatility of interactive robots in the human world."

More information: Yinpei Dai et al, Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation, arXiv (2023). DOI: 10.48550/arxiv.2310.07968. arxiv.org/abs/2310.07968

Journal information: arXiv

Citation: Using large language models to enable open-world, interactive and personalized robot navigation (2023, October 27) retrieved 29 June 2024 from https://techxplore.com/news/2023-10-large-language-enable-open-world-interactive.html

