December 19, 2023 report

GPT-4 driven robot takes selfies, 'eats' popcorn

by Peter Grad, Tech Xplore

A team of researchers at the University of Tokyo has built a bridge between large language models and robots that promises more humanlike gestures while dispensing with traditional hardware-dependent controls.

Alter3 is the latest version of a humanoid robot first deployed in 2016. Researchers are now using GPT-4 to guide the robot through various simulations, such as taking a selfie, tossing a ball, eating popcorn, and playing air guitar.

Previously, such actions would have required specific coding for each activity, but incorporating GPT-4 introduces broad new capabilities to robots that learn from natural language instruction.

Robots powered by AI "have been primarily focused on facilitating basic communication between life and robots within a computer, utilizing LLMs to interpret and pretend life-like responses," the researchers said in a recent study.

"Direct control is [now] feasible by mapping the linguistic expressions of human actions onto the robot's body through program code," they said. They called the advance "a paradigm shift."

Alter3, which is capable of intricate upper body movement, including detailed facial expressions, has 43 axes simulating human musculoskeletal movement. It rests on a base but cannot walk (although it can mimic walking).

[Figure: Alter3 playing metal music. The motion was generated by GPT-4 with linguistic feedback.]

Coding the coordination of so many joints used to be a massive, highly repetitive job.

"Thanks to LLM, we are now free from the iterative labor," the authors said.

Now, they can simply provide verbal instructions describing the desired movements and deliver a prompt instructing the LLM to create Python code that runs the Android engine.
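The flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `set_axis(axis, value)` command format, the 0 to 255 value range, and the helper names `build_prompt` and `parse_commands` are hypothetical and are not taken from the paper or from Alter3's actual software. Only the axis count (43) comes from the article. No language model is called; a hard-coded reply stands in for the model's output.

```python
import re

NUM_AXES = 43  # axis count reported in the article

# Hypothetical command format; the real Alter3 interface is not described here.
COMMAND = re.compile(r"^\s*set_axis\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*$")


def build_prompt(instruction: str) -> str:
    """Wrap a natural-language motion description in a code-generation prompt."""
    return (
        f"You control a humanoid robot with {NUM_AXES} axes.\n"
        "Respond only with Python lines of the form set_axis(axis, value),\n"
        f"where axis is 1..{NUM_AXES} and value is 0..255.\n"
        f"Motion to perform: {instruction}"
    )


def parse_commands(generated: str) -> list[tuple[int, int]]:
    """Extract validated (axis, value) pairs rather than exec()-ing model output."""
    commands = []
    for line in generated.splitlines():
        match = COMMAND.match(line)
        if not match:
            continue  # skip comments, blank lines, and anything unexpected
        axis, value = int(match.group(1)), int(match.group(2))
        # Assumed limits: reject axes or values outside the illustrative ranges.
        if 1 <= axis <= NUM_AXES and 0 <= value <= 255:
            commands.append((axis, value))
    return commands


# Stand-in for a model reply; the second command names a nonexistent axis.
fake_reply = """# raise the right hand
set_axis(12, 200)
set_axis(99, 10)
set_axis(3, 128)
"""

prompt = build_prompt("Raise the right hand high, simulating a phone.")
commands = parse_commands(fake_reply)
print(commands)
```

Parsing the reply into validated commands, instead of executing it directly, is a design choice of this sketch: it keeps malformed or out-of-range output from ever reaching the hardware.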

Alter3 retains activities in memory, and researchers can refine and adjust its actions, leading to faster, smoother, and more accurate movements over time.
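One way to picture this kind of memory is a store that keeps every version of the code generated for a named motion, so a refined version can replace the current one without losing the history. The sketch below is only an illustration under that assumption: the `MotionMemory` class, its methods, and the `set_axis` strings are hypothetical and do not describe how Alter3 actually stores its motions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MotionMemory:
    """Keeps every version of the generated code for each named motion."""

    versions: dict[str, list[str]] = field(default_factory=dict)

    def store(self, name: str, code: str) -> int:
        """Save a new version and return its 1-based version number."""
        history = self.versions.setdefault(name, [])
        history.append(code)
        return len(history)

    def recall(self, name: str) -> Optional[str]:
        """Return the latest version, or None if the motion is unknown."""
        history = self.versions.get(name)
        return history[-1] if history else None


memory = MotionMemory()
memory.store("selfie", "set_axis(12, 200)")  # first attempt (hypothetical code)
memory.store("selfie", "set_axis(12, 180)")  # refined after feedback
print(memory.recall("selfie"))
```

Recalling a motion returns the most recent refinement, while earlier versions stay available for comparison.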

The authors provide an example of the natural language instructions given to Alter3 for taking a selfie:

Create a big, joyful smile and widen your eyes to show excitement.

Swiftly turn the upper body slightly to the left, adopting a dynamic posture.

Raise the right hand high, simulating a phone.

Flex the right elbow, bringing the phone closer to the face.

Tilt the head slightly to the right, giving a playful vibe.

[Figure: Alter3 pretending to be a ghost.]

Utilizing LLMs in robotics research "redefines the boundaries of human-robot collaboration, paving the way for more intelligent, adaptable, and personable robotic entities," the researchers said.

They injected a little humor into Alter3's activities. In one scenario, the robot pretends to consume a bag of popcorn only to learn it belongs to the person sitting next to it. Exaggerated facial expressions and arm gestures convey surprise and embarrassment.

The camera-equipped Alter3 can "see" humans. Researchers found that Alter3 can refine its behavior by observing human responses. They compared such learning to neonatal imitation, which child behaviorists observe in newborns.

The "zero-shot" learning capacity of GPT-4-connected robots "holds the potential to redefine the boundaries of human-robot collaboration, paving the way for more intelligent, adaptable, and personable robotic entities," the researchers said.

The paper, "From Text to Motion: Grounding GPT-4 in a Humanoid Robot 'Alter3'," written by Takahide Yoshida, Atsushi Masumori and Takashi Ikegami, is available on the preprint server arXiv.

More information: Takahide Yoshida et al, From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3", arXiv (2023). DOI: 10.48550/arxiv.2312.06571

Project page: tnoinkwms.github.io/ALTER-LLM/

Journal information: arXiv

Citation: GPT-4 driven robot takes selfies, 'eats' popcorn (2023, December 19) retrieved 30 June 2024 from https://techxplore.com/news/2023-12-gpt-driven-robot-selfies-popcorn.html

