July 12, 2018 weblog

Research community can go on Facebook AI's NYC conversation tour

by Nancy Owano, Tech Xplore

Jason Weston, who holds a doctorate in machine learning from the University of London, and Douwe Kiela, whose University of Cambridge doctoral thesis was on grounding semantics in perceptual modalities, are research scientists at Facebook Research. They have introduced the world to their formidable team's Talk the Walk.

Talk the Walk is an eye-opener for scientists interested in doing more with AI as a conversational agent. These days, researchers are not content with voice assistants that tell people when the concert starts or whether it will rain. They are exploring goal-directed dialogues.

How easy does that sound? Don't kid yourselves. Trying to get there is hard.

Fast Company turned to Kiela for reasons why the tourist guide effort has research weight. "This task is very important for AI research because it's very hard," Kiela says, "and because it combines all these interesting problems—three-hundred-sixty visual perception, map-based navigation, visual reasoning, and natural language communications via dialogue."

The researchers made the point, first off, that natural language is understandable to most people "without requiring extra steps or knowledge to decipher its meaning." Toward that end, Facebook's AI research group, FAIR, is pursuing a particular strategy for getting AI to show human-level language understanding.

That strategy, they wrote, "is to train those systems in a more natural way, by tying language to specific environments. Just as babies first learn to name what they can see and touch, this approach—sometimes referred to as embodied AI—favors learning in the context of a system's surroundings, rather than training through large data sets of text (like Wikipedia)."

Enter Talk the Walk. The team is teaching AI systems to navigate the streets of New York using natural-sounding language exchanges between a guide and a tourist. Each of the two bots has its own task. The tourist bot navigates its way through 360-degree images of New York City neighborhoods. The guide bot helps by consulting a map of the neighborhood. The team used MASC (Masked Attention for Spatial Convolutions) so that the guide bot could focus on the right place on the map.

They said their goal is "to achieve that high degree of synthetic performance through natural language interaction, and to challenge the community to do the same."

Information for Talk the Walk is on GitHub. "Sharing this work will provide other researchers with a framework to test their own embodied AI systems, particularly with respect to dialogue."

A 360-degree camera captured five neighborhoods: Hell's Kitchen, East Village, the Financial District, the Upper East Side, and Williamsburg in Brooklyn. Daniel Terdiman in Fast Company said the guide bot used a standard 2D map with generic waypoints ("bank," "coffee shop," "deli") to deliver its instructions on how to navigate.

The AI work involved is about perceiving a certain environment, navigating through it, and communicating about it. Lucas Matney in TechCrunch wrote that "In 'Talk the Walk,' the guide AI bot had all of this 2D map data and the tourist bot had all of this rich 360 visual data, but it was only through communication with each other that they were able to carry out their directives."

Tourist: Woo I found a Chipotle

Guide: Haha

Tourist: I'm diagonal from a bank

Guide: Cool.
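
To make an exchange like the one above more concrete, here is a minimal Python sketch of how a guide could narrow down a tourist's position from reported landmarks and moves. It is a toy illustration only and not the method in the paper: the 3x3 grid, the placement of landmarks, and the names GUIDE_MAP, MOVES and localize are all assumptions invented for this example. Only the landmark types ("bank," "coffee shop," "deli") are taken from the reporting above.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # (row, column) on the guide's 2D map

# Hypothetical guide map: each cell lists the landmark types found there.
GUIDE_MAP: Dict[Cell, Set[str]] = {
    (0, 0): {"bank"},        (0, 1): set(),    (0, 2): {"coffee shop"},
    (1, 0): set(),           (1, 1): {"deli"}, (1, 2): set(),
    (2, 0): {"coffee shop"}, (2, 1): {"bank"}, (2, 2): {"deli"},
}

# Hypothetical move vocabulary: direction name -> (row change, column change).
MOVES: Dict[str, Cell] = {
    "north": (-1, 0),
    "south": (1, 0),
    "east": (0, 1),
    "west": (0, -1),
}


def localize(messages: List[Tuple[Set[str], Optional[str]]]) -> List[Cell]:
    """Return the cells where the tourist could be after all messages.

    Each message is (landmarks seen at the current spot, move made afterwards);
    the move is None when the tourist stays put.
    """
    candidates = set(GUIDE_MAP)
    for seen, move in messages:
        # Keep only cells whose landmarks include everything the tourist saw.
        candidates = {c for c in candidates if seen <= GUIDE_MAP[c]}
        if move is not None:
            dr, dc = MOVES[move]
            # Shift every surviving candidate, dropping steps that leave the map.
            candidates = {
                (r + dr, c + dc)
                for r, c in candidates
                if (r + dr, c + dc) in GUIDE_MAP
            }
    return sorted(candidates)


if __name__ == "__main__":
    # "I see a bank", walks east, then "I see a deli".
    print(localize([({"bank"}, "east"), ({"deli"}, None)]))
```

In this sketch a single report of a bank leaves two possible cells, and it takes a second report after a move to settle on one. That mirrors the point made above: neither the map alone nor the street view alone is enough, and the position only emerges from the conversation.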

The paper discussing their work can be found on arXiv. It is titled "Talk the Walk: Navigating New York City through Grounded Dialogue," by Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston and Douwe Kiela.

More information: Talk the Walk: Navigating New York City through Grounded Dialogue, arXiv:1807.03367 [cs.AI] arxiv.org/abs/1807.03367

Abstract
We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception. The task involves two agents (a "guide" and a "tourist") that communicate via natural language in order to achieve a common goal: having the tourist navigate to a given target location. The task and dataset, which are described in detail, are challenging and their full solution is an open problem that we pose to the community. We (i) focus on the task of tourist localization and develop the novel Masked Attention for Spatial Convolutions (MASC) mechanism that allows for grounding tourist utterances into the guide's map, (ii) show it yields significant improvements for both emergent and natural language communication, and (iii) using this method, we establish non-trivial baselines on the full task.

Citation: Research community can go on Facebook AI's NYC conversation tour (2018, July 12) retrieved 1 July 2024 from https://techxplore.com/news/2018-07-facebook-ai-nyc-conversation.html