September 21, 2023

The Tong test: A new approach to evaluating artificial general intelligence

by Engineering

A recent perspective article published in Engineering proposes a new way to evaluate artificial general intelligence (AGI): the Tong test, where "Tong" corresponds to the pronunciation of the Chinese character for "general," as in "artificial general intelligence." The approach aims to provide a standardized, quantitative, and objective evaluation system for AGI by focusing on dynamic embodied physical and social interactions (DEPSI).

The rapid advancement of the generative pre-trained transformer (GPT) series has brought AGI to the forefront of the artificial intelligence (AI) field. However, defining and evaluating AGI remains a challenge. The Tong test offers a fresh perspective on AGI evaluation by emphasizing the importance of DEPSI as a framework.

Traditionally, AI benchmarks have been task-oriented, but the Tong test shifts the focus towards ability- and value-oriented evaluations. The virtual platform proposed in the Tong test supports embodied AI in training and testing, enabling AI agents to acquire information, learn, and fine-tune their values and abilities interactively.

The Tong test proposes five critical characteristics that can serve as AGI benchmarks: infinite tasks, self-driven task generation, value alignment, causal understanding, and embodiment. These characteristics form the basis for a systematic evaluation framework that allows AGI milestones to be delineated through a virtual environment with DEPSI.

Unlike classical AI testing systems, the Tong test provides a more comprehensive and inclusive evaluation approach. It combines a general algorithmic testing paradigm with a human–AI interaction-based testing paradigm, taking inspiration from the philosophy of the Turing test. The Tong test's virtual platform generates unlimited tasks with dynamic embodied interaction scenarios, covering various dimensions of abilities and values.

The Tong test platform incorporates essential components such as infrastructure, DEPSI environments, and evaluation tools. This combination provides a practical pathway for building an embodied platform with infinite tasks, where AI algorithms can be evaluated onsite with human interactions.

By introducing the Tong test, this perspective article paves the way for a standardized and objective evaluation system for AGI. It offers theoretical guidance for the development of AI algorithms while emphasizing the importance of DEPSI in evaluating AGI.

The authors of the perspective article believe that the Tong test has the potential to drive the field of AGI evaluation forward by promoting standardized, quantitative, and objective benchmarks. In their view, such benchmarks would not only contribute to the further development of AGI but also foster greater transparency and understanding in the AI community.

More information: Yujia Peng et al, The Tong Test: Evaluating Artificial General Intelligence Through Dynamic Embodied Physical and Social Interactions, Engineering (2023). DOI: 10.1016/j.eng.2023.07.006

Provided by Engineering

Citation: The Tong test: A new approach to evaluating artificial general intelligence (2023, September 21) retrieved 16 August 2024 from https://techxplore.com/news/2023-09-tong-approach-artificial-general-intelligence.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
