March 8, 2024

Balancing training data and human knowledge to make AI act more like a scientist

When you teach a child how to solve puzzles, you can either let them figure it out through trial and error, or you can guide them with some basic rules and tips. Similarly, incorporating rules and tips, such as the laws of physics, into AI training could make models more efficient and more reflective of the real world. However, helping the AI assess the value of different rules can be a tricky task.

Researchers report March 8 in the journal Nexus that they have developed a framework for assessing the relative value of rules and data in "informed machine learning models" that incorporate both. They showed that the framework could help the AI incorporate basic laws of the real world and better handle scientific tasks such as solving complex mathematical problems and optimizing conditions in chemistry experiments.

"Embedding human knowledge into AI models has the potential to improve their efficiency and ability to make inferences, but the question is how to balance the influence of data and knowledge," says first author Hao Xu of Peking University. "Our framework can be employed to evaluate different knowledge and rules to enhance the predictive capability of deep learning models."

Generative AI models like ChatGPT and Sora are purely data-driven—the models are given training data, and they teach themselves via trial and error. However, with only data to work from, these systems have no way to learn physical laws, such as gravity or fluid dynamics, and they also struggle to perform in situations that differ from their training data.

An alternative approach is informed machine learning, in which researchers provide the model with some underlying rules to help guide its training process. However, little is known about the relative importance of rules vs. data in driving model accuracy.
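To make the idea of guiding training with a rule concrete, here is a minimal sketch and not the researchers' implementation. It fits a straight line to noisy data while penalizing violations of one hypothetical rule (the line must pass through the origin). The function name `fit_informed`, the rule, the toy data, and the `rule_weight` parameter are all assumptions made for illustration.

```python
import numpy as np

def fit_informed(X, y, R, r, rule_weight):
    """Minimize ||X w - y||^2 + rule_weight * ||R w - r||^2 in closed form."""
    A = X.T @ X + rule_weight * (R.T @ R)
    b = X.T @ y + rule_weight * (R.T @ r)
    return np.linalg.solve(A, b)

# Toy data (illustrative only): noisy samples of a line through the origin.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=8)
X = np.column_stack([x, np.ones_like(x)])  # columns: slope, intercept
y = 2.0 * x + rng.normal(0.0, 0.5, size=8)

# Hypothetical rule encoded as R @ w = r: the intercept must be zero.
R = np.array([[0.0, 1.0]])
r = np.array([0.0])

w_data_only = fit_informed(X, y, R, r, rule_weight=0.0)   # data alone
w_informed = fit_informed(X, y, R, r, rule_weight=100.0)  # data plus rule

print("intercept, data only:", w_data_only[1])
print("intercept, with rule:", w_informed[1])
```

Raising `rule_weight` pulls the fit toward satisfying the rule, while lowering it lets the data dominate. How to set that balance is the kind of question the framework described in this article is meant to address.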

"We are trying to teach AI models the laws of physics so that they can be more reflective of the real world, which would make them more useful in science and engineering," says senior author Yuntian Chen of the Eastern Institute of Technology, Ningbo.

To improve the performance of informed machine learning, the team developed a framework to calculate the contribution of an individual rule to a given model's predictive accuracy. The researchers also examined interactions between different rules because most informed machine learning models incorporate multiple rules, and having too many rules can cause models to collapse.
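One standard way to attribute a contribution to each rule while accounting for interactions is the Shapley value from cooperative game theory. This article does not say whether the paper uses exactly this calculation, so the following is only a sketch under that assumption. The rule names and the accuracy scores in `scores` are made up for illustration and are not results from the study.

```python
from itertools import combinations
from math import factorial

def shapley_values(rules, value):
    """Average marginal gain of each rule over all subsets of the other rules."""
    n = len(rules)
    result = {}
    for rule in rules:
        others = [r for r in rules if r != rule]
        total = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k that exclude `rule`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {rule}) - value(s))
        result[rule] = total
    return result

# Made-up accuracy for a model trained with each subset of rules A, B, C.
# A and B are largely redundant; C only helps when A is also present.
scores = {
    frozenset(): 0.50,
    frozenset({"A"}): 0.70,
    frozenset({"B"}): 0.70,
    frozenset({"C"}): 0.50,
    frozenset({"A", "B"}): 0.72,
    frozenset({"A", "C"}): 0.85,
    frozenset({"B", "C"}): 0.70,
    frozenset({"A", "B", "C"}): 0.86,
}

contrib = shapley_values(["A", "B", "C"], scores.__getitem__)
for name, phi in contrib.items():
    print(name, round(phi, 4))
```

The contributions add up to the total gain from using all rules over using none, which makes it possible to compare rules on a common scale. Note that this brute-force version evaluates every subset, so it is practical only for a small number of rules.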

This allowed them to optimize models by tweaking the relative influence of different rules and to filter out redundant or interfering rules entirely. They also identified some rules that worked synergistically and other rules that were completely dependent on the presence of other rules.

"We found that the rules have different kinds of relationships, and we use these relationships to make model training faster and get higher accuracy," says Chen.

The researchers say that their framework has broad practical applications in engineering, physics, and chemistry. In the paper, they demonstrated the method's potential by using it to optimize machine learning models that solve multivariate equations and models that predict the results of thin-layer chromatography experiments, predictions that can then be used to optimize future experimental conditions.

Next, the researchers plan to develop their framework into a plugin tool that can be used by AI developers. Ultimately, they also want to train their models so that the models can extract knowledge and rules directly from data, rather than having rules selected by human researchers.

"We want to make it a closed loop by making the model into a real AI scientist," says Chen. "We are working to develop a model that can directly extract knowledge from the data and then use this knowledge to create rules and improve itself."

More information: Worth of Prior Knowledge for Enhancing Deep Learning, Nexus (2024). DOI: 10.1016/j.ynexs.2024.100003. www.cell.com/nexus/fulltext/S2950-1601(24)00001-9

Provided by Cell Press

Citation: Balancing training data and human knowledge to make AI act more like a scientist (2024, March 8) retrieved 17 July 2024 from https://techxplore.com/news/2024-03-human-knowledge-ai-scientist.html

