August 30, 2021

Researchers offer standards for studies using machine learning

Researchers in the life sciences who use machine learning for their studies should adopt standards that allow other researchers to reproduce their results, according to a comment article published today in the journal Nature Methods.

The authors explain that the standards are key to advancing scientific knowledge and ensuring that research findings are reproducible from one group of scientists to the next. The standards would allow other groups of scientists to focus on the next breakthrough rather than spending time reinventing the wheel built by the authors of the original study.

Casey S. Greene, Ph.D., director of the University of Colorado School of Medicine's Center for Health AI, is a corresponding author of the article, which he co-authored with first author Benjamin J. Heil, a member of Greene's research team, and researchers from the United States, Canada, and Europe.

"Ultimately all science requires trust—no scientist can reproduce the results from every paper they read," Greene and his co-authors write. "The question, then, is how to ensure that machine-learning analyses in the life sciences can be trusted."

Greene and his co-authors outline standards to qualify for one of three levels of accessibility: bronze, silver, and gold. Each standard sets minimum levels for sharing study materials so that other life science researchers can trust the work and, if warranted, validate it and build on it.

To qualify for the bronze standard, life science researchers would need to make their data, code, and models publicly available. In machine learning, computers learn from training data, and having access to that data enables scientists to look for problems that can confound the process. The code tells future researchers how the computer was told to carry out the steps of the work.

In machine learning, the resulting model is critically important. For future researchers, access to the original research team's model is essential for understanding how it relates to the data it is supposed to analyze. Without that access, other researchers cannot determine biases that might influence the work. For example, it can be difficult to determine whether an algorithm favors one group of people over another.

"Being unable to examine a model also makes trusting it difficult," the authors write.

The silver standard calls for the data, models, and code provided at the bronze level, and adds more information about the system in which to run the code. For the next scientists, that information makes it theoretically possible to duplicate the training process.
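As an illustration of what "information about the system" might look like in practice, here is a minimal Python sketch that records the interpreter version, operating system, and installed packages. It is not taken from the Nature Methods article; the function name `describe_environment` and the choice of fields are assumptions made for this example.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect basic facts about the system an analysis ran on.

    Hypothetical helper: the fields chosen here are an assumption,
    not a list prescribed by the article.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the description as JSON so it can be shared next to the code.
    json.dump(describe_environment(), sys.stdout, indent=2)
```

Saving this output alongside the shared code would give later researchers a starting point for rebuilding a comparable environment.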

To qualify for the gold standard, researchers must add an "easy button" to their work to make it possible for future researchers to reproduce the previous analysis with a single command. The original researchers must automate all steps of their analysis so that "the burden of reproducing their work is as small as possible." For the next scientists, this information makes it practically possible to duplicate the training process and either adapt or extend it.
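A rough sketch of such an "easy button" is a single entry-point script that runs every step in order with a fixed random seed. The example below is an assumption for illustration only: the toy data, the trivial "model" (a sample mean), and all function names are hypothetical stand-ins, not the authors' pipeline.

```python
import random
import statistics


def load_data(seed):
    # Stand-in for fetching the shared dataset (hypothetical toy data).
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(100)]


def train_model(data):
    # Stand-in for model training: the "model" is just the sample mean.
    return {"mean": statistics.fmean(data)}


def evaluate(model, data):
    # Mean squared error of predicting the mean for every point.
    return statistics.fmean((x - model["mean"]) ** 2 for x in data)


def run_all(seed=0):
    """Run every step of the analysis in order, with a fixed seed."""
    data = load_data(seed)
    model = train_model(data)
    return {"model": model, "mse": evaluate(model, data)}


if __name__ == "__main__":
    # One command reproduces the whole analysis from start to finish.
    print(run_all())
```

If the script were saved under a hypothetical name such as `run_analysis.py`, the single command would be `python run_analysis.py`. Because the seed is fixed, repeated runs on the same setup return the same result.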

Greene and his co-authors also offer recommendations for documenting the steps and sharing them.

The Nature Methods article is an important contribution to the continuing refinement of the use of machine learning and other data-analysis methods in health sciences and other fields where trust is particularly important. Greene is one of several leaders recently recruited by the CU School of Medicine to establish a program in developing and applying robust data science methodologies to advance biomedical research, education, and clinical care.

More information: Benjamin J. Heil et al., Reproducibility standards for machine learning in the life sciences, Nature Methods (2021). DOI: 10.1038/s41592-021-01256-7

Journal information: Nature Methods

Provided by CU Anschutz Medical Campus

Citation: Researchers offer standards for studies using machine learning (2021, August 30) retrieved 29 June 2024 from https://techxplore.com/news/2021-08-standards-machine.html
