February 25, 2019

Recognizing disease using less data

by Alexandra George, Carnegie Mellon University, Department of Civil and Environmental Engineering

As artificial intelligence systems learn to better recognize and classify images, they are becoming highly reliable at diagnosing diseases, such as skin cancer, from medical images. But as good as they are at detecting patterns, AI systems won't be replacing your doctor any time soon. Even when used as a tool, image recognition systems still require an expert to label the data, and a lot of data at that: they need images of both healthy patients and sick patients. The algorithm finds patterns in the training data, and when it receives new data, it uses what it has learned to classify the new image.

One challenge is that it's time-consuming and costly for an expert to obtain and label each image. To address this issue, a group of researchers from Carnegie Mellon University's College of Engineering, including Professors Hae Young Noh and Asim Smailagic, teamed up to develop an active learning technique that uses a limited data set to achieve a high degree of accuracy in diagnosing diseases like diabetic retinopathy or skin cancer.

The researchers' model begins with a set of unlabeled images and decides how many of them must be labeled to produce a robust and accurate set of training data. It first chooses a random initial set to label. Once that data is labeled, the model plots it over a distribution, because the images will vary by age, gender, physical properties, and so on. To make a good decision based on this data, the samples need to cover a large part of the distribution space. The system then decides what new data should be added to the dataset, considering the current distribution of the data.
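
The selection loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: each image is assumed to be already reduced to a feature vector, and the names `coverage_gap` and `grow_training_set`, along with the coverage metric itself (mean distance from each image to its nearest labeled image), are hypothetical stand-ins for whatever optimality measure the system actually uses.

```python
import numpy as np

def coverage_gap(pool, labeled_idx):
    """Mean distance from each pool point to its nearest labeled point.

    Lower values mean the labeled set covers the pool more evenly.
    This metric is an assumption made for illustration.
    """
    diffs = pool[:, None, :] - pool[labeled_idx][None, :, :]
    dists = np.linalg.norm(diffs, axis=2)      # shape: (n_pool, n_labeled)
    return dists.min(axis=1).mean()

def grow_training_set(pool, n_initial, n_total, seed=0):
    """Start from a random labeled set, then greedily add the sample
    that most improves coverage until n_total samples are selected."""
    rng = np.random.default_rng(seed)
    labeled = rng.choice(len(pool), size=n_initial, replace=False).tolist()
    while len(labeled) < n_total:
        candidates = [i for i in range(len(pool)) if i not in labeled]
        # Evaluate the metric as if each candidate had been added.
        best = min(candidates, key=lambda i: coverage_gap(pool, labeled + [i]))
        labeled.append(best)
    return labeled

# Toy usage: 40 synthetic 2-D "feature vectors" standing in for images.
pool = np.random.default_rng(1).normal(size=(40, 2))
chosen = grow_training_set(pool, n_initial=5, n_total=10)
print(coverage_gap(pool, chosen[:5]), coverage_gap(pool, chosen))
```

In this toy version the stopping rule is a fixed budget (`n_total`); a stopping rule based on how good the distribution has become would replace that condition.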

"The system measures how optimal this distribution is," said Noh, an associate professor of civil and environmental engineering, "and then computes metrics when a certain set of new data is added to it, and selects the new dataset that maximizes its optimality."

The process is repeated until the dataset has a good enough distribution to be used as the training set. Their method, called MedAL (for medical active learning), achieved 80% accuracy in detecting diabetic retinopathy using only 425 labeled images, a 32% reduction in the number of required labeled examples compared to the standard uncertainty sampling technique and a 40% reduction compared to random sampling.
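
As a back-of-the-envelope check on those percentages, the snippet below computes the number of labeled images each baseline would need if "reduction" is read as relative to the baseline's own count. The resulting figures are implied by the arithmetic only; they are not reported in the article, and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(labeled_used, reduction):
    """Baseline label count implied by a relative reduction.

    Assumes labeled_used = baseline * (1 - reduction).
    """
    return labeled_used / (1.0 - reduction)

medal_images = 425  # labeled images reported for MedAL
for name, reduction in [("uncertainty sampling", 0.32),
                        ("random sampling", 0.40)]:
    baseline = implied_baseline(medal_images, reduction)
    print(f"{name}: about {baseline:.0f} labeled images")
# uncertainty sampling: about 625 labeled images
# random sampling: about 708 labeled images
```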

They also tested the model on images of other diseases, including skin cancer and breast cancer, to show that it could apply to a variety of medical images. The method is generalizable, since its focus is on how to use data strategically rather than on finding a specific pattern or feature for a disease. It could also be applied to other problems that use deep learning but have data constraints.

"Our active learning approach combines predictive entropy-based uncertainty sampling and a distance function on a learned feature space to optimize the selection of unlabeled samples," said Smailagic, a research professor in Carnegie Mellon's Engineering Research Accelerator. "The method overcomes the limitations of the traditional approaches by efficiently selecting only the images that provide the most information about the overall data distribution, reducing computation cost and increasing both speed and accuracy."

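One way to read the combination Smailagic describes is as a two-stage filter: shortlist the unlabeled images whose predicted class probabilities have the highest entropy, then keep the shortlisted images that lie farthest, on average, from the already labeled images in feature space. The sketch below follows that reading. It is an assumption-laden illustration rather than the published algorithm: the function names, the Euclidean distance, and the parameters `n_uncertain` and `n_select` are all hypothetical.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Entropy of each row of class probabilities (higher = less certain)."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_samples(probs, unlabeled_feats, labeled_feats, n_uncertain, n_select):
    """Return indices of unlabeled samples to send for labeling.

    Stage 1: keep the n_uncertain samples with the highest entropy.
    Stage 2: among those, keep the n_select samples with the largest
             average distance to the labeled set in feature space.
    """
    entropy = predictive_entropy(probs)
    shortlist = np.argsort(entropy)[::-1][:n_uncertain]
    diffs = unlabeled_feats[shortlist][:, None, :] - labeled_feats[None, :, :]
    avg_dist = np.linalg.norm(diffs, axis=2).mean(axis=1)
    order = np.argsort(avg_dist)[::-1][:n_select]
    return shortlist[order]

# Toy usage: four unlabeled images, two classes, 2-D features.
probs = np.array([[0.50, 0.50],   # uncertain, close to the labeled image
                  [0.99, 0.01],   # confident, far away
                  [0.50, 0.50],   # uncertain, far away
                  [0.90, 0.10]])  # fairly confident
feats = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 5.0], [1.0, 1.0]])
labeled_feats = np.array([[0.0, 0.0]])

picked = select_samples(probs, feats, labeled_feats, n_uncertain=2, n_select=1)
print(picked.tolist())  # [2]
```

In the toy data, image 1 is the farthest from the labeled set but is never considered because the model is already confident about it; image 2 wins because it is both uncertain and distant.
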
The team included civil and environmental engineering Ph.D. students Mostafa Mirshekari, Jonathon Fagert, and Susu Xu, and electrical and computer engineering master's students Devesh Walawalkar and Kartik Khandelwal. They presented their findings at the 2018 IEEE International Conference on Machine Learning and Applications in December, where they received a Best Paper Award for their novel work.

More information: Asim Smailagic et al. MedAL: Accurate and Robust Deep Active Learning for Medical Image Analysis, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) (2019). DOI: 10.1109/ICMLA.2018.00078

Provided by Carnegie Mellon University, Department of Civil and Environmental Engineering

Citation: Recognizing disease using less data (2019, February 25) retrieved 4 July 2024 from https://techxplore.com/news/2019-02-disease_1.html

