July 13, 2023

Google AI health chatbot passes US medical exam: study

Bard — Credit: Unsplash/CC0 Public Domain

Google's artificial intelligence-powered medical chatbot has achieved a passing grade on a tough US medical licensing exam, but its answers still fall short of those from human doctors, a peer-reviewed study said on Wednesday.

Last year the release of ChatGPT—whose developer OpenAI is backed by Google's rival Microsoft—kicked off a race between tech giants in the burgeoning field of AI.

While much has been made of the future possibilities—and dangers—of AI, health is one area where the technology has already shown tangible progress, with algorithms able to read certain medical scans as well as humans.

Google first unveiled its AI tool for answering medical questions, called Med-PaLM, in a preprint study in December. Unlike ChatGPT, it has not been released to the public.

The US tech giant says Med-PaLM is the first large language model, an AI technique trained on vast amounts of human-produced text, to pass the US Medical Licensing Examination (USMLE).

A passing grade for the exam, which is taken by medical students and physicians-in-training in the United States, is around 60 percent.

In February, a study said that ChatGPT had achieved passing or near-passing results.

In a peer-reviewed study published in the journal Nature on Wednesday, Google researchers said that Med-PaLM had achieved 67.6 percent on USMLE-style multiple choice questions.

"Med-PaLM performs encouragingly, but remains inferior to clinicians," the study said.

To identify and cut down on "hallucinations"—the name for when AI models offer up false information—Google said it had developed a new evaluation benchmark.

Karan Singhal, a Google researcher and lead author of the new study, told AFP that the team has used the benchmark to test a newer version of their model with "super exciting" results.

Med-PaLM 2 has reached 86.5 percent on the USMLE, topping the previous version by nearly 20 percentage points, according to a preprint study released in May that has not been peer-reviewed.

'Elephant in the room'

James Davenport, a computer scientist at the UK's University of Bath not involved in the research, said "there is an elephant in the room" for these AI-powered medical chatbots.

There is a big difference between answering "medical questions and actual medicine," which includes diagnosing and treating genuine health problems, he said.

Anthony Cohn, an AI expert at the UK's Leeds University, said that hallucinations would likely always be a problem for such large language models, because of their statistical nature.

Therefore these models "should always be regarded as assistants rather than the final decision makers," Cohn said.

Singhal said that in the future Med-PaLM could be used to support doctors by offering up alternatives that may not have been considered otherwise.

The Wall Street Journal reported earlier this week that Med-PaLM 2 has been in testing at the prestigious US Mayo Clinic research hospital since April.

Singhal said he could not speak about specific partnerships.

But he emphasized that any testing would not be "clinical, or patient facing, or are able to cause patients harm."

It would instead be for "more administrative tasks that can be relatively easily automated, with low stakes," he added.

Citation: Google AI health chatbot passes US medical exam: study (2023, July 13) retrieved 16 August 2024 from https://techxplore.com/news/2023-07-google-ai-health-chatbot-medical.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
