Artificial intelligence: ARC test focus goes beyond factoid questions

"Common sense" is a phrase everyone hears at one time or another, usually from an angry bystander who thinks you don't have any. What is "common sense"?

"Humans use common sense to fill in the gaps of any question they are posed, delivering answers within an understood but non-explicit context," Swapna Krishna wrote in Engadget.

Give a young child a few years of developmental growth and he or she acquires common sense, but AI still struggles with it. Calling out this challenge in AI research is Dr. Oren Etzioni, a researcher and professor who leads the Allen Institute for Artificial Intelligence, or AI2, in Seattle, Washington.

To get at the fluidity that people have, their natural ability to move from one thing to the next, the programs need what every ten-year-old has in spades, he said, and that is called common sense: a set of facts, heuristics and observations, all the things that we can bring to the table but the computer cannot. "Here at the Allen Institute for Artificial Intelligence, Paul Allen has tasked us with the goal of going after this problem."

They really are. The researchers have reportedly come up with a new test as part of their push to imbue AI systems with such an understanding of the world.

The test is called ARC, which stands for AI2 Reasoning Challenge. The researchers described it in a paper, "Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge," by Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord.

Will Knight in MIT Technology Review explained that the test "will pose elementary-school-level multiple-choice science questions. Each question will require some understanding of how the world works."

The AI2 site said the questions were assembled to encourage research in advanced question-answering.

Knight quoted Gary Marcus, a professor at NYU. "I think this is a great antidote to the kind of superficial benchmarks that have become so common in the field of machine learning," he said. "It should really force AI researchers to up their game."

The authors in the paper said, "Can your model perform better? We pose ARC as a challenge to the community."

Common sense is generally regarded as the holy grail for artificial intelligence.

The authors wrote in their paper, "Datasets have become highly influential in driving the direction of research. Recent datasets for QA have led to impressive advances, but have focused on factoid questions where surface-level cues alone are sufficient to find an answer, discouraging progress on questions requiring reasoning or other advanced methods."

That is where their ARC comes in, to help the field move to more difficult tasks.

"We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering," said the authors in their paper, which is on arXiv.

The questions are multiple choice. Here's one: "Which item below is not made from a material grown in nature?" The possible answers are a cotton shirt, a wooden chair, a plastic spoon and a grass basket. The answer taps into a common-sense picture of the world and, said Knight, "It is this common sense that the AI behind voice assistants, chatbots, and translation software lacks. And it's one reason they are so easily confused."

What contribution might this test make to the field? "If machine learning can successfully pass the Arc Reasoning Challenge, it would mean that the system has a grasp of the common sense that no AI currently possesses," wrote Krishna. "It would be a huge step forward."

More information: Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge (PDF)