August 14, 2014 weblog
Move over, Turing Test. Winograd Schema Challenge in town
Isn't there something better than the Turing test to measure computer intelligence? Is the Turing Test the best we have to judge a machine's capability to produce behavior that requires human thought? Many expressed doubt when it was announced that the program Eugene Goostman had fooled 33 percent of judges into thinking the chatbot was a human after five minutes of questioning. Anders Sandberg, a University of Oxford research fellow, said in The Conversation that "Eugene's success in the Turing test may tell us more about how weak we humans are when it comes to detecting intelligence and agency in conversation than about how smart our machines are." Gary Marcus, a professor of cognitive science at New York University, said that "It turns out that you can do a lot of misdirection, answer sarcastically, and evade the fact that you are a computer. So all it really shows is you can fool humans for a short period of time, about five minutes - not all of the humans, but maybe more than you might've expected - by having these sort of personality twitches."
Now there is an explicit effort to replace the Turing Test, called the Winograd Schema Challenge. It is to be a yearly event designed to judge whether a computer program truly models human intelligence. The deadline is October 1, 2015, and $25,000 will be awarded to the program that passes the test.
Who or what is the "Winograd" in the contest title? According to I Programmer, the test elaborates ideas from Terry Winograd, known for developing an AI-based framework for understanding natural language. This Winograd test was developed by Hector Levesque, a professor of computer science at the University of Toronto, who won the 2013 IJCAI (International Joint Conference on Artificial Intelligence) Award for Research Excellence.
The challenge is sponsored by Nuance Communications, a voice and language solutions company, in cooperation with Commonsense Reasoning; the latter, as its name suggests, is a research group focused on commonsense reasoning, and it will administer and evaluate the Winograd Schema Challenge. Contest details are on the group's site. "Rather than base the test on the sort of short free-form conversation suggested by the Turing Test," said the site posting, "the Winograd Schema Challenge (WSC) poses a set of multiple-choice questions that have a particular form." Sample questions are provided.
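To give a feel for what a multiple-choice question of this "particular form" might look like, here is a minimal Python sketch. It uses a widely cited illustration of a Winograd schema (a trophy that does not fit in a suitcase), which is not necessarily one of the sample questions on the contest site. The names SchemaQuestion, make_schema, score and first_candidate_baseline are hypothetical and chosen only for this illustration; the official contest format and scoring rules may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SchemaQuestion:
    """One multiple-choice question: which candidate does the pronoun refer to?"""
    sentence: str
    pronoun: str
    candidates: tuple[str, str]
    answer: str  # the candidate the pronoun actually refers to


def make_schema(template: str, pronoun: str, candidates: tuple[str, str],
                variants: dict[str, str]) -> list[SchemaQuestion]:
    """Build the twin questions of a schema.

    `variants` maps each "special word" to the correct referent; swapping
    the word flips the answer, which is what makes the question hard to
    solve with surface statistics alone.
    """
    return [
        SchemaQuestion(template.format(word=word), pronoun, candidates, answer)
        for word, answer in variants.items()
    ]


def score(questions: list[SchemaQuestion],
          resolver: Callable[[SchemaQuestion], str]) -> float:
    """Fraction of questions for which the resolver picks the right referent."""
    correct = sum(resolver(q) == q.answer for q in questions)
    return correct / len(questions)


def first_candidate_baseline(q: SchemaQuestion) -> str:
    """A deliberately naive resolver (an assumption for illustration only)."""
    return q.candidates[0]


questions = make_schema(
    "The trophy doesn't fit in the brown suitcase because it's too {word}.",
    pronoun="it",
    candidates=("the trophy", "the suitcase"),
    variants={"big": "the trophy", "small": "the suitcase"},
)
accuracy = score(questions, first_candidate_baseline)
print(accuracy)  # 0.5
```

Because the two variants of a schema have opposite answers, a program that always picks the same candidate gets exactly half of them right, which is no better than guessing. A person, by contrast, resolves both without effort.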
Charles Ortiz, research scientist at Nuance, said the benefits of such a challenge "can help guide more systematic research efforts that will, in the process, allow us to realize new systems that push the boundaries of current AI capabilities and lead to smarter personal assistants and intelligent systems."
© 2014 Tech Xplore