December 8, 2019 weblog
Hanabi: Facebook AI steps up to cooperative gameplay
Facebook's AI researchers have taken on the card game Hanabi, and it is quite a challenge: play is not a question of one opponent beating another. Hanabi is a "cooperative" card game in which teammates help one another.
Jonathan Vanian of Fortune walked readers through the game by way of introduction:
"...teams of two to five players are given random cards of different colors and numbers that represent points. The goal for teams is to lay the cards on a table, grouped by color, in the correct numerical order. The problem, however, is that players cannot see their own cards while their teammates can. A player can give hints to another, like making a remark about a certain color, that would tip the other off to do something like playing or discarding a card. The dilemma is that the player must deduce what their teammate's clue means."
Corporate leaders have taken to Hanabi as a team-building exercise; it has now won the attention of AI researchers thinking about how to build standout AI systems.
"Getting near perfect scores on an obscure French card game is great and all but Facebook has bigger plans for its cooperative AI," said Engadget.
Facebook researcher Adam Lerer was quoted in Engadget: "What we're looking at is artificial agents that can reason better about cooperative interactions with humans and chatbots that can reason about why the person they're chatting with said the thing they did...Chatbots that can reason better about why people say the things they do without having to enumerate every detail of what they're asking for is a very straightforward application of this type of search technique."
What AI strategies did the researchers put to work?
Vanian identified a search technique previously used by DeepMind; it let multiple Hanabi bots evaluate multiple playing options while sharing information with each other. Combined with reinforcement learning, the technique let the Facebook bots learn to play Hanabi with one another.
The Hanabi benchmark is set out in a paper by Nolan Bard et al., available on arXiv and published in the journal Artificial Intelligence. In "The Hanabi Challenge: A New Frontier for AI Research," the authors said they took on Hanabi as a "challenge domain with novel problems that arise from its combination of purely cooperative gameplay and imperfect information in a two to five player setting."
The authors remarked that Hanabi is best described as a type of team Solitaire. The game's imperfect information arises because players cannot see their own cards (the ones they hold and can act on), each of which has a color and a rank.
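For readers who think in code, the asymmetry described above can be sketched in a few lines of Python. This is only an illustrative toy, not the Hanabi Learning Environment's API: the names `Card`, `observe`, `is_playable` and `color_hint`, the single-letter color labels, and the two-card hands are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed labels for five colors; the article only says cards have a color and a rank.
COLORS = ["R", "Y", "G", "W", "B"]


@dataclass(frozen=True)
class Card:
    color: str
    rank: int


def observe(hands: List[List[Card]], player: int) -> List[List[Optional[Card]]]:
    """Return every hand as seen by `player`; the player's own cards are hidden (None)."""
    return [
        [None] * len(hand) if seat == player else list(hand)
        for seat, hand in enumerate(hands)
    ]


def is_playable(card: Card, stacks: Dict[str, int]) -> bool:
    """A card extends its color's stack only if it is the next number in order."""
    return card.rank == stacks[card.color] + 1


def color_hint(hands: List[List[Card]], target: int, color: str) -> List[int]:
    """Positions in the target's hand that a color hint would point at."""
    return [i for i, card in enumerate(hands[target]) if card.color == color]


# Hypothetical two-player deal with two cards each (real hands are larger).
hands = [
    [Card("R", 1), Card("G", 3)],
    [Card("B", 2), Card("R", 2)],
]
stacks = {color: 0 for color in COLORS}  # nothing has been played yet

view = observe(hands, player=0)
print(view[0])                       # player 0 sees only placeholders for their own hand
print(view[1])                       # but sees the teammate's cards in full
print(color_hint(hands, 1, "R"))     # which of the teammate's cards a "red" hint marks
print(is_playable(hands[0][0], stacks))
```

Player 0 holds a playable red 1 but cannot know it from `view` alone; only a teammate's hint, and a correct reading of what that hint implies, can reveal it. That deduction step is what the article means by the dilemma of interpreting a clue.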
To make research results reproducible, the authors released an open source Hanabi RL environment, the Hanabi Learning Environment, written in Python and C++.
On the related subject of reproducible research, Jerome Pesenti, vice president of AI at Facebook, spoke in a recent Q&A with Will Knight in Wired.
Knight asked Pesenti about recreating groundbreaking research.
"It's something that Facebook AI is very passionate about," said Pesenti. "When people do things that are not reproducible, it creates a lot of challenges. If you cannot reproduce it, it's a lot of lost investment... The beauty of AI is that it is ultimately systems run by computers. So it is a prime candidate, as a subfield of science, to be reproducible. We believe the future of AI will be something where it's reproducible almost by default. We try to open source most of the code we are producing in AI, so that other people can build on top of it."
The authors, in their paper, have a section with the crosshead "Hanabi: The Benchmark."
This research effort is about using Hanabi as a challenging benchmark problem for AI. Unique properties distinguish it from other benchmarks. "It is a multi-agent learning problem, unlike, for example, the Arcade Learning Environment. It is also an imperfect information game, where players have asymmetric knowledge about the environment state, which makes the game more like poker than chess, backgammon, or Go."
Andrew Tarantola in Engadget picked up on this point. Life in the real world isn't a zero sum game like poker or Starcraft, he said, "and we need AI to work with us, not against us."
Two Engadget readers who commented were unimpressed by what has been achieved thus far. "Pretty sure having knowledge of how humans usually play a single card game and general knowledge of human intentions are two very different things," said one. Another said that "identifying patterns of action is a far cry from theory of mind...You could argue if they are attempting to attribute theory of mind, their accuracy needs work."
More information: Nolan Bard et al. The Hanabi challenge: A new frontier for AI research, Artificial Intelligence (2019). DOI: 10.1016/j.artint.2019.103216
The Hanabi Challenge: A New Frontier for AI Research, arXiv:1902.00506 [cs.LG] arxiv.org/abs/1902.00506
© 2019 Science X Network