April 29, 2021
Siri, meet grandma: Building a voice assistant for older adults
Computer scientists at the University of California San Diego received an Amazon Research Award to develop a voice assistant to better communicate with older adults. Their initial goal is to create a system capable of understanding and answering the medical questions of adults over age 65.
Jokes about older adults struggling with voice assistants like Alexa or Siri are common, but the problem is real: data show that adults over 65 (a demographic expected to double between 2010 and 2050) tend to give up on these tools after several unsuccessful attempts at getting their questions answered.
The problem is that existing natural language processing (NLP) systems, the artificial intelligence models that enable computers to understand spoken and written human language, are trained to understand short, formal questions. This is fine for people who grew up with computer technology and know how to phrase their questions to be understood by the device. But older adults are used to speaking to people, not machines, and often struggle to pare down longer or conversational questions to a sentence that the artificial intelligence underpinning the voice assistants can understand.
"Our job is to bridge that gap," said Khalil Mrini, the UC San Diego computer science Ph.D. student who will be supported by the Amazon Research Award to work on the project. "We're working on technology that will flip the burden: instead of having older adults change or reformulate their question, we want the AI to be able to learn and understand the older adult."
To accomplish this, Mrini, a graduate student in UC San Diego computer science Professor Ndapa Nakashole's lab, is working to create an end-to-end question answering system that can: 1) shorten the user's question to its key components; 2) match the shortened medical question to a frequently asked question from a database of 17,000 medical questions sourced from the National Institutes of Health; and 3) select the relevant portions of the corresponding longer answer to share with the user.
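To make the three stages concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is an illustration only, not the team's system: the real stages are trained NLP models, while this sketch substitutes simple word-overlap heuristics. The function names, the stopword list, and the two FAQ entries are all hypothetical placeholders and are not drawn from the NIH database.

```python
import re
from dataclasses import dataclass

# Hypothetical stopword list, used only by this toy example.
STOPWORDS = {"a", "an", "the", "i", "my", "is", "are", "of", "to",
             "should", "how", "what", "do", "can", "it", "and", "about"}


@dataclass
class FaqEntry:
    question: str
    answer: str


def split_sentences(text):
    """Split text after '.', '?' or '!' followed by whitespace."""
    return re.split(r"(?<=[.?!])\s+", text.strip())


def tokenize(text):
    """Lowercase word set with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def shorten_question(long_question):
    """Stage 1 stand-in: keep the last sentence that ends in '?'.
    (The article describes a trained summarization model here.)"""
    questions = [s for s in split_sentences(long_question) if s.endswith("?")]
    return questions[-1] if questions else long_question.strip()


def jaccard(a, b):
    """Overlap between two word sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_faq(short_question, faqs):
    """Stage 2 stand-in: pick the FAQ whose question shares the most words."""
    q = tokenize(short_question)
    return max(faqs, key=lambda f: jaccard(q, tokenize(f.question)))


def select_answer(short_question, answer, max_sentences=1):
    """Stage 3 stand-in: return the answer sentences closest to the question."""
    q = tokenize(short_question)
    ranked = sorted(split_sentences(answer),
                    key=lambda s: len(q & tokenize(s)), reverse=True)
    return " ".join(ranked[:max_sentences])


def answer_question(long_question, faqs):
    """Run the three stages in order."""
    short = shorten_question(long_question)
    faq = match_faq(short, faqs)
    return select_answer(short, faq.answer)


# Made-up placeholder entries for demonstration only.
DEMO_FAQS = [
    FaqEntry("How often should I get a flu shot?",
             "Flu vaccines are updated each season. "
             "Most adults are advised to get a flu shot once a year. "
             "Ask your doctor about timing."),
    FaqEntry("What are common symptoms of dehydration?",
             "Dehydration can cause thirst and dizziness. "
             "Drinking fluids regularly helps prevent it."),
]

if __name__ == "__main__":
    rambling = ("My grandson told me I should ask about this. "
                "I had my flu shot last winter and I feel fine. "
                "How often should I get a flu shot?")
    print(answer_question(rambling, DEMO_FAQS))
```

Keeping each stage behind its own function mirrors the modular design the article describes, in which the question-shortening step could also be used on its own in front of a different question answering system.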
Mrini has already completed the first step: training the AI to summarize a long question into a shorter one. He did this by training the NLP model jointly on two tasks: summarizing questions, and a classification task called question entailment, in which the model learns to tell whether answering a short question also addresses the initial longer question.
"We found that if you're training on both summarization—which is basically feeding longer user questions and the model learns to generate the short question—and at the same time training on question entailment, that classification task, then you're able to generate better results," said Mrini.
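One common way to train a single model on two tasks at once is to add their losses into one objective. The sketch below shows that idea in plain Python. It is an assumption about the general shape of such a setup, not the team's published objective: the article says only that the two tasks were trained jointly, and the function names and the `entail_weight` parameter are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)


def summarization_loss(token_probs):
    """Mean negative log-likelihood of the reference short question.
    token_probs: probability the model gave to each correct summary token."""
    return -sum(math.log(max(p, EPS)) for p in token_probs) / len(token_probs)


def entailment_loss(p_entailed, label):
    """Binary cross-entropy for the question entailment classifier.
    label is 1 if answering the short question addresses the long one, else 0."""
    p = min(max(p_entailed, EPS), 1.0 - EPS)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def joint_loss(token_probs, p_entailed, label, entail_weight=1.0):
    """Weighted sum of the two task losses (the weighting is an assumption)."""
    return (summarization_loss(token_probs)
            + entail_weight * entailment_loss(p_entailed, label))


if __name__ == "__main__":
    # Toy numbers: a fairly confident summary and a confident, correct entailment call.
    print(joint_loss([0.9, 0.8, 0.95], p_entailed=0.85, label=1))
```

Because both terms feed the same objective, lowering the joint loss pushes the model to improve on both tasks at once, which is consistent with Mrini's observation that the combination produced better results than summarization alone.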
This question summarization AI model can be added as a standalone feature to existing voice assistants like Alexa, or integrated as the first step of any question answering system, where it can shorten user questions.
The team is still working on getting the tool to match the question with one in a pool of FAQs—in this case provided by the NIH—and then have the AI select the relevant pieces of a longer answer to read back to the user.
"The potential impact of this work is substantial because it aims to broaden the population of people who can benefit from conversational agents, by targeting an important segment of the population, older adults," said Nakashole.
Making AI more inclusive
Why does this segment of the population need an add-on feature after the fact? Why were their needs not considered in initial voice assistant testing? It's a problem Mrini says is pervasive in AI systems and training, and one that he and researchers on the VOLI team are trying to rectify.
"Among users of existing question answering systems, older adults are underrepresented," he said. "It stems from a lack of representation among internet users, which escalates into not having a lot of data about this particular demographic and so on. You have a problem that escalates into a lack of data, and if there's no data, you can't train an algorithm to meet their specific needs."
Mrini is intimately familiar with this problem. As a native of Morocco, he noticed that virtually no AI models worked for his native language, Darija, a form of Arabic spoken in Morocco.
"I realized there were—and still are—close to no NLP models or AI models that work well for my native language. So my first ever project was making data so that AI can be trained for my native language," Mrini said.
He's interned at Adobe Research and the Alexa group at Amazon, working to make AI models that can predict the syntax of a sentence in an interpretable way, and summarize the important takeaways of longer articles, respectively.
"The global goal of my research is to make human language technology more interpretable and more accessible to wider audiences," Mrini said. "Now I'm mainly working on English-language technology, but to make it more accessible for marginalized or underrepresented audiences, so older adults in this case."