August 6, 2016 weblog
Using a deep neural network approach to identify sarcasm
(Tech Xplore)—We easily read the smirks between the lines.
What could possibly go wrong. I just love driving to the airport in rush hour. Don't quit your day job. What shocking news. Yeah, I'm sure he's excited.
We can tell what is sarcastic, but AI has had trouble ferreting out sarcasm, and that has been a problem.
'Don't quit your day job' might be taken as career counseling, but machine learning has a new spot-the-sarcasm friend that can do a better job of judging whether something is sarcastic. Silvio Amir at the University of Lisbon, Portugal, and colleagues have been looking into an approach based on Twitter.
Edd Gent in New Scientist reports that the team has trained a machine learning system to identify sarcasm on Twitter by looking at a user's past tweets.
Gent reported that the system predicts sarcasm with an accuracy of 87 per cent, which he said is slightly better than existing approaches.
Accuracy rate aside, a distinguishing feature of their research is that it does away with the need to gather a lot of external information. "The key innovation is realizing you can build a model of the user merely based on what they have said in the past," Amir said in the New Scientist article.
Over to their paper on arXiv: Amir and colleagues talk about how their approach stands out from past attempts: "Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert with lexical signals to recognize sarcasm." They also said their approach did not need "elaborate feature engineering (and concomitant data scraping)."
TechCrunch said their paper described "a method by which the neural network finds the user's 'embeddings'—i.e., contextual cues like the content of previous tweets, related interests and accounts, and so on. It uses these various factors to plot the user with others, and (ideally) finds that they form relatively well-defined groups.... If the sentiment of the tweet seems to disagree with the bulk of what is expressed by similar users, there's a good chance sarcasm is being employed."
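The heuristic in that description can be illustrated with a toy sketch. This is not the authors' model: the two-dimensional user vectors, the sentiment scores, the neighbor count `k` and the `threshold` are all hypothetical values chosen only to show the "disagrees with similar users" idea.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_users(target, embeddings, k=2):
    """Return the k users whose vectors are most similar to the target's."""
    others = [u for u in embeddings if u != target]
    others.sort(key=lambda u: cosine(embeddings[target], embeddings[u]),
                reverse=True)
    return others[:k]

def looks_sarcastic(user, tweet_sentiment, embeddings, typical_sentiment,
                    k=2, threshold=1.0):
    """Flag a tweet whose sentiment is far from what similar users express."""
    neighbors = nearest_users(user, embeddings, k)
    expected = sum(typical_sentiment[u] for u in neighbors) / len(neighbors)
    return abs(tweet_sentiment - expected) > threshold

# Hypothetical user embeddings (assumed, not learned from real tweets).
EMBEDDINGS = {
    "ana": [0.9, 0.1],
    "ben": [0.8, 0.2],
    "cy": [0.85, 0.15],
    "dee": [0.1, 0.9],
}
# Hypothetical typical sentiment toward rush-hour traffic,
# from -1 (negative) to +1 (positive).
TYPICAL = {"ana": -0.8, "ben": -0.9, "cy": -0.7, "dee": 0.6}

# "I just love driving to the airport in rush hour" scored as +0.9.
gushing = looks_sarcastic("ana", 0.9, EMBEDDINGS, TYPICAL)
# A plainly negative complaint scored as -0.6.
grumbling = looks_sarcastic("ana", -0.6, EMBEDDINGS, TYPICAL)
```

Here the gushing tweet is flagged because users close to "ana" are usually negative about traffic, while the grumbling tweet is not. A real system would learn the embeddings and the decision rule from data rather than use a fixed threshold.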
Of what use is research focused on AI systems that can cope with sarcasm?
"Mining people's comments on social media is big business," wrote Gent. "Advertisers track people's attitudes and moods, companies and governments follow public opinion."
Devin Coldewey in TechCrunch talked about the benefits. "You can't do accurate sentiment analysis, for instance, if you don't know when someone is kidding around when they say they love or hate something. And knowing the difference between an affirmative 'great!' and a sarcastically disappointed one is important for natural language processing."
What's next: TechCrunch said the paper about their work is to be presented at a natural language learning conference.
"Modelling Context with User Embeddings for Sarcasm Detection in Social Media" is authored by Silvio Amir and others from the University of Lisbon and University of Texas at Austin.
Research on sarcasm recognition had made headlines before, with a report about scientists working on a system capable of recognizing instances of sarcasm on Twitter.
"Contextualized Sarcasm Detection on Twitter" by David Bamman and Noah Smith described their effort to understand the effect of "extra-linguistic information" on the detection of sarcasm. Who are the speakers? Who is the audience? Context rules. Will Knight at MIT Technology Review wrote that "Previous efforts to automatically recognize sarcasm in text relied entirely on linguistic cues. What's interesting here is that the researchers tried to include some wider context, such as who the author was and what they were tweeting about. And they found it to be noticeably better than existing approaches, correctly guessing 85 percent of the time if a post was sarcastic."
We introduce a deep neural network for automated sarcasm detection. Recent work has emphasized the need for models to capitalize on contextual features, beyond lexical and syntactic cues present in utterances. For example, different speakers will tend to employ sarcasm regarding different subjects and, thus, sarcasm detection models ought to encode such speaker information. Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert with lexical signals to recognize sarcasm. Our approach does not require elaborate feature engineering (and concomitant data scraping); fitting user embeddings requires only the text from their previous posts. The experimental results show that our model outperforms a state-of-the-art approach leveraging an extensive set of carefully crafted features.
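To make "user embeddings used in concert with lexical signals" concrete, here is a minimal sketch under loud assumptions. It replaces the paper's deep network with a single logistic unit, and the word vectors, user vectors and weights are hand-picked hypothetical values, not learned ones. It only shows how the same text can score differently depending on its author.

```python
import math

def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sarcasm_probability(tokens, user_vec, word_vecs, weights, bias=0.0):
    """Score a post from its words plus its author's embedding."""
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    dim = len(next(iter(word_vecs.values())))
    # Lexical signal: mean word vector (zeros if no token is known).
    lexical = average(known) if known else [0.0] * dim
    # Concatenate the lexical and user signals into one feature vector.
    features = lexical + list(user_vec)
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing to (0, 1)

# Hypothetical values for illustration only.
WORD_VECS = {"love": [1.0, 0.0], "traffic": [0.0, 1.0]}
WEIGHTS = [1.0, 1.0, 2.0, -2.0]   # two lexical weights, two user weights
HABITUAL_JOKER = [1.0, 0.0]       # assumed embedding of a sarcasm-prone user
EARNEST_USER = [0.0, 1.0]         # assumed embedding of a literal-minded user

tokens = ["i", "love", "traffic"]
p_joker = sarcasm_probability(tokens, HABITUAL_JOKER, WORD_VECS, WEIGHTS)
p_earnest = sarcasm_probability(tokens, EARNEST_USER, WORD_VECS, WEIGHTS)
```

With these made-up numbers the identical post scores above 0.5 for the sarcasm-prone user and below 0.5 for the earnest one, which is the kind of speaker dependence the abstract argues a model should capture.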
© 2016 Tech Xplore