Transforming the future of media with artificial intelligence
With the ability to analyze large datasets to identify patterns and predict outcomes, all at the click of a button, artificial intelligence (AI) is revolutionizing how we live and work. From offering personalized recommendations to automating tedious tasks, AI can help us make better decisions, work smarter and reduce the likelihood of errors.
Chatbots powered by AI, such as ChatGPT, have transformed the media landscape. They can now have human-like conversations, generate content and analyze emotions from text—abilities once thought to be uniquely human.
Given the sheer number of social media posts and the volume of information on the Internet, AI's ability to decode emotions from words could be a game-changer for applications such as sentiment analysis in media monitoring and blocking malicious content.
However, AI is still not as effective as humans at recognizing emotions from text. Reading the emotional tone of written words requires an understanding of the world and of social norms, which humans acquire through experience and which AI lacks.
An AI platform, SenticNet, has been devised to address the challenges AI faces when making sense of human languages. Developed by Prof Erik Cambria from NTU's School of Computer Science and Engineering (SCSE), SenticNet improves the algorithm's ability to analyze emotions by integrating human learning modes with the traditional learning approaches that machines use.
SenticNet follows a logical process to infer the sentiments expressed in a sentence by categorizing word meanings in a framework resembling commonsense reasoning. Unlike conventional sentiment analysis models, which are often 'black boxes' that give no insight into their internal reasoning, SenticNet derives its results through transparent processes, and those results are reproducible and reliable.
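To make the idea of a transparent sentiment process concrete, here is a minimal sketch of a scorer that returns its evidence along with its answer. It is not SenticNet's algorithm or API: the lexicon `CONCEPT_POLARITY`, its values, the negation rule and the function `explain_sentiment` are all hypothetical, invented for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical toy lexicon: concept -> polarity in [-1, 1].
# The values are invented for illustration and are not taken from SenticNet.
CONCEPT_POLARITY = {
    "celebrate": 0.8,
    "birthday": 0.6,
    "lose": -0.7,
    "great": 0.9,
    "terrible": -0.9,
}

# Assumed, deliberately simple negation rule: a negation word flips
# the polarity of the next known concept.
NEGATIONS = {"not", "never", "no"}


@dataclass
class Evidence:
    token: str
    polarity: float
    negated: bool


def explain_sentiment(sentence: str) -> Tuple[float, List[Evidence]]:
    """Return the mean polarity plus the per-token evidence behind it."""
    evidence: List[Evidence] = []
    negate_next = False
    for raw in sentence.lower().split():
        token = raw.strip(".,!?;:")
        if token in NEGATIONS:
            negate_next = True
            continue
        if token in CONCEPT_POLARITY:
            polarity = CONCEPT_POLARITY[token]
            if negate_next:
                polarity = -polarity
            evidence.append(Evidence(token, polarity, negate_next))
            negate_next = False
    if not evidence:
        return 0.0, evidence
    score = sum(e.polarity for e in evidence) / len(evidence)
    return score, evidence


if __name__ == "__main__":
    score, trace = explain_sentiment("This is not a great day.")
    print(score)
    for item in trace:
        print(item)
```

Because every score comes with the list of concepts that produced it, a reader can check why a sentence was rated as it was, which is the property the 'black box' comparison is about. A real system would need far richer handling of context than this word-by-word lookup.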
"AI systems are becoming less and less transparent, and we hope SenticNet will be able to extract sentiments from text in an explainable manner without compromising performance," said Prof Cambria.
The researchers have demonstrated that combining commonsense reasoning with machine-learning approaches improves performance. When tested, SenticNet outperformed other machine-learning models.
The latest version of SenticNet was reported in the Proceedings of the 13th Conference on Language Resources and Evaluation in 2022.
Prof Cambria is also working on improving SenticNet's ability to encode and decode the meaning behind abstract concepts—a major challenge for AI systems as they do not possess the rich sensory experiences that humans have of the real world.
With moving visuals and sound, videos are an engaging way to convey messages and teach concepts. To allow users to better engage with video content for education and entertainment, a method developed by Assoc Prof Sun Aixin at SCSE makes video content searchable by matching keywords with on-screen images.
Conventional computer vision techniques can do this, but they become less efficient when searching for images in long videos.
Assoc Prof Sun and his colleagues developed an algorithm that treats a video as a text passage so that people can search for specific moments in the clip. Using the method, a long video can thus be split into multiple shorter clips for searching.
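The splitting step can be sketched as follows. This is only an illustration under stated assumptions, not the published method: the function names, the overlap between neighbouring clips and the per-frame relevance scores are hypothetical, and in practice such scores would come from a model that compares the query text with the video.

```python
from typing import List, Tuple


def split_into_clips(num_frames: int, clip_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, covering the whole video."""
    if clip_len <= 0 or not 0 <= overlap < clip_len:
        raise ValueError("need clip_len > 0 and 0 <= overlap < clip_len")
    stride = clip_len - overlap
    clips = []
    start = 0
    while start < num_frames:
        end = min(start + clip_len, num_frames)
        clips.append((start, end))
        if end == num_frames:
            break
        start += stride
    return clips


def best_clip(frame_scores: List[float], clip_len: int, overlap: int) -> Tuple[int, int]:
    """Pick the clip whose frames have the highest mean relevance score.

    frame_scores is an assumed input: one query-relevance value per frame.
    Raises ValueError if frame_scores is empty.
    """
    clips = split_into_clips(len(frame_scores), clip_len, overlap)
    return max(clips, key=lambda c: sum(frame_scores[c[0]:c[1]]) / (c[1] - c[0]))


if __name__ == "__main__":
    scores = [0.0, 0.0, 0.0, 0.0, 0.1, 0.9, 0.8, 0.2, 0.0, 0.0]
    print(split_into_clips(len(scores), clip_len=4, overlap=2))
    print(best_clip(scores, clip_len=4, overlap=2))
```

The overlap is a design choice made here so that a moment straddling a clip boundary still falls wholly inside at least one clip. Each clip's frame range is kept in whole-video coordinates, so a hit in a short clip maps directly back to a position in the long video.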
"This simple and effective strategy enables images in long videos to be searched efficiently, addressing the issue of performance degradation commonly encountered by conventional computer vision techniques when searching long videos," said Assoc Prof Sun.
The findings were published in IEEE Transactions on Pattern Analysis and Machine Intelligence in 2022.
The researchers are currently working to enhance the algorithm's search accuracy and explore its usage on visual content in medical education and surveillance videos.
Detecting fake images
Like any new technology, AI is a double-edged sword. Unfortunately, new threats, such as fake images designed to fool or scam audiences, have surfaced alongside the advancement of AI tools.
For instance, facial manipulation technologies can create photorealistic faces that may be used nefariously to mislead people.
Work led by Asst Prof Liu Ziwei at SCSE has resulted in an algorithm called Seq-DeepFake, which flags doctored images by recognizing digital fingerprints left by facial manipulation.
Unlike conventional deepfake detection methods, which only predict whether an image is real or fake, Seq-DeepFake detects the traces left by each manipulation in sequence. The alterations are detected within seconds.
The algorithm can also recover the original face from the manipulated face by reversing the manipulation sequence.
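Why the order of the detected manipulations matters for recovery can be shown with a toy model. The sketch below is not Seq-DeepFake, which works on images with a learned model. Here a "face" is just a dictionary of numeric attributes, and the edit types `add` and `scale`, along with every name used, are hypothetical.

```python
from typing import Dict, List, Tuple

Face = Dict[str, float]
Edit = Tuple[str, str, float]  # (attribute, operation, value); operation is "add" or "scale"


def apply_edits(face: Face, edits: List[Edit]) -> Face:
    """Apply each edit in order and return the manipulated copy."""
    out = dict(face)
    for attr, op, value in edits:
        if op == "add":
            out[attr] += value
        elif op == "scale":
            out[attr] *= value
        else:
            raise ValueError(f"unknown operation: {op}")
    return out


def recover(face: Face, detected: List[Edit]) -> Face:
    """Undo the detected edits by applying their inverses in reverse order."""
    out = dict(face)
    for attr, op, value in reversed(detected):
        if op == "add":
            out[attr] -= value
        elif op == "scale":
            if value == 0:
                raise ValueError("scaling by zero cannot be undone")
            out[attr] /= value
        else:
            raise ValueError(f"unknown operation: {op}")
    return out


if __name__ == "__main__":
    original = {"smile": 1.0, "age": 4.0}
    edits = [("smile", "scale", 2.0), ("smile", "add", 0.5), ("age", "add", -1.0)]
    edited = apply_edits(original, edits)
    print(edited)
    print(recover(edited, edits))                  # matches the original
    print(recover(edited, list(reversed(edits))))  # wrong order: "smile" is off
```

Because scaling and adding do not commute, undoing the same steps in the wrong order gives a different result. In this toy setting, a detector that reports only "fake" could not support recovery, while one that reports the ordered sequence can.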
"Seq-DeepFake is a powerful tool that can potentially help everyone from government organizations to individual users verify the authenticity of visual information in the digital age to combat misinformation," said Asst Prof Liu.
In the future, Asst Prof Liu plans to expand the capabilities of Seq-DeepFake to detect other forms of doctored media such as text and videos.