The complexity of artificial intelligence
Artificial Intelligence, or AI, makes us look better in selfies, obediently tells us the weather when we ask Alexa for it, and powers self-driving cars. It is the technology that enables machines to learn from experience and perform human-like tasks.
As a whole, AI contains many subfields, including natural language processing, computer vision, and deep learning. Most of the time, the specific technology at work is machine learning, which focuses on the development of algorithms that analyze data and make predictions, and relies heavily on human supervision.
SMU Assistant Professor of Information Systems, Sun Qianru, likens training a small-scale AI model to teaching a young kid to recognize objects in his surroundings. "At first a kid doesn't understand many things around him. He might see an apple but doesn't recognize it as an apple and he might ask, 'Is this a banana?' His parents will correct him, 'No, this is not a banana. This is an apple.' Such feedback in his brain then signals to fine-tune his knowledge."
Professor Sun's research focuses on deep convolutional neural networks, meta-learning, incremental learning, semi-supervised learning, and their applications in recognizing images and videos.
Training an AI model
Because of the complexity of AI, Professor Sun ventures into general concepts and current trends in the field before diving into her research projects.
She explains that supervised machine learning involves a model training itself on a labeled data set. That is, the data is labeled with the information the model is being built to determine, and may even be classified in the same ways the model is supposed to classify data. For example, a computer vision model designed to identify an apple might be trained on a data set of various labeled apple images.
"Give it data, and the data has labels," she explains. "An image could contain an apple, and the image goes through the deep AI model and makes some predictions. If the prediction is right, then it's fine. Otherwise, the model will get computational loss or penalty to backpropagate through to modify its parameters. And so the model gets updated."
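The loop Professor Sun describes (predict, compare with the label, penalize, update) can be sketched in a few lines of Python. This is only a minimal illustration, not her model: a single-layer logistic classifier stands in for a deep network, and the two features (redness and elongation), the toy data, and the function names are all assumptions made for the example. In a real deep model the same error signal would be backpropagated through many layers rather than one.

```python
import math
import random

def predict(weights, bias, features):
    """Return the model's estimated probability that the example is an apple."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, learning_rate=0.5, seed=0):
    """Fit a one-layer logistic model by gradient descent on labeled examples."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            p = predict(weights, bias, features)
            # Gradient of the cross-entropy loss with respect to the score:
            # zero when the prediction matches the label, large when it is wrong.
            error = p - label
            # Pass the penalty back to each parameter and nudge it.
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
            bias -= learning_rate * error
    return weights, bias

# Hypothetical toy data: (redness, elongation) -> 1 for apple, 0 for banana.
LABELED_DATA = [
    ((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.7, 0.15), 1),
    ((0.2, 0.9), 0), ((0.1, 0.8), 0), ((0.3, 0.95), 0),
]

weights, bias = train(LABELED_DATA)
```

After training, `predict(weights, bias, (0.85, 0.1))` gives a probability above 0.5 for a red, round example, while a pale, elongated one falls below 0.5.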
Currently, the state-of-the-art, or best-performing, AI models are almost all based on deep learning, Professor Sun observes. In deep learning, the model learns to perform recognition tasks from images, text, or sound using deep neural network architectures that contain many layers. If the input is an image, for example, the assumption is that the image can be described by features at different spatial scales, or layers.
Professor Sun illustrates: "Take my face, for example. The features that distinguish me from other people are my eyes, my nose, my mouth as local features, and my face shape and skin color as global features. For identification, I can use these features to say, 'This is me.'" A machine model encodes such local and global features in its different layers and can thus perform the same identification.
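To make the local-versus-global distinction concrete, here is a small sketch of the two operations that convolutional networks typically combine: a sliding filter that responds to a local pattern, and a pooling step that summarizes the whole image. The 4x4 image, the edge filter, and the function names are hypothetical, and a real network would learn many such filters across many layers rather than use one hand-written kernel.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over the image (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped, so
    strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def global_average(feature_map):
    """Collapse a feature map into one number describing the whole image."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to a dark-to-bright vertical edge.
VERTICAL_EDGE = [[-1, 1], [-1, 1]]

local_features = relu(convolve2d(IMAGE, VERTICAL_EDGE))  # where the edge is
global_feature = global_average(local_features)          # how much edge overall
```

Here `local_features` is nonzero only in the middle column, where the edge sits, while `global_feature` is a single value summarizing the entire image.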
Training AI models requires a lot of data for accurate recognition. If an AI model has only one image of a person's face, it will make mistakes recognizing that person because it has not seen the other facial features that distinguish that person from others, she argues. "Appearances have differences and AI depends on a highly diverse data set in order to learn all the differences of the image."
Health Promotion Board app
One of the projects that Professor Sun is working on is Food AI++, an app for Singapore's Health Promotion Board (HPB). Users are able to determine food composition data simply by taking pictures of the food they are eating with their phones. The aim of the app is to help users track the nutrition of the food they consume and use the information to achieve a healthy, well-balanced diet.
Professor Sun and her team collect the images that users take of their meals and upload to the app. They have observed that food images are very noisy and diverse, reflecting different cultures.
"Chinese and Malays in Singapore, for example, have different eating habits, food styles, and different categories of food," she clarifies. "When we train a model, we begin with a limited list of categories, but for the food app we found that we had to expand the categories all the time in the Application Programming Interface, or API. We have to constantly modify and update the data set. The rich cultural diversity in Singapore is one of the biggest challenges in this project."
Besides collecting more diverse data, the team is also working on domain adaptation learning algorithms. Different cultures represent different domains, so the team has to think about how to quickly adapt its pre-trained models to them by leveraging effective learning algorithms. To do this for food images, they need to develop food-specific domain adaptation algorithms. They also need to think about including food knowledge to improve the overall efficiency of multi-domain models.
"We want to do this adaptation by using a small data set in the new domain," Professor Sun says. "It's a challenging task, and it would benefit Singaporean users from different cultures."
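One common way to adapt a pre-trained model with only a handful of new examples is to freeze most of the network and retrain just its final layer. The sketch below illustrates that general idea only; it is not the food-specific algorithm the team is developing, and the backbone weights, the source-domain head, and the four new-domain examples are all hypothetical.

```python
import math

# Stand-in for a pre-trained backbone; its weights stay frozen during adaptation.
FROZEN_BACKBONE = [[1.0, -1.0], [0.5, 0.5]]

def extract_features(pixels):
    """Apply the frozen linear layer followed by a ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, pixels)))
            for row in FROZEN_BACKBONE]

def head_predict(head, pixels):
    """Score an input with the frozen backbone plus a classification head."""
    weights, bias = head
    feats = extract_features(pixels)
    score = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def adapt_head(head, new_domain_examples, epochs=300, learning_rate=0.5):
    """Fine-tune only the head on a few labeled examples from the new domain."""
    weights, bias = list(head[0]), head[1]
    for _ in range(epochs):
        for pixels, label in new_domain_examples:
            feats = extract_features(pixels)
            error = head_predict((weights, bias), pixels) - label
            # Only the head's parameters are updated; the backbone is untouched.
            for i, f in enumerate(feats):
                weights[i] -= learning_rate * error * f
            bias -= learning_rate * error
    return weights, bias

# Hypothetical head as it was trained on the original (source) domain.
SOURCE_HEAD = ([2.0, 0.0], -1.0)
# Hypothetical small labeled set from the new domain: just four examples.
NEW_DOMAIN = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1),
    ((0.2, 0.1), 0), ((0.1, 0.2), 0),
]

adapted_head = adapt_head(SOURCE_HEAD, NEW_DOMAIN)
```

In this toy setup the source head scores the new-domain positive `(0.9, 0.8)` below 0.5, while the adapted head scores it above 0.5. The model needed only four examples because it reused the frozen features instead of learning them from scratch.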
FANN in AME
Professor Sun is currently in the early stages of a three-year project called "Fast-Adapted Neural Networks (FANN) for Advanced AI Systems." The project, which is funded by the Agency for Science, Technology and Research (A*STAR) under its Advanced Manufacturing and Engineering Young Individual Research Grant (AME YIRG), focuses on computer vision tasks such as image processing, image recognition, and object detection in video. Computer vision algorithms usually rely on convolutional neural networks, or CNNs, which are her area of expertise.
"The key hypothesis of the research is that it is possible to build the reasoning level of model adaptation based on statistical-level knowledge learning," Professor Sun explains. "By validating this hypothesis, we are also approaching the goal of advanced AI systems that train machine models with human-like intelligence for the applications in AME domains."
The research aims to achieve high robustness and computational efficiency in automated visual inspection, and to build interdisciplinary knowledge spanning precision manufacturing and advanced image-recognition techniques. Professor Sun is confident the outcomes of the research will greatly improve the yield rate and reduce manufacturing costs when the fast-adapted inspection devices are widely installed in the design, layout, fabrication, assembly, and testing processes of production lines.