November 2, 2018 weblog
Finding people in video based on height, cloth color, gender
A special search approach lets you find people in surveillance video based only on their description. The RT headline read, "AI algorithm can find you in CCTV footage without using face recognition." But how? An artificial intelligence algorithm uses height, gender and clothing, not facial features, as the giveaways.
The work reflects the potential of deep learning techniques. RT makes a useful point for those who may still confuse deep learning with machine learning.
RT wrote that in the researchers' efforts, deep learning traveled "beyond machine learning (where patterns are set into algorithms and require supervision) by incorporating 'self-learning'- to train a convolutional neural network (CNN) to recognize soft biometrics using computer vision."
RT and other sites reported on the team of researchers who created the tool that finds people in CCTV footage.
Hiren Galiyawala, Kenil Shah, Vandit Gajjar and Mehul S. Raval described their work in their paper, "Person Retrieval in Surveillance Video using Height, Color and Gender," submitted in September and now on arXiv. Author affiliations include the School of Engineering and Applied Science, Ahmedabad University and the L. D. College of Engineering, both in India.
Attributes such as height, build, clothing (cloth color, cloth type) and gender are called soft biometrics. "The task of person retrieval in the video is very challenging due to occlusion, light condition, camera quality, pose, and zoom. However, attributes like height, cloth color, gender can be deduced from low-quality surveillance video at a distance without cooperation from the subject. Such attributes are known as soft biometrics," the authors wrote.
Tristan Greene of TNW offered an example: a request for females wearing red shirts who are 153 cm tall. The result would be a video clip narrowed down to frames featuring people who meet those criteria.
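To make that kind of query concrete, here is a minimal sketch of filtering per-frame detections by soft biometric attributes. It is not the researchers' pipeline, which estimates these attributes from video with neural networks; the `Detection` record, the `retrieve` function, the sample data and the 5 cm height tolerance are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One person seen in one frame (hypothetical record, not from the paper)."""
    frame: int
    gender: str
    torso_color: str
    height_cm: float


def retrieve(
    detections: List[Detection],
    gender: Optional[str] = None,
    torso_color: Optional[str] = None,
    height_cm: Optional[float] = None,
    tolerance_cm: float = 5.0,  # assumed tolerance; estimated heights are noisy
) -> List[Detection]:
    """Return detections matching every attribute that was specified."""
    matches = []
    for d in detections:
        if gender is not None and d.gender != gender:
            continue
        if torso_color is not None and d.torso_color != torso_color:
            continue
        if height_cm is not None and abs(d.height_cm - height_cm) > tolerance_cm:
            continue
        matches.append(d)
    return matches


# Made-up detections standing in for the output of an attribute estimator.
detections = [
    Detection(frame=10, gender="female", torso_color="red", height_cm=154.0),
    Detection(frame=10, gender="male", torso_color="red", height_cm=176.0),
    Detection(frame=11, gender="female", torso_color="blue", height_cm=152.0),
    Detection(frame=12, gender="female", torso_color="red", height_cm=151.5),
]

hits = retrieve(detections, gender="female", torso_color="red", height_cm=153.0)
matching_frames = sorted({d.frame for d in hits})
print(matching_frames)  # [10, 12]
```

Only frames 10 and 12 survive the filter, which is the sense in which a description narrows a long video down to a handful of relevant frames.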
What were the results? RT and other sites said the algorithm correctly identified 28 out of 41 persons in a dataset with soft biometric attributes, and that the researchers said accuracy could be improved substantially with only some minor tweaks.
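For readers who prefer a percentage, the reported figure of 28 correct retrievals out of 41 works out to roughly 68 percent. The short calculation below uses only those two reported numbers; the variable names are illustrative.

```python
correct = 28  # persons correctly retrieved, as reported
total = 41    # persons in the test set, as reported

accuracy = correct / total
print(f"{accuracy:.1%}")  # 68.3%
```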
The authors said in the abstract that the color and gender models were fine-tuned using AlexNet, a convolutional neural network (CNN) that gets its name from its designer, Alex Krizhevsky. AlexNet is trained on more than 1 million images from the ImageNet database, said MathWorks.
"The network is 8 layers deep and can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. As a result, the network has learned rich feature representations for a wide range of images."
Tristan Greene in TNW made a case for why their research matters.
Greene found their work exciting for its implications on finding missing persons or tracking suspected criminals.
But, he added, "perhaps just as important is the fact that this is a legitimate answer to the problem of ubiquitous surveillance." The alternative to "ubiquitous" surveillance would be to examine only the footage that is relevant.
Greene said, "this paradigm would involve using computers to scour archival footage for only the data that is at least somewhat relevant. It's a minor distinction, but one that could spell the difference between government voyeurism and citizen protection."
Greene also thought, "if we could feed video to a neural network and let it narrow things down to a few hours of compiled footage, it would be possible to accurately track humans across multiple surveillance feeds."
© 2018 Science X Network