Researchers propose Repression Network approach for vehicle search

Saliency map of features from different layers in RepNet with PRL. Credit: arXiv:1708.02386 [cs.CV]

(Tech Xplore)—Surveillance cameras looking for a thief's vehicle face an uphill task. Cars look very similar to one another, and the thief was probably clever enough to change the license plates or tamper with the plate ID to escape identification.

Researchers at Peking University are finding another way to track down vehicles. Think facial recognition, but this time for cars. Their approach was designed so that even scratches on the car could help out.

Their paper, which describes the work in technical detail, is up on arXiv.

"Learning a Repression Network for Precise Vehicle Search" is by Qiantong Xu, Ke Yan and Yonghong Tian.

In their paper, the authors called attention to where plate identification poses a problem for cameras designed for surveillance.

They wrote that in some surveillance cameras, "the resolution of such cameras is not high enough to clearly show the numbers on license plate. Second, the performance of plate recognition systems decrease dramatically when they try to classify some confusing characters like '8' and 'B', 'O' and '0', 'D' and 'O', etc. Most importantly, license plates are often easily occluded, removed or even faked which makes [plate ID] less relevant to each single [vehicle]."

They said that a "precise vehicle retrieval algorithm should be able to not only capture coarse-grained attributes like color and model of each vehicle but also learn more discriminative feature representing unique details for it."

Their proposed approach is a "Repression Network" model, as they call it. What do they mean by Repression Network?

Jasper Hamill wrote in The Sun that the phrase "Repression Network" refers to the technology developed at Peking University. Phoebe Weston in the Daily Mail said it is "a multi-task learning framework" that searches for a car's distinctive features. A repression layer manages all the data generated, "allowing it to only focus on both broad and salient details."

This is how the authors explained it. "The basic idea of building such a model is that we want the deep network to generate two independent sub-features from two different levels – coarse attributes and details, so that each sub-feature can embed more discriminative information for that level and can be better used to perform precise retrieval tasks."

The researchers stated that experimental results showed "that our RepNet achieves the state-of-the-art performance."

Could it be used to track humans?

Moving forward, the authors mentioned that further investigations into RepNet may involve introducing hash functions to generate binary features "or splitting the convolutional groups into two groups. Besides, we can extend our framework into wider applications like face and person retrieval as well."
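To give a rough sense of what "binary features" could mean in practice, here is a minimal Python sketch. It is not the authors' method: the paper only mentions hash functions as future work. The sketch assumes a simple sign threshold as the hash, and the function names and sample vectors are hypothetical.

```python
def binarize(feature, threshold=0.0):
    """Turn a real-valued feature vector into a list of 0/1 bits.

    Assumption: a plain threshold stands in for a learned hash function.
    """
    return [1 if value > threshold else 0 for value in feature]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit vectors must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical feature vectors for a query image and two gallery images.
query_bits = binarize([0.7, -0.2, 0.1, -0.9])
candidate_1_bits = binarize([0.5, -0.4, 0.3, -0.1])
candidate_2_bits = binarize([-0.6, 0.8, 0.2, 0.4])

distance_1 = hamming_distance(query_bits, candidate_1_bits)
distance_2 = hamming_distance(query_bits, candidate_2_bits)
```

Binary codes like these are cheap to store and compare, which is the usual reason for hashing in image retrieval. The smaller Hamming distance marks the closer candidate.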

What's next? The Daily Mail said that "The software is still in early stages of development and there are currently no plans to roll it out."

More information: Learning a Repression Network for Precise Vehicle Search, arXiv:1708.02386 [cs.CV]

The growing explosion in the use of surveillance cameras in public security highlights the importance of vehicle search from large-scale image databases. Precise vehicle search, aiming at finding out all instances for a given query vehicle image, is a challenging task as different vehicles will look very similar to each other if they share same visual attributes. To address this problem, we propose the Repression Network (RepNet), a novel multi-task learning framework, to learn discriminative features for each vehicle image from both coarse-grained and detailed level simultaneously. Besides, benefited from the satisfactory accuracy of attribute classification, a bucket search method is proposed to reduce the retrieval time while still maintaining competitive performance. We conduct extensive experiments on the revised VehicleID dataset. Experimental results show that our RepNet achieves the state-of-the-art performance and the bucket search method can reduce the retrieval time by about 24 times.
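
The bucket search idea mentioned in the abstract can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the paper's implementation: it supposes that each gallery image has already been assigned a predicted attribute label (here a hypothetical color and body-type pair) and a feature vector, and it uses cosine distance for ranking. All names and data are made up.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def build_buckets(gallery):
    """Group gallery entries by their predicted attribute label.

    gallery: iterable of (image_id, attribute, feature) tuples.
    """
    buckets = defaultdict(list)
    for image_id, attribute, feature in gallery:
        buckets[attribute].append((image_id, feature))
    return buckets


def bucket_search(query_attribute, query_feature, buckets, top_k=5):
    """Rank only the images whose attribute matches the query's."""
    candidates = buckets.get(query_attribute, [])
    ranked = sorted(
        candidates,
        key=lambda item: cosine_distance(query_feature, item[1]),
    )
    return [image_id for image_id, _ in ranked[:top_k]]


# Hypothetical gallery: (image id, predicted attributes, feature vector).
gallery = [
    ("car_a", ("red", "sedan"), [0.9, 0.1, 0.0]),
    ("car_b", ("red", "sedan"), [0.1, 0.9, 0.0]),
    ("car_c", ("blue", "suv"), [0.9, 0.1, 0.0]),
]
buckets = build_buckets(gallery)
result = bucket_search(("red", "sedan"), [1.0, 0.0, 0.0], buckets)
no_match = bucket_search(("green", "van"), [1.0, 0.0, 0.0], buckets)
```

Because the query is compared only against images in its own bucket, fewer distance computations are needed. The trade-off is that a wrong attribute prediction sends the query to the wrong bucket, which is why the abstract ties the method to accurate attribute classification.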