Researchers from the Shenzhen Institutes of Advanced Technology (SIAT) of the Chinese Academy of Sciences have proposed a product image recognition method with guidance learning and noisy supervision. The study was published in Computer Vision and Image Understanding.

Instead of collecting product images through laborious and time-intensive image capture, the team introduced a novel large-scale dataset called Product-90. Consisting of more than 140K images in 90 categories, the dataset is related to Clothing1M (a large-scale public dataset designed for learning from noisy data with human supervision) but contains many more categories. The images were collected from reviews on e-commerce websites.

To handle unrelated images, the researchers further developed a simple yet efficient guidance learning (GL) method for training convolutional neural networks (CNNs) with noisy supervision.

They conducted comprehensive evaluations of the proposed guidance learning method on Product-90 and four public datasets, namely Food-101, Food-101N, Clothing1M and synthetic noisy CIFAR-10.

In the first stage, they trained a baseline CNN model (the teacher model) on the full Product-90 dataset (without the clean test set). In the second stage, they trained a target network (the student network) on the large-scale noisy set and the small clean training set with multi-task learning.
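
The second stage can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes the student is pulled toward the teacher's soft predictions on noisy samples and toward the ground-truth label on clean samples, and the names `student_loss`, `batch_loss` and `guidance_weight` are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    return [math.exp(lp) for lp in log_softmax(logits)]


def cross_entropy(target_probs, student_logits):
    """Cross-entropy between a target distribution and the student's prediction."""
    log_probs = log_softmax(student_logits)
    return -sum(t * lp for t, lp in zip(target_probs, log_probs))


def student_loss(student_logits, teacher_logits, label, is_clean, guidance_weight=1.0):
    """Per-sample loss for the student (assumed form, for illustration only).

    Clean sample: supervised by its verified label.
    Noisy sample: supervised by the teacher's soft prediction; the noisy
    label is ignored in this sketch.
    """
    if is_clean:
        num_classes = len(student_logits)
        one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
        return cross_entropy(one_hot, student_logits)
    return guidance_weight * cross_entropy(softmax(teacher_logits), student_logits)


def batch_loss(samples, guidance_weight=1.0):
    """Average the per-sample losses over a mixed batch of clean and noisy samples."""
    losses = [
        student_loss(s["student"], s["teacher"], s["label"], s["clean"], guidance_weight)
        for s in samples
    ]
    return sum(losses) / len(losses)


# A toy batch: one clean sample and one noisy sample, two classes.
toy_batch = [
    {"student": [2.0, 0.0], "teacher": [1.5, 0.2], "label": 0, "clean": True},
    {"student": [0.3, 0.1], "teacher": [3.0, 0.0], "label": 1, "clean": False},
]
toy_loss = batch_loss(toy_batch)
```

In a real setup the logits would come from CNNs and the loss would be minimized by gradient descent; the sketch only shows how the two kinds of supervision could be combined in a single objective.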

The results showed that the proposed guidance learning method was simpler and more efficient, and that it achieved performance superior to state-of-the-art methods on these datasets.

More information: Qing Li et al. Product image recognition with guidance learning and noisy supervision, Computer Vision and Image Understanding (2020). DOI: 10.1016/j.cviu.2020.102963