October 30, 2023

Attribute augmentation-based label integration for crowdsourcing

by Frontiers Journals

Crowdsourcing provides an effective and low-cost way to collect labels from crowd workers. Because crowd workers often lack professional knowledge, the quality of crowdsourced labels is relatively low. A common approach to addressing this issue is to collect multiple labels for each instance from different crowd workers and then use a label integration method to infer its true label. However, almost all existing label integration methods merely make use of the original attribute information and do not pay attention to the quality of each instance's multiple noisy label set.

To solve these issues, a research team led by Liangxiao JIANG published their new research in Frontiers of Computer Science.

The team proposed a novel three-stage label integration method called attribute augmentation-based label integration (AALI). AALI enhances the performance of label integration by improving the discriminative ability of the original attribute space and identifying the quality of each instance's multiple noisy label set. Experimental results on simulated and real-world crowdsourced datasets demonstrate that AALI outperforms all the other state-of-the-art competitors in terms of label quality and model quality.

In the research, they design an attribute augmentation method to enrich the attribute space, and then develop a filter to single out reliable instances with high-quality multiple noisy label sets from a crowdsourced dataset. Finally, they use cross-validation to build multiple component classifiers on the reliable instances to predict all instances.

In the first stage, AALI defines class membership probabilities generated from each instance's multiple noisy label set as new attributes and constructs the augmented attributes by concatenating the original attributes with the new attributes. In the second stage, AALI develops a filter to single out reliable instances with high-quality multiple noisy label sets. As a result, the original dataset is divided into a reliable dataset and an unreliable dataset. In the third stage, AALI uses majority voting to initialize the integrated labels of all instances in the reliable dataset, while estimating the certainty of each integrated label and assigning it as the weight of the corresponding instance.
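The three steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the article does not say how the class membership probabilities, the filter criterion, or the label certainty are computed, so this sketch assumes relative label frequencies for the probabilities, a hypothetical `threshold` on the largest frequency for the filter, and the majority label's vote share for the certainty. All function names are illustrative.

```python
from collections import Counter


def class_membership(noisy_labels, classes):
    """Relative frequency of each class in one multiple noisy label set (assumed estimate)."""
    counts = Counter(noisy_labels)
    return [counts[c] / len(noisy_labels) for c in classes]


def augment(attributes, noisy_label_sets, classes):
    """Stage 1: concatenate original attributes with the class membership probabilities."""
    return [list(x) + class_membership(labels, classes)
            for x, labels in zip(attributes, noisy_label_sets)]


def split_reliable(noisy_label_sets, classes, threshold):
    """Stage 2 (assumed criterion): reliable if the largest class frequency reaches threshold."""
    reliable, unreliable = [], []
    for i, labels in enumerate(noisy_label_sets):
        if max(class_membership(labels, classes)) >= threshold:
            reliable.append(i)
        else:
            unreliable.append(i)
    return reliable, unreliable


def majority_vote(noisy_labels, classes):
    """Stage 3 initialization: majority label plus its vote share, used as the instance weight.

    Ties are broken in favor of the class listed first in `classes`.
    """
    probs = class_membership(noisy_labels, classes)
    best = max(range(len(classes)), key=lambda k: probs[k])
    return classes[best], probs[best]


# Toy data: three instances, two original attributes, four crowd labels each.
attributes = [[0.2, 1.0], [0.9, 0.1], [0.5, 0.5]]
noisy_label_sets = [["a", "a", "a", "b"], ["b", "b", "b", "b"], ["a", "b", "a", "b"]]
classes = ["a", "b"]

augmented = augment(attributes, noisy_label_sets, classes)
reliable, unreliable = split_reliable(noisy_label_sets, classes, threshold=0.7)
initial = {i: majority_vote(noisy_label_sets[i], classes) for i in reliable}
```

In this toy example the third instance, whose crowd labels are evenly split, ends up in the unreliable dataset, while the first two receive an initial label and a weight equal to the vote share of that label.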

Next, AALI uses K-fold cross-validation to build M component classifiers on the reliable dataset to predict the class probability distributions of all instances. Finally, AALI updates the integrated label of each instance in the reliable dataset and assigns an integrated label to each instance in the unreliable dataset. The extensive experimental results on both simulated and real-world crowdsourced datasets validate the superiority of AALI.
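The cross-validation step can be sketched in the same spirit. The article does not specify the base learner, how K relates to M, or how the component predictions are combined, so the following makes several assumptions: one component classifier per fold, trained on the other K-1 folds (so M equals K); a weighted nearest-centroid model as a stand-in learner; and a simple average of the predicted class probability distributions. The names `fit_centroids`, `predict_proba`, and `integrate` are hypothetical.

```python
import math


def fit_centroids(X, y, w, classes):
    """Stand-in component classifier (assumption): weighted centroid of each class."""
    centroids = {}
    for c in classes:
        rows = [(x, wi) for x, yi, wi in zip(X, y, w) if yi == c]
        total = sum(wi for _, wi in rows)
        if total > 0:
            centroids[c] = [sum(x[d] * wi for x, wi in rows) / total
                            for d in range(len(X[0]))]
    return centroids


def predict_proba(centroids, x, classes):
    """Convert distances to the class centroids into a class probability distribution."""
    scores = [math.exp(-math.dist(x, centroids[c])) if c in centroids else 0.0
              for c in classes]
    total = sum(scores)
    return [s / total for s in scores]


def integrate(X, reliable, labels, weights, classes, k):
    """Train k classifiers on reliable folds and average their predictions for all instances.

    Assumes k >= 2 and that every training split is non-empty.
    """
    folds = [reliable[i::k] for i in range(k)]
    avg = [[0.0] * len(classes) for _ in X]
    for held_out in folds:
        train = [i for i in reliable if i not in held_out]
        model = fit_centroids([X[i] for i in train],
                              [labels[i] for i in train],
                              [weights[i] for i in train], classes)
        for i, x in enumerate(X):
            for j, p in enumerate(predict_proba(model, x, classes)):
                avg[i][j] += p / k
    # The final integrated label of every instance is the most probable class.
    return [classes[max(range(len(classes)), key=lambda j: row[j])] for row in avg]


# Toy data: instances 0-3 are reliable, instances 4 and 5 are unreliable.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [0.2, 0.1], [0.8, 0.9]]
reliable = [0, 1, 2, 3]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}      # initial majority-vote labels
weights = {0: 1.0, 1: 0.75, 2: 1.0, 3: 0.75}   # label certainties used as weights
final_labels = integrate(X, reliable, labels, weights, ["a", "b"], k=2)
```

Here the two unreliable instances receive the label of the class whose reliable instances they lie closest to, which mirrors the idea of letting classifiers trained only on reliable data label the rest.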

Future work can focus on finding the optimal value of the developed filter's threshold using an optimization method.

More information: Yao Zhang et al, Attribute augmentation-based label integration for crowdsourcing, Frontiers of Computer Science (2022). DOI: 10.1007/s11704-022-2225-z

Provided by Frontiers Journals

Citation: Attribute augmentation-based label integration for crowdsourcing (2023, October 30) retrieved 27 April 2024 from https://techxplore.com/news/2023-10-attribute-augmentation-based-crowdsourcing.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
