June 11, 2019 feature

A new approach for unsupervised paraphrasing without translation

by Ingrid Fadelli , Tech Xplore

In recent years, researchers have been trying to develop methods for automatic paraphrasing, which essentially entails the automated abstraction of semantic content from text. So far, approaches that rely on machine translation (MT) techniques have proved particularly popular due to the lack of available labeled datasets of paraphrased pairs.

Theoretically, translation techniques might appear like effective solutions for automatic paraphrasing, as they abstract semantic content from its linguistic realization. For instance, assigning the same sentence to different translators might result in different translations and a rich set of interpretations, which could be useful in paraphrasing tasks.

Although many researchers have developed translation-based methods for automated paraphrasing, humans do not necessarily need to be bilingual to paraphrase sentences. Based on this observation, two researchers at Google Research have recently proposed a new paraphrasing technique that does not rely on machine translation methods. In their paper, pre-published on arXiv, they compared their monolingual approach to other techniques for paraphrasing: a supervised translation and an unsupervised translation approach.

"This work proposes to learn paraphrasing models from an unlabeled monolingual corpus only," Aurko Roy and David Grangier, the two researchers who carried out the study, wrote in their paper. "To that end, we propose a residual variant of vector-quantized variational auto-encoder."

The model introduced by the researchers is based on vector-quantized auto-encoders (VQ-VAE) that can paraphrase sentences in a purely monolingual setting. It also has a unique feature (i.e. residual connections parallel to the quantized bottleneck), which enables better control over the decoder entropy and eases optimization.

"Compared to continuous auto-encoders, our method permits the generation of diverse, but semantically close sentences from an input sentence," the researchers explained in their paper.

In their study, Roy and Grangier compared their model's performance with that of other MT-based approaches on paraphrase identification, generation and training augmentation. They specifically compared it with a supervised translation method trained on parallel bilingual data and an unsupervised translation method trained on non-parallel text in two different languages. Their model, on the other hand, only requires unlabeled data in a single language, the one it is paraphrasing sentences in.

The researchers found that their monolingual approach outperformed unsupervised translation techniques in all tasks. Comparisons between their model and supervised translation methods, on the other hand, yielded mixed results: the monolingual approach performed better in identification and augmentation tasks, while the supervised translation method was superior for paraphrase generation.

"Overall, we showed that monolingual models can outperform bilingual ones for paraphrase identification and data-augmentation through paraphrasing," the researchers concluded. "We also reported that generation quality from monolingual models can be higher than models based on unsupervised translation, but not supervised translation."

Roy and Grangier's findings suggest that using bilingual parallel data (i.e. texts and their possible translations in other languages) is particularly advantageous when generating paraphrases and leads to remarkable performance. In situations where bilingual data is not readily available, however, the monolingual model proposed by them could be a useful resource or alternative solution.

More information: Unsupervised paraphrasing without translation. arXiv:1905.12752 [cs.LG]. arxiv.org/abs/1905.12752

Citation: A new approach for unsupervised paraphrasing without translation (2019, June 11) retrieved 17 July 2024 from https://techxplore.com/news/2019-06-approach-unsupervised-paraphrasing.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

Using multi-task learning for low-latency speech translation

68 shares

Feedback to editors

Engineers evaluate cybersecurity risks associated with EV fast-charging equipment

15 hours ago

Machine learning framework maps global rooftop growth for sustainable energy and urban planning

17 hours ago

Giving drones wrap-and-grip wings to allow them to land on poles and tree limbs

19 hours ago

Large language models make human-like reasoning mistakes, researchers find

19 hours ago

Unveiling a new class of synthetic fuels

20 hours ago

Microsoft unveils software that allows LLMs to work with spreadsheets

20 hours ago

New technique to assess a general-purpose AI model's reliability before it's deployed

21 hours ago

New system enables intuitive teleoperation of a robotic manipulator in real-time

Jul 16, 2024

Recycled micro-sized silicon anodes from photovoltaic waste improve lithium-ion battery performance

Jul 16, 2024

You're just a stick figure to this camera—a new camera to prevent companies from collecting private information

Jul 15, 2024

Load comments (0)

A new approach for unsupervised paraphrasing without translation

Engineers evaluate cybersecurity risks associated with EV fast-charging equipment

Machine learning framework maps global rooftop growth for sustainable energy and urban planning

Giving drones wrap-and-grip wings to allow them to land on poles and tree limbs

Large language models make human-like reasoning mistakes, researchers find

Unveiling a new class of synthetic fuels

Microsoft unveils software that allows LLMs to work with spreadsheets

New technique to assess a general-purpose AI model's reliability before it's deployed

New system enables intuitive teleoperation of a robotic manipulator in real-time

Recycled micro-sized silicon anodes from photovoltaic waste improve lithium-ion battery performance

You're just a stick figure to this camera—a new camera to prevent companies from collecting private information

Using multi-task learning for low-latency speech translation

A new approach for low-resource machine transliteration using RNNs

Model paves way for faster, more efficient translations of more languages

Fighting offensive language on social media with unsupervised text style transfer

Google Brain posse takes neural network approach to translation

New technology for machine translation now available

New system enables intuitive teleoperation of a robotic manipulator in real-time

Machine learning framework maps global rooftop growth for sustainable energy and urban planning

Microsoft unveils software that allows LLMs to work with spreadsheets

New technique to assess a general-purpose AI model's reliability before it's deployed

Large language models make human-like reasoning mistakes, researchers find

A new neural network makes decisions like a human would

Phys.org

Medical Xpress

Science X

A new approach for unsupervised paraphrasing without translation

Engineers evaluate cybersecurity risks associated with EV fast-charging equipment

Machine learning framework maps global rooftop growth for sustainable energy and urban planning

Giving drones wrap-and-grip wings to allow them to land on poles and tree limbs

Large language models make human-like reasoning mistakes, researchers find

Unveiling a new class of synthetic fuels

Microsoft unveils software that allows LLMs to work with spreadsheets

New technique to assess a general-purpose AI model's reliability before it's deployed

New system enables intuitive teleoperation of a robotic manipulator in real-time

Recycled micro-sized silicon anodes from photovoltaic waste improve lithium-ion battery performance

You're just a stick figure to this camera—a new camera to prevent companies from collecting private information

Related Stories

Using multi-task learning for low-latency speech translation

A new approach for low-resource machine transliteration using RNNs

Model paves way for faster, more efficient translations of more languages

Fighting offensive language on social media with unsupervised text style transfer

Google Brain posse takes neural network approach to translation

New technology for machine translation now available

Recommended for you

New system enables intuitive teleoperation of a robotic manipulator in real-time

Machine learning framework maps global rooftop growth for sustainable energy and urban planning

Microsoft unveils software that allows LLMs to work with spreadsheets

New technique to assess a general-purpose AI model's reliability before it's deployed

Large language models make human-like reasoning mistakes, researchers find

A new neural network makes decisions like a human would

Your Privacy