October 23, 2023

A new algorithm for building robust distributed systems

by Tanya Petersen, Ecole Polytechnique Federale de Lausanne

EPFL researchers have developed a new distributed algorithm that, for the first time, solves one of the key performance and reliability problems affecting most currently deployed consensus protocols. The work has been published in the Proceedings of the 29th Symposium on Operating Systems Principles.

Consensus is one of the fundamental problems in distributed systems. It allows a group of machines to maintain multiple copies of data and update them consistently, even when a fraction of the machines might fail.

Take the example of three servers that each store a copy of the data and track every update so that all three remain consistent. If one server fails, the remaining two must keep the data consistent and allow updates to continue normally, as if there had been no failure.
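
The three-server example can be sketched as a majority rule: an update counts as accepted once more than half of the servers have stored it. The following is a minimal illustration rather than a real consensus protocol. The names `Replica` and `replicate` are hypothetical, and the sketch ignores ordering, recovery, and concurrent writers.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One server holding a copy of the data (hypothetical toy model)."""
    name: str
    alive: bool = True
    store: dict = field(default_factory=dict)

    def apply(self, key, value):
        """Store the update and acknowledge it; a failed server stays silent."""
        if not self.alive:
            return False
        self.store[key] = value
        return True


def replicate(replicas, key, value):
    """Send an update to every replica; succeed if a majority acknowledges."""
    acks = sum(1 for r in replicas if r.apply(key, value))
    return acks > len(replicas) // 2


servers = [Replica("a"), Replica("b"), Replica("c")]
servers[2].alive = False                 # one of the three servers fails
committed = replicate(servers, "x", 1)   # two of three still acknowledge
```

With one server down, `committed` is true because two of three servers acknowledged the update. If a second server also failed, the same call would report failure, since one server is not a majority of three.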

Current state-of-the-art consensus protocols rely on one computer node being designated the leader at any given time, continually supervising and handling any updates to data. If the leader fails, another node wakes up and takes over, but there's a challenge: how long should another node wait before taking over from an unresponsive leader?
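
The waiting question can be made concrete with a toy model. This sketch assumes a single fixed timeout and a single message from the leader; the function `outcome` and its parameters are hypothetical and do not correspond to any particular protocol.

```python
def outcome(timeout_ms, leader_crashed, network_delay_ms):
    """Classify what a fixed takeover timeout does in one scenario.

    Toy model: a follower takes over if it hears nothing from the
    leader within timeout_ms. A live leader's message arrives after
    network_delay_ms; a crashed leader's message never arrives.
    """
    if leader_crashed:
        # Nothing will ever arrive, so the system stalls for the full timeout.
        return f"stalled {timeout_ms} ms, then takeover"
    if network_delay_ms > timeout_ms:
        # The leader is alive but slow, so the takeover is unnecessary.
        return "spurious takeover"
    return "leader continues"
```

In this model, `outcome(5000, True, 20)` stalls for five seconds before anyone takes over, while `outcome(50, False, 80)` replaces a leader that was merely slow. No single timeout value avoids both cases unless the network delay is known in advance.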

"If the leader fails or the network is bad, the problem with the classic consensus protocols is that there's the very tricky question of how you decide how big or small the timeout should be," explained Professor Bryan Ford, Head of the Decentralized and Distributed Systems Laboratory (DEDIS) in EPFL's School of Computer and Communications Sciences (IC).

"If you set it too big, then when a leader fails, you might be waiting a long time and the system is just dead. On the other hand, consider if you set the timeout too short—this is where the real disaster can happen."

"Suppose the old leader hasn't failed, suppose the network is just a little slower than you thought it was, the next leader comes and tries to take over, but the way all the existing protocols work, the new leader's actions will cancel what the old leader's actions did so it can no longer finish what it was doing and all its work is wasted. These kinds of issues can cause major reliability problems and these leader-based protocols can fail entirely if there's a deliberate denial of service attack," he continued.

To overcome these challenges, DEDIS researchers have been investigating a rarely used class of consensus algorithms, known as asynchronous consensus protocols. Unlike current leader-based protocols, their asynchronous cousins are not vulnerable to leader failures and denial-of-service attacks. But there's a big trade-off: prior asynchronous protocols are much less efficient under normal conditions, and that's one reason they are almost never deployed.

For the first time, Ford says, their QuePaxa protocol changes this dynamic. "We've come up with a win-win. What is new and unique to QuePaxa is that it's an asynchronous consensus protocol that finally achieves efficiency equivalent to the widely deployed leader-based protocols under normal network conditions. QuePaxa is just as fast, efficient, low latency and low cost in terms of network bandwidth, under normal conditions."

The new algorithm is designed so that one leader at a time is usually expected to drive progress, but a second leader can come in and help in the same round without interfering with the first one. A third leader could even join and help the other two finish the work more quickly. There will be some redundancy of effort, but the non-leaders don't destructively interfere. Short delays don't cause leaders to cancel each other's work as with current protocols.
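
One way to picture help that does not interfere is a merge rule whose result does not depend on arrival order. The sketch below is only an analogy and is not QuePaxa's actual mechanism, which this article does not specify. The `(priority, value)` proposals and the functions `record` and `run_round` are assumptions made for illustration.

```python
def record(slot, proposal):
    """Merge a (priority, value) proposal into a slot, keeping the highest.

    Taking the maximum is commutative and idempotent, so an extra
    proposer arriving early or late cannot undo an earlier one's work.
    """
    return proposal if slot is None or proposal > slot else slot


def run_round(proposals):
    """Apply the proposals in the given arrival order and return the result."""
    slot = None
    for p in proposals:
        slot = record(slot, p)
    return slot


first = (3, "update-from-leader-1")
second = (2, "update-from-leader-2")
```

Here `run_round([first, second])` and `run_round([second, first])` return the same proposal, so a second proposer adds redundant work but never cancels the first. That contrasts with the takeover behavior described earlier, where a new leader's actions discard what the old leader did.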

QuePaxa is also extremely robust under bad conditions such as noisy networks, high communication delays, unpredictably varying network delays, or deliberate denial-of-service attacks.

"Under these conditions existing consensus protocols will just die completely. QuePaxa will keep going; it's much more robust," he continued. "In any place where there are significant concerns about performance, reliability or vulnerability to these kinds of attacks I believe this is a game changer for robustness reasons and this should be the new standard consensus protocol."

The DEDIS team has already built an open-source prototype of QuePaxa, which is available on GitHub. The new protocol has already gone through an artifact evaluation review process at SOSP, where peer reviewers tested its capabilities.

The paper, "QuePaxa: Escaping the tyranny of timeouts in consensus," was presented at the biennial Association for Computing Machinery (ACM) Symposium on Operating Systems Principles (SOSP).

More information: Pasindu Tennage et al, QuePaxa: Escaping the tyranny of timeouts in consensus, Proceedings of the 29th Symposium on Operating Systems Principles (2023). DOI: 10.1145/3600006.3613150

Provided by Ecole Polytechnique Federale de Lausanne

Citation: A new algorithm for building robust distributed systems (2023, October 23) retrieved 17 July 2024 from https://techxplore.com/news/2023-10-algorithm-robust.html

