April 12, 2021

Machine learning at speed with in-network aggregation

by King Abdullah University of Science and Technology

Inserting lightweight optimization code in high-speed network devices has enabled a KAUST-led collaboration to speed up machine learning on parallelized computing systems by up to 5.5 times.

This "in-network aggregation" technology, developed with researchers and systems architects at Intel, Microsoft and the University of Washington, can provide dramatic speed improvements using readily available programmable network hardware.

Much of the power of artificial intelligence (AI) to "understand" and interact with the world comes from the machine-learning step, in which the model is trained using large sets of labeled training data. The more data the AI is trained on, the better the model is likely to perform when exposed to new inputs.

The recent burst of AI applications is largely due to better machine learning and the use of larger models and more diverse datasets. Performing the machine-learning computations, however, is an enormously taxing task that increasingly relies on large arrays of computers running the learning algorithm in parallel.

"How to train deep-learning models at a large scale is a very challenging problem," says Marco Canini from the KAUST research team. "The AI models can consist of billions of parameters, and we can use hundreds of processors that need to work efficiently in parallel. In such systems, communication among processors during incremental model updates easily becomes a major performance bottleneck."

The team found a potential solution in new network technology developed by Barefoot Networks, a division of Intel.

"We use Barefoot Networks' new programmable dataplane networking hardware to offload part of the work performed during distributed machine-learning training," explains Amedeo Sapio, a KAUST alumnus who has since joined the Barefoot Networks team at Intel. "Using this new programmable networking hardware, rather than just the network, to move data means that we can perform computations along the network paths."

The key innovation of the team's SwitchML platform is to allow the network hardware to perform the data aggregation task at each synchronization step during the model update phase of the machine-learning process. Not only does this offload part of the computational load, it also significantly reduces the amount of data transmission.
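To make the idea concrete, here is a minimal Python sketch of aggregation at a synchronization step. It is an illustration under stated assumptions, not SwitchML's actual code: the function names are hypothetical, the "switch" is an ordinary function that sums the workers' update vectors element-wise, and the all-to-all exchange used as a baseline is only one simple way to synchronize without in-network aggregation.

```python
from typing import List


def aggregate_in_switch(worker_updates: List[List[float]]) -> List[float]:
    """Element-wise sum of the workers' update vectors.

    Hypothetical stand-in for the aggregation a programmable switch
    would perform on the values passing through it.
    """
    if not worker_updates:
        raise ValueError("need at least one worker update")
    length = len(worker_updates[0])
    if any(len(update) != length for update in worker_updates):
        raise ValueError("all updates must have the same length")
    return [sum(values) for values in zip(*worker_updates)]


def values_sent_per_worker(num_workers: int, update_len: int, in_network: bool) -> int:
    """Count the values one worker transmits in one synchronization step.

    Assumed baseline: every worker sends its full update to every other
    worker. With in-network aggregation, each worker sends its update once.
    """
    if in_network:
        return update_len
    return (num_workers - 1) * update_len


# Three workers, each holding a three-element model update.
updates = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
total = aggregate_in_switch(updates)
print(total)
print(values_sent_per_worker(3, 3, in_network=False))
print(values_sent_per_worker(3, 3, in_network=True))
```

In this toy model, each worker transmits its update once and receives one aggregated vector back, whereas in the assumed all-to-all baseline the traffic per worker grows with the number of workers.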

"Although the programmable switch dataplane can do operations very quickly, the operations it can do are limited," says Canini. "So our solution had to be simple enough for the hardware and yet flexible enough to solve challenges such as limited onboard memory capacity. SwitchML addresses this challenge by co-designing the communication network and the distributed training algorithm, achieving an acceleration of up to 5.5 times compared to the state-of-the-art approach."
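One way to picture the memory constraint is to aggregate a large update in small chunks, so that the switch never holds more than a fixed number of partial sums at once. The Python sketch below is only an illustration of that idea under assumptions of our own: `chunked_aggregate` and `slot_size` are hypothetical names, the values are integers to keep the arithmetic exact, and the actual SwitchML protocol is described in the paper rather than here.

```python
from typing import Iterator, List


def chunked_aggregate(worker_updates: List[List[int]], slot_size: int) -> Iterator[List[int]]:
    """Yield aggregated chunks, holding at most slot_size partial sums at a time.

    Hypothetical model: 'slot' plays the role of the switch's small
    on-board memory and is reused for every chunk.
    """
    if slot_size <= 0:
        raise ValueError("slot_size must be positive")
    if not worker_updates:
        raise ValueError("need at least one worker update")
    length = len(worker_updates[0])
    if any(len(update) != length for update in worker_updates):
        raise ValueError("all updates must have the same length")

    for start in range(0, length, slot_size):
        end = min(start + slot_size, length)
        slot = [0] * (end - start)        # the only aggregation memory in use
        for update in worker_updates:     # one chunk-sized message per worker
            for i, value in enumerate(update[start:end]):
                slot[i] += value
        yield slot                        # chunk is complete; memory can be reused


# Two workers, five-element updates, memory for only two values at a time.
updates = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
chunks = list(chunked_aggregate(updates, slot_size=2))
result = [value for chunk in chunks for value in chunk]
print(chunks)
print(result)
```

The full aggregated update is recovered by concatenating the chunks, even though no more than `slot_size` partial sums exist at any moment.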

More information: Scaling Distributed Machine Learning with In-Network Aggregation, arXiv:1903.06701v2 [cs.DC], arxiv.org/abs/1903.06701

Provided by King Abdullah University of Science and Technology

Citation: Machine learning at speed with in-network aggregation (2021, April 12) retrieved 19 April 2024 from https://techxplore.com/news/2021-04-machine-in-network-aggregation.html

