August 29, 2022

ROBE Array could let small companies access popular form of AI

A breakthrough low-memory technique by Rice University computer scientists could put one of the most resource-intensive forms of artificial intelligence—deep-learning recommendation models (DLRM)—within reach of small companies.

DLRM systems are a popular form of AI that learns to make suggestions users will find relevant. But with top-of-the-line training models requiring more than a hundred terabytes of memory and supercomputer-scale processing, they've only been available to a short list of technology giants with deep pockets.

Rice's "random offset block embedding array," or ROBE Array, could change that. It's an algorithmic approach for slashing the size of DLRM memory structures called embedding tables, and it will be presented this week at the Conference on Machine Learning and Systems (MLSys 2022) in Santa Clara, California, where it earned Outstanding Paper honors.

"Using just 100 megabytes of memory and a single GPU, we showed we could match the training times and double the inference efficiency of state-of-the-art DLRM training methods that require 100 gigabytes of memory and multiple processors," said Anshumali Shrivastava, an associate professor of computer science at Rice who's presenting the research at MLSys 2022 with ROBE Array co-creators Aditya Desai, a Rice graduate student in Shrivastava's research group, and Li Chou, a former postdoctoral researcher at Rice who is now at West Texas A&M University.

"ROBE Array sets a new baseline for DLRM compression," Shrivastava said. "And it brings DLRM within reach of average users who do not have access to the high-end hardware or the engineering expertise one needs to train models that are hundreds of terabytes in size."

DLRM systems are machine learning algorithms that learn from data. For example, a recommendation system that suggests products for shoppers would be trained with data from past transactions, including the search terms users provided, which products they were offered and which, if any, they purchased. One way to improve the accuracy of recommendations is to sort training data into more categories. For example, rather than putting all shampoos in a single category, a company could create categories for men's, women's and children's shampoos.

For training, these categorical representations are organized in memory structures called embedding tables, and Desai said those tables "have exploded" in size due to increased categorization.

"Embedding tables now account for more than 99.9% of the overall memory footprint of DLRM models," Desai said. "This leads to a host of problems. For example, they can't be trained in a purely parallel fashion because the model has to be broken into pieces and distributed across multiple training nodes and GPUs. And after they're trained and in production, looking up information in [embedding] tables accounts for about 80% of the time required to return a suggestion to a user."
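To see why finer categorization inflates memory, a rough back-of-the-envelope sketch may help. It assumes each category gets one row of 32-bit floats in its embedding table; the category counts and the embedding width of 128 below are hypothetical values chosen for illustration, not figures from the research.

```python
BYTES_PER_FLOAT32 = 4  # assumption: parameters stored as 32-bit floats


def embedding_table_bytes(num_categories, embedding_dim,
                          bytes_per_value=BYTES_PER_FLOAT32):
    """Memory for one dense table: one row of embedding_dim values per category."""
    return num_categories * embedding_dim * bytes_per_value


# Hypothetical feature sizes, for illustration only.
coarse_bytes = embedding_table_bytes(num_categories=10_000, embedding_dim=128)
fine_bytes = embedding_table_bytes(num_categories=100_000_000, embedding_dim=128)

print(f"coarse categories: {coarse_bytes:,} bytes")  # 5,120,000 bytes
print(f"fine categories:   {fine_bytes:,} bytes")    # 51,200,000,000 bytes
```

Under these assumptions the table grows linearly with the number of categories, so a single heavily subdivided feature can move from a few megabytes to tens of gigabytes.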

Shrivastava said ROBE Array does away with the need for storing embedding tables by using a data-indexing method called hashing to create "a single array of learned parameters that is a compressed representation of the embedding table." Accessing embedding information from the array can then be performed "using GPU-friendly universal hashing," he said.
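The general idea of replacing embedding tables with one shared, hashed array can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the class name `RobeLikeEmbedding`, the block size, the hash coefficients and the circular-array indexing are all hypothetical choices made for the example, and no training logic is shown.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1, used as the modulus of the hash (assumption)


class RobeLikeEmbedding:
    """Illustrative hashed embedding: every table shares one flat array."""

    def __init__(self, array_size, dim, block_size, seed=0):
        if dim % block_size != 0:
            raise ValueError("dim must be a multiple of block_size")
        rng = random.Random(seed)
        self.array_size = array_size
        self.dim = dim
        self.block_size = block_size
        # The only stored parameters: one array instead of full tables.
        self.weights = [rng.gauss(0.0, 0.01) for _ in range(array_size)]
        # Random coefficients for a universal-style hash of (table, row, block).
        self.coeffs = [rng.randrange(1, PRIME) for _ in range(4)]

    def _offset(self, table_id, row_id, block_id):
        a, b, c, d = self.coeffs
        return ((a * table_id + b * row_id + c * block_id + d) % PRIME) % self.array_size

    def lookup(self, table_id, row_id):
        """Assemble an embedding from hashed, contiguous blocks of the array."""
        vector = []
        for block_id in range(self.dim // self.block_size):
            start = self._offset(table_id, row_id, block_id)
            for i in range(self.block_size):
                # Wrap around the end so the array behaves as circular.
                vector.append(self.weights[(start + i) % self.array_size])
        return vector


emb = RobeLikeEmbedding(array_size=1000, dim=16, block_size=4)
vec = emb.lookup(table_id=3, row_id=123_456_789)
print(len(vec))  # 16
```

In this sketch the memory cost is fixed by `array_size` no matter how many categories exist, because a row is never stored, only recomputed from its hash. Reading contiguous blocks rather than scattered single values is one plausible reason such a lookup would suit GPU memory access, though the example makes no performance claims.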

Shrivastava, Desai and Chou tested ROBE Array using the sought-after DLRM MLPerf benchmark, which measures how fast a system can train models to a target quality metric. Using a number of benchmark data sets, they found ROBE Array could match or beat previously published DLRM techniques in terms of training accuracy even after compressing the model by three orders of magnitude.

"Our results clearly show that most deep-learning benchmarks can be completely overturned by fundamental algorithms," Shrivastava said. "Given the global chip shortage, this is welcome news for the future of AI."

ROBE Array isn't Shrivastava's first big splash at MLSys. At MLSys 2020, his group unveiled SLIDE, a "sub-linear deep learning engine" that ran on commodity CPUs and could outperform GPU-based trainers. They followed up at MLSys 2021, showing vectorization and memory optimization accelerators could boost SLIDE's performance, allowing it to train deep neural nets up to 15 times faster than top GPU systems.

More information: Random Offset Block Embedding (ROBE) for compressed embedding tables in deep learning recommendation systems

Provided by Rice University

Citation: ROBE Array could let small companies access popular form of AI (2022, August 29) retrieved 17 July 2024 from https://techxplore.com/news/2022-08-robe-array-small-companies-access.html

