June 26, 2024 report

Software engineers develop a way to run AI language models without matrix multiplication

by Bob Yirka, Tech Xplore

A team of software engineers at the University of California, working with one colleague from Soochow University and another from LuxiTec, has developed a way to run AI language models without using matrix multiplication. The team has published a paper on the arXiv preprint server describing their new approach and how well it has worked during testing.

As the power of LLMs such as ChatGPT has grown, so too have the computing resources they require. Part of the process of running LLMs involves performing matrix multiplication (MatMul), where input data is combined with the weights of a neural network to produce the most likely answers to queries.

Early on, AI researchers discovered that graphics processing units (GPUs) were ideally suited to neural network applications because they can run multiple processes simultaneously—in this case, multiple MatMuls. But now, even with huge clusters of GPUs, MatMuls have become bottlenecks as the power of LLMs grows along with the number of people using them.

In this new study, the research team claims to have developed a way to run AI language models without the need to carry out MatMuls—and to do it just as efficiently.

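To achieve this feat, the research team took a new approach to how data is weighted. They replaced the current method, which relies on 16-bit floating-point weights, with one that restricts each weight to just three values: {-1, 0, 1}. Because multiplying by -1, 0, or 1 amounts to negating, dropping, or keeping a value, the multiplications reduce to additions and subtractions. The team paired this with new functions that carry out the same types of operations as the prior method.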
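To make the idea concrete, here is a minimal sketch in plain Python of a matrix-vector product when every weight is -1, 0, or 1. It is an illustration only, not the researchers' implementation; the function names `ternary_matvec` and `dense_matvec` and the sample values are assumptions made for this example.

```python
def ternary_matvec(weights, x):
    """Compute y = W x for W with entries in {-1, 0, 1}.

    Only additions and subtractions are used; no weight is ever
    multiplied with an input. (Hypothetical helper for illustration.)
    """
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # weight +1: add the input
            elif w == -1:
                acc -= xi      # weight -1: subtract the input
            # weight 0: the input is skipped entirely
        out.append(acc)
    return out


def dense_matvec(weights, x):
    """Reference version using ordinary multiply-and-add."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Made-up sample data: a 2x3 ternary weight matrix and a 3-element input.
W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.5]

y = ternary_matvec(W, x)
print(y)                        # [2.0, 0.0]
print(y == dense_matvec(W, x))  # True
```

Both functions return the same result, but the ternary version never multiplies a weight by an input, which is the property that lets such a layer avoid MatMul.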

They also developed new quantization techniques that helped boost performance. With only three possible values per weight, less processing is needed, which reduces the computing power required. But they also radically changed the way LLMs are processed by using what they describe as a MatMul-free linear gated recurrent unit (MLGRU) in place of traditional transformer blocks.
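The article does not give the MLGRU equations, so the following is only a generic sketch of a gated recurrent update whose projections use ternary weights and whose state update is element-wise. The function names, the choice of a single forget gate, and the sample values are all assumptions for illustration, not the paper's exact formulation.

```python
import math


def ternary_project(weights, x):
    """Project x with weights in {-1, 0, 1} using only add/subtract."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_recurrent_step(x, h_prev, w_forget, w_cand):
    """One step of a simplified gated recurrence (hypothetical).

    f = sigmoid(project(x))          forget gate, element-wise
    c = project(x)                   candidate state
    h = f * h_prev + (1 - f) * c     element-wise blend, no MatMul
    """
    f = [sigmoid(v) for v in ternary_project(w_forget, x)]
    c = ternary_project(w_cand, x)
    return [fi * hi + (1.0 - fi) * ci for fi, hi, ci in zip(f, h_prev, c)]


# Made-up sample data: all-zero gate weights give a gate value of exactly 0.5.
w_forget = [[0, 0], [0, 0]]
w_cand = [[1, -1], [1, 1]]
x = [1.0, 2.0]

h1 = gated_recurrent_step(x, [0.0, 0.0], w_forget, w_cand)
h2 = gated_recurrent_step(x, h1, w_forget, w_cand)
print(h1)  # [-0.5, 1.5]
print(h2)  # [-0.75, 2.25]
```

The hidden state is carried from one token to the next through element-wise products and sums, so the only matrix-shaped operations are the ternary projections, which need additions and subtractions alone.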

In testing their new ideas, the researchers found that a system using their new approach achieved performance on par with state-of-the-art systems currently in use. At the same time, they found that their system used far less computing power and electricity than traditional systems generally do.

More information: Rui-Jie Zhu et al, Scalable MatMul-free Language Modeling, arXiv (2024). DOI: 10.48550/arxiv.2406.02528

Journal information: arXiv

Citation: Software engineers develop a way to run AI language models without matrix multiplication (2024, June 26) retrieved 29 June 2024 from https://techxplore.com/news/2024-06-software-ai-language-matrix-multiplication.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
