March 11, 2024

TaskMatrix.AI: Making big models do small jobs with application programming interfaces

A research team at Microsoft has designed an efficiency tool called TaskMatrix.AI that can be used to accomplish a wide variety of specific AI tasks. TaskMatrix.AI connects general-purpose foundation models like GPT-4, the model behind ChatGPT, with specialized models suitable for certain tasks—much like a human project manager. This research was published in Intelligent Computing.

Foundation models and specialized models usually have different mechanisms and, thus, are not easily compatible. Rather than modifying and integrating existing models, TaskMatrix.AI bridges the gaps between them through application programming interfaces, or APIs, which enable software components to communicate.

The research team envisioned an AI ecosystem applicable to office automation, robotics, the Internet of Things, and other domains. Accordingly, their TaskMatrix.AI can perform various digital and physical tasks, give interpretable responses, and learn continuously.

TaskMatrix.AI has four key components: a conversational foundation model that understands user inputs across various modalities (such as text and images) and generates executable action code as input for APIs; an API platform that holds a vast repository of APIs and their documentation; an API selector that chooses the most suitable APIs for the foundation model; and an action executor that executes the code given by the model.
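To make the division of labor among the four components concrete, here is a minimal Python sketch. It is not the team's implementation: the class and function names, the three toy APIs, the keyword-overlap selector, and the hard-coded stand-in for the foundation model are all assumptions made for illustration. A real system would use a large model to write the action code and a far richer API repository.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Api:
    """One entry in the (hypothetical) API platform."""
    name: str
    doc: str                     # documentation the selector searches
    func: Callable[..., object]  # the callable behind the API


class ApiPlatform:
    """Repository of APIs and their documentation."""

    def __init__(self) -> None:
        self._apis: Dict[str, Api] = {}

    def register(self, api: Api) -> None:
        self._apis[api.name] = api

    def all(self) -> List[Api]:
        return list(self._apis.values())

    def get(self, name: str) -> Api:
        return self._apis[name]


def select_apis(platform: ApiPlatform, request: str, top_k: int = 2) -> List[Api]:
    """Toy API selector: rank APIs by words shared between request and docs."""
    words = set(request.lower().split())
    scored = [(len(words & set(api.doc.lower().split())), api)
              for api in platform.all()]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [api for _, api in scored[:top_k]]


Plan = List[Tuple[str, dict]]


def stub_foundation_model(request: str, candidates: List[Api]) -> Plan:
    """Stand-in for the foundation model (assumption: rules, not a real LLM).

    Returns "action code" as a list of (api_name, keyword_arguments) steps.
    """
    plan: Plan = []
    for api in candidates:
        if api.name == "resize_image":
            plan.append(("resize_image", {"width": 2048, "height": 4096}))
        elif api.name == "caption_image":
            plan.append(("caption_image", {}))
    return plan


def execute(platform: ApiPlatform, plan: Plan) -> List[object]:
    """Action executor: call each planned API with its arguments, in order."""
    return [platform.get(name).func(**kwargs) for name, kwargs in plan]


platform = ApiPlatform()
platform.register(Api(
    "resize_image",
    "resize or extend an image to a given width and height",
    lambda width, height: f"image resized to {width}x{height}",
))
platform.register(Api(
    "caption_image",
    "describe an image with a caption",
    lambda: "a pink flower on a green background",
))
platform.register(Api(
    "insert_text",
    "insert text into a slide",
    lambda text: f"inserted {text!r}",
))

request = "extend the image to 2048 x 4096"
candidates = select_apis(platform, request)
plan = stub_foundation_model(request, candidates)
results = execute(platform, plan)
print(results)
```

In this sketch the selector narrows the repository to the APIs whose documentation best matches the request, so the model only has to plan over a short candidate list. That is also why documentation quality matters: the selector and the model both depend on it.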

As the ecosystem evolves, API developers can improve the documentation based on user feedback.

The team demonstrated the use of TaskMatrix.AI for processing images and automatically making PowerPoint slides.

During the image processing task, a human interacted with TaskMatrix.AI by typing natural language instructions for complex visual tasks such as image generation, editing, and description. TaskMatrix.AI demonstrated its ability to understand human intentions through text-based inputs and provided satisfactory output.

For example, with a tiny input image of a pink flower with a green background and a single instruction to "extend it to 2048 × 4096," TaskMatrix.AI generated a convincing image of vibrant, colorful flowers against lush green leaves through question-answering, captioning, and object replacement APIs.

The PowerPoint automation task required TaskMatrix.AI to create a set of slides, each introducing a different tech company. ChatGPT served as the foundation model for understanding complex user instructions, such as inserting text, resizing and relocating images, and changing the theme for the PowerPoint slides. For example, TaskMatrix.AI successfully inserted and resized five company logos, which it obtained from the Internet, by calling several relevant APIs.

Despite the preliminary validation of TaskMatrix.AI, the team pointed out some challenges ahead, such as finding and adjusting a powerful foundation model, building and maintaining an ideal API platform, and addressing user-level concerns like data security, privacy, and customization needs.

More information: Yaobo Liang et al., TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs, Intelligent Computing (2023). DOI: 10.34133/icomputing.0063

Provided by Intelligent Computing

Citation: TaskMatrix.AI: Making big models do small jobs with application programming interfaces (2024, March 11) retrieved 17 July 2024 from https://techxplore.com/news/2024-03-taskmatrixai-big-small-jobs-application.html

