Genre distribution of a balanced 1k example subset of MusicCaps, according to an AudioSet classifier. Credit: arXiv (2023). DOI: 10.48550/arxiv.2301.11325

A team of engineers at Google is demonstrating a new music generation AI system called MusicLM. In their paper posted on the arXiv preprint server, the group claims that the new system establishes a new level of composition and high fidelity in songs produced by computers.

The creation of MusicLM is part of a wave of deep-learning AI applications developed with the goal of replicating human mental abilities, such as writing papers, painting, taking tests, talking or creating mathematical proofs.

Several other efforts have been made to create song generation applications, including Dance Diffusion, Jukebox and Riffusion. But each has clear limitations, and the songs they produce would never be mistaken for ones written by a human composer.

In this new effort, the team at Google claims their new system outperforms prior systems, both in the quality of the songs produced and in their adherence to text prompts. Google provides many examples on the Google Research site. One example prompt is "induce the experience of being lost in space." As expected, techno songs tend to turn out better than those that replicate classic songs played on real instruments.

The system was taught to create music by training it on 280,000 hours of songs played by humans. It can create songs of variable length: a quick riff, for example, or an entire song. It can even go beyond that by creating songs with movements, as are often found in symphonies, to create the feeling of a story. The system can also accept specifics, such as requests for certain instruments or a particular genre. It can also generate vocals if asked or, more accurately, vocal sounds, though the results tend to sound like a robot choir that does not know the lyrics.

Google will not be releasing the app for general use. Testing showed that approximately 1% of the music generated by the system is copied directly from a human artist. The company is therefore wary of misappropriation of content and of lawsuits.

More information: Andrea Agostinelli et al, MusicLM: Generating Music From Text, arXiv (2023). DOI: 10.48550/arxiv.2301.11325

Journal information: arXiv