Have you ever wanted to hear a concerto for piano and harp, in the style of Mozart by way of Katy Perry? Well, why not? Because now you can, with OpenAI’s latest (and blessedly not potentially disastrous) creation, MuseNet. This machine learning model produces never-before-heard music based on its knowledge of artists and a few bars to fake it with.
This is far from unprecedented (computer-generated music has been around for decades), but OpenAI’s approach appears to be flexible and scalable, producing music informed by a variety of genres and artists, and cross-pollinating them as well in a sort of auditory style transfer. It shares a lot of DNA with GPT-2, the language model “too dangerous to release,” but the danger of unleashing unlimited music on the world seems small compared with undetectable computer-generated text.
MuseNet was trained on works from dozens of artists, from well-known historical figures like Chopin and Bach to (comparatively) modern artists like Adele and the Beatles, plus collections of African, Arabic and Indian music. Its complex machine learning system paid a great deal of “attention,” which in AI work is a technical term for, essentially, the amount of context the model uses to inform the next step in its creation.
Take, for example, a piece by Mozart. If the model only attended to a couple of seconds at a time, it would never be able to learn the larger musical structures of a symphony as it grew and receded, changing tone and instrumentation. But the model was given enough virtual brainspace to hold onto about four full minutes of music, more than enough to grasp something like a slow start building to a grand finale, or a basic verse-chorus-verse structure.
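The context-window idea above can be made concrete with a toy sketch. This is purely illustrative, not OpenAI's implementation: it just shows that when the window is short, everything before it is invisible to the next prediction.

```python
# Toy illustration of a fixed attention window (not MuseNet's actual code):
# only the most recent `window` tokens are visible when predicting the next one.
def visible_context(history, window):
    """Return the slice of history the model may attend to."""
    return history[-window:] if window > 0 else []

notes = list(range(100))  # stand-in for 100 note tokens
# A short window forgets the opening bars; a long one keeps the whole piece.
print(len(visible_context(notes, 16)))
print(len(visible_context(notes, 500)))
```

With a window of 16 the model sees only the last 16 tokens; with a window larger than the piece, it sees all 100, which is the difference between noodling bar to bar and tracking a symphony-scale arc.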
Theoretically, that is. The model doesn’t actually understand music theory, just that this note followed that note, which followed this one, which tends to come after this type of chord, and so on. Its creations are elementary in their structure, but listening to them, it’s pretty clear that it is indeed successfully aping the music it ingested.
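MuseNet itself is a large transformer, but the “this note followed that note” intuition can be sketched with a toy first-order model. Everything below is invented for illustration; it learns only which note tends to follow which, then samples a continuation.

```python
import random
from collections import defaultdict

# Toy next-note model (far simpler than MuseNet): count which note follows
# which in the training melody, then sample successors to extend a seed.
def train(sequence):
    """Build a note -> [observed next notes] transition table."""
    table = defaultdict(list)
    for cur, nxt in zip(sequence, sequence[1:]):
        table[cur].append(nxt)
    return table

def continue_melody(table, seed, length, rng=None):
    """Extend a seed note by repeatedly sampling a likely successor."""
    rng = rng or random.Random(0)
    out = [seed]
    for _ in range(length):
        successors = table.get(out[-1])
        if not successors:  # dead end: fall back to the seed note
            successors = [seed]
        out.append(rng.choice(successors))
    return out

melody = ["C", "E", "G", "E", "C", "E", "G", "C"]
table = train(melody)
print(continue_melody(table, "C", 8))
```

The output is statistically plausible but structurally shallow, which is roughly the criticism the article makes of MuseNet's pieces: locally convincing, with no real grasp of theory.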
What makes it remarkable is that a single model does this reliably across so many types of music. AIs have been built, like the delightful Google Doodle for Bach’s birthday a couple of weeks back, that focus on a specific artist or genre. And as a comparison I’ve been listening to Generative.fm, which creates just the kind of sparse ambient music I like to listen to while I work (if you like it too, check out one of my favorite labels, Serein). But both those models have their limits very strictly defined. Not so with MuseNet.
In addition to being able to belt out endless bluegrass or baroque piano pieces, MuseNet can apply a style transfer process to combine the characteristics of both. Different parts of a work can have different attributes: in a painting you might have composition, subject, color palette and brush style, to start. Imagine a pre-Raphaelite subject and composition but with Impressionist execution. Sounds fun, right? AI models are good at this because they sort of compartmentalize these different aspects. It’s the same kind of thing in music: the note choices, cadence and other patterns of a pop song can be drawn out and used separately from its instrumentation. Why not do Beach Boys harmonies on a harp?
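OpenAI has described steering MuseNet with composer and instrumentation tokens prepended to the notes it is asked to continue, which is one way the style/content separation above gets implemented in practice. The token spellings below are invented for illustration, not MuseNet's actual vocabulary.

```python
# Schematic sketch of token-based style conditioning (token names invented):
# control tokens for style go in front of the note sequence to be continued.
def build_prompt(composer, instruments, notes):
    """Prefix a note sequence with style-control tokens."""
    controls = [f"<composer:{composer}>"] + [f"<inst:{i}>" for i in instruments]
    return controls + list(notes)

# Ask for a Chopin-flavored continuation, voiced for piano and harp.
prompt = build_prompt("chopin", ["piano", "harp"], ["C4", "E4", "G4"])
print(prompt)
```

Because style lives in the control tokens and content in the notes, swapping one without the other is exactly the "Beach Boys harmonies on a harp" trick: keep the note patterns, change the conditioning.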
It’s a little hard, however, to get a sense of the likes of Adele without her unique voice, and the rather basic synths the team has chosen cheapen the effect overall. And after listening for a bit to the “live concert” the team streamed on Twitch, I wasn’t convinced that MuseNet is the next hit machine. On the other hand, it pretty regularly hit on a good groove, especially in jazz and classical improvisations, where a bit of an off note can be played off and the rhythms don’t feel so contrived.
What’s it for? Your guess is as good as anyone’s, really. This field is quite new. MuseNet’s project lead, Christine Payne, is pleased with the model and has already found someone to use it:
As a classically trained pianist, I’m particularly excited to see that MuseNet is able to understand the complex harmonic structures of Beethoven and Chopin. I’m working now with a composer who plans to integrate MuseNet into his own compositions, and I’m excited to see where the future of human/AI co-composing will take us.
An OpenAI representative also said that the team has begun integrating works from contemporary composers who want to see how the model interprets or imitates their style.
MuseNet will be available for you to play with through mid-May, at which point it will be taken offline and adjusted based on user feedback, and soon (think weeks) it will be at least partially open-sourced. I imagine popular combinations, and ones that people listened to all the way through, will get a bit more weight in the tweaks. Here’s hoping they add a bit more expression to the MIDI rendering as well; it does often feel like these pieces are being played by an automaton. But it’s a testament to the quality of OpenAI’s work that they frequently sound perfectly good as well.