Neural audio synthesis could be the biggest innovation in recorded sound since the invention of sampling, according to a music researcher at York.
The jazz trio Sveið: Dr Federico Reuben (left), Emil Karlsen (centre) and James Mainwaring. Photo: Michael Hodges.
A recent development that lets musicians improvise live with AI-generated sound could be the biggest innovation since the advent of sampling, or perhaps even the invention of recorded sound, according to a music researcher at York.
Dr Federico Reuben is set to release Latent Imprints, a free jazz improvisation album recorded with saxophonist James Mainwaring and drummer Emil Karlsen, under the band name Sveið.
Revolutionary
But the music has a revolutionary twist: it was performed live using an emerging technology called neural audio synthesis (NAS). NAS enables musicians to improvise in a live setting with AI-generated sounds, effectively 'jamming' onstage with artificial intelligence.
Federico, an Associate Professor at the School of Arts and Creative Technologies, explains:
"NAS employs deep learning, an AI technique where programs are trained on large datasets—in this case, collections of sound recordings—to find features and patterns in the data that enable the generation of new sounds resembling those in the original dataset."
Concerns
Federico acknowledges that the implications of this technology have raised concerns among some artists, including Sir Elton John, who has recently voiced strong opposition to what he sees as inadequate regulation of AI in the creative industries.
While acknowledging the complexities surrounding copyright law, Federico emphasises that these techniques offer significant potential benefits for artists and audiences alike.
"Once people see the creative possibilities offered by these tools, I think they'll become truly excited," Federico said, describing one particular NAS technique known as 'timbre transfer'.
"With timbre transfer, for example, an AI model trained on a database of recorded speech can respond in real-time to inputs from a microphone placed in front of a drum kit. When the drummer plays, the AI generates vocal sounds mimicking the drums, creating an effect similar to beatboxing."
Mind-boggling
The result, says Federico, is "mind-boggling" because the AI will try to approximate the rhythms and characteristics of the drums, but with vocal sounds.
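For readers curious about the mechanics, the idea of one sound "driving" another can be illustrated with a deliberately simple, non-neural stand-in. The sketch below is not Federico's system and not a neural model: it only follows the loudness of a toy drum signal and uses it to shape a different, voice-like tone. In real timbre transfer, a model trained on recordings would take the place of the hypothetical `voice_like_tone` function, and it would follow far more than loudness. All names, the sample rate and the frame size are assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate for the toy signals
FRAME = 80          # assumed analysis frame: 10 ms at 8 kHz


def loudness_envelope(signal, frame=FRAME):
    """Return one RMS loudness value per complete frame of the input."""
    env = []
    for start in range(0, len(signal) - frame + 1, frame):
        chunk = signal[start:start + frame]
        env.append(math.sqrt(sum(x * x for x in chunk) / frame))
    return env


def voice_like_tone(n_samples, pitch_hz=140.0):
    """Hypothetical stand-in for a trained model: a buzzy harmonic tone."""
    weights = [1.0, 0.6, 0.4, 0.2]  # relative strength of each harmonic
    total = sum(weights)
    return [
        sum(
            w * math.sin(2 * math.pi * pitch_hz * (k + 1) * n / SAMPLE_RATE)
            for k, w in enumerate(weights)
        ) / total
        for n in range(n_samples)
    ]


def transfer(signal):
    """Shape the voice-like tone with the loudness contour of the input."""
    env = loudness_envelope(signal)
    carrier = voice_like_tone(len(env) * FRAME)
    return [carrier[n] * env[n // FRAME] for n in range(len(carrier))]


def drum_hit(n_samples, decay=0.002):
    """Toy drum hit: a 200 Hz sine that fades out exponentially."""
    return [
        math.exp(-decay * n) * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


# Two toy drum hits separated by silence, then "re-voiced".
silence = [0.0] * 800
drums = drum_hit(800) + silence + drum_hit(800) + silence
output = transfer(drums)
```

The output keeps the rhythm of the input, sounding only where the drum sounds and staying silent in between, but with a different tone colour. That is the small part of the effect that this sketch captures.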
In his free jazz trio Sveið, Federico is credited as a 'laptop improviser and live coder', and the group improvises its performances in live settings.
"I've used this technique live with several musicians, but this album marks the debut of a new band featuring Mercury Prize-nominated saxophonist James Mainwaring and Norwegian drummer Emil Karlsen."
On stage, Federico uses laptops and controllers, capturing his fellow musicians' sounds through microphones connected to his computers.
"I place a microphone in front of each musician to analyse their sound signals," he explains. "I see AI in performance as an 'entangled process of co-creation' - I'm live coding and exploring the AI models as the improvisation unfolds, reacting to what the others are playing. This exchange creates all kinds of unexpected sounds and fresh musical ideas, which really brings the performance to life."
Federico also suggests this technology could benefit other areas of the music industry. Beyond free jazz, he has ongoing research projects exploring the broader possibilities of NAS. Working with Professor Franziska Schroeder, he is investigating more embodied methods of interacting with AI models, using breath, sound, touch, movement and physiological signals from the human body instead of text prompts.
Another project, 'Lotus Code', aims to diversify AI datasets by collaborating with Japanese musicians to build collections representing Japanese musical traditions.
"A significant issue with AI companies like Udio and SunoAI is their reliance on datasets dominated by popular commercial Western music," he says. "This risks cultural and aesthetic homogenisation, which is why diversifying NAS datasets is essential."
Transformative
Federico believes NAS could revolutionise the music industry, describing it as one of the most transformative developments in the history of recorded sound.
"It's all very new and that's what makes it exciting," Federico says. "It's undoubtedly a paradigm shift in what you can do with recorded sound. When sampling emerged, it provided musicians a new avenue of exploration, eventually giving rise to entire genres like hip-hop."
"I think NAS represents a similarly transformative shift in music production and live performance. By working with AI as a collaborator - not a tool to replace musicians - it could open the door to entirely new musical genres and forms of expression."
Latent Imprints by Sveið will be released on 27 June via 577 Records.