A new machine learning model, TweetyBERT, automatically segments and classifies canary vocalizations with expert-level accuracy. It offers a scalable platform for neuroscience, provides insights into the neural basis of how the brain learns and produces language, and has potential applications for understanding animal vocalization more broadly. The study by University of Oregon researchers appears in the scientific journal Patterns.
"Current AI methods for analyzing animal vocalizations require human-labeled training data, a slow and labor-intensive process. We developed TweetyBERT, a self-supervised neural network for analyzing birdsong. It can rapidly process unlabeled vocal recordings, identify communication units, and annotate sequences," says Tim Gardner, associate professor of bioengineering at the University of Oregon's Knight Campus.
Neuroscientists study canaries, a type of songbird, because of their remarkable ability to learn complex and lengthy songs throughout their lives, providing a window into the neural basis of complex learned behaviors. George Vengrovski, a graduate student in Gardner's lab, developed TweetyBERT as a means of automatically annotating the songs of canaries, which consist of 30 to 40 distinct syllables strung into sequences. He says it may change our understanding of how the brain produces speech.
The tool adapts BERT, a transformer-based language AI architecture from the same family as the models behind large language models like ChatGPT, to handle the unique acoustic structure of birdsong. This self-supervised neural network is trained to predict masked, or hidden, fragments of audio without any human supervision or labels, and it autonomously learns the behavioral units of song, such as notes, syllables, and phrases, performing on par with expert annotators. This ability to classify and annotate songs quickly, finding differences across individuals and tracking how songs change over time, can help neuroscientists uncover the neural underpinnings of how the brain learns and produces language.
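The masked-prediction idea described above can be illustrated with a small sketch: hide a random subset of time frames in a spectrogram and score a model on how well it reconstructs them. This is a toy illustration of the general training objective, not TweetyBERT's actual code; the function names, masking fraction, and the trivial mean-frame "predictor" are all hypothetical.

```python
import numpy as np

def mask_frames(spectrogram, mask_frac=0.3, rng=None):
    """Randomly zero out a fraction of time frames, mimicking the
    masked-prediction pretraining objective (illustrative sketch only)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n_frames = spectrogram.shape[0]
    n_mask = int(n_frames * mask_frac)
    hidden_idx = rng.choice(n_frames, size=n_mask, replace=False)
    masked = spectrogram.copy()
    masked[hidden_idx] = 0.0  # hide these frames from the model
    return masked, hidden_idx

# Toy spectrogram: 100 time frames x 64 frequency bins of random values.
spec = np.random.default_rng(1).random((100, 64))
masked_spec, hidden_idx = mask_frames(spec)

# Self-supervised loss: reconstruct the hidden frames from the visible ones.
# A trivial baseline predicts the mean visible frame for every hidden slot;
# a real transformer would exploit temporal context to do far better.
visible = np.delete(spec, hidden_idx, axis=0)
prediction = visible.mean(axis=0)
loss = np.mean((spec[hidden_idx] - prediction) ** 2)
```

Because the loss depends only on the recording itself, no human labels are needed; in the real model, learning to fill in these gaps is what drives the network to discover notes, syllables, and phrases on its own.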
Beyond neuroscience, TweetyBERT could, with modification, be applied to natural bird populations, identifying changes in vocal patterns that might reveal how birds are responding to expanding human infrastructure and climate change.
"We built this for canaries, but the underlying approach isn't species-specific, and the world is full of birds whose vocal behavior we're barely tracking. With some modifications, the applications of TweetyBERT start to look very different," says Gardner.
The underlying approach behind TweetyBERT is already being used for dolphins and whales, suggesting it could extend well beyond birds and deepen our understanding of animal communication more broadly.