The language capabilities of today's artificial intelligence systems are astonishing. We can now engage in natural conversations with systems like ChatGPT, Gemini, and many others, with a fluency nearly comparable to that of a human being. Yet we still know very little about the internal processes in these networks that lead to such remarkable results.
A new study published in the Journal of Statistical Mechanics: Theory and Experiment (JSTAT) reveals a piece of this mystery. It shows that when small amounts of data are used for training, neural networks initially rely on the position of words in a sentence. However, as the system is exposed to enough data, it transitions to a new strategy based on the meaning of the words. The study finds that this transition occurs abruptly, once a critical data threshold is crossed — much like a phase transition in physical systems. The findings offer valuable insights for understanding the workings of these models.
Just like a child learning to read, a neural network starts by understanding sentences based on the positions of words: depending on where words are located in a sentence, the network can infer their relationships (are they subjects, verbs, objects?). However, as the training continues — the network "keeps going to school" — a shift occurs: word meaning becomes the primary source of information.
This, the new study explains, is what happens in a simplified model of the self-attention mechanism — a core building block of transformer language models, like the ones we use every day (ChatGPT, Gemini, Claude, etc.). A transformer is a neural network architecture designed to process sequences of data, such as text, and it forms the backbone of many modern language models. Transformers specialize in understanding relationships within a sequence and use the self-attention mechanism to assess the importance of each word relative to the others.
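To make the mechanism concrete, here is a minimal sketch of generic single-head dot-product self-attention in NumPy. This is an illustration of the standard mechanism, not the authors' solvable model from the paper; the matrix names (`Wq`, `Wk`, `Wv`) and dimensions are illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: each row becomes a probability distribution.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head dot-product self-attention over a sequence X (n_tokens x d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])  # similarity of each token to every other
    weights = softmax(scores, axis=-1)      # each row sums to 1: "how much to attend"
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(4, d))  # 4 token embeddings, e.g. "Mary eats the apple"
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each output row is a weighted mixture of all token representations, with the weights expressing how strongly each word "attends" to the others.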
"To assess relationships between words," explains Hugo Cui, a postdoctoral researcher at Harvard University and first author of the study, "the network can use two strategies, one of which is to exploit the positions of words." In a language like English, for example, the subject typically precedes the verb, which in turn precedes the object. "Mary eats the apple" is a simple example of this sequence.
"This is the first strategy that spontaneously emerges when the network is trained," Cui explains. "However, in our study, we observed that if training continues and the network receives enough data, at a certain point — once a threshold is crossed — the strategy abruptly shifts: the network starts relying on meaning instead."
"When we designed this work, we simply wanted to study which strategies, or mix of strategies, the networks would adopt. But what we found was somewhat surprising: below a certain threshold, the network relied exclusively on position, while above it, only on meaning."
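The two strategies Cui describes can be sketched in code: attention scores computed from positional encodings alone depend only on where words sit, while scores computed from word embeddings alone depend only on what the words mean. This is a toy illustration under assumed encodings (one-hot positions, random "semantic" vectors), not the paper's actual model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
n, d = 4, 8
semantic = rng.normal(size=(n, d))  # word-meaning embeddings (illustrative)
positional = np.eye(n, d)           # one-hot position encodings (illustrative)
W = rng.normal(size=(d, d))

# Positional strategy: scores depend only on token positions,
# so swapping the words in the sentence would not change them.
A_pos = softmax((positional @ W) @ positional.T)

# Semantic strategy: scores depend only on word meanings,
# so the same word gets the same treatment wherever it appears.
A_sem = softmax((semantic @ W) @ semantic.T)
```

The study's finding is that a trained network does not blend these two score matrices smoothly: below the data threshold it effectively uses only the first kind, above it only the second.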
Cui describes this shift as a phase transition, borrowing a concept from physics. Statistical physics studies systems composed of enormous numbers of particles (like atoms or molecules) by describing their collective behavior statistically. Similarly, neural networks — the foundation of these AI systems — are composed of large numbers of "nodes," or neurons (named by analogy to the human brain), each connected to many others and performing simple operations. The system's intelligence emerges from the interaction of these neurons, a phenomenon that can be described with statistical methods.
This is why we can speak of an abrupt change in network behavior as a phase transition, similar to how water, under certain conditions of temperature and pressure, changes from liquid to gas.
"Understanding from a theoretical viewpoint that the strategy shift happens in this manner is important," Cui emphasizes. "Our networks are simplified compared to the complex models people interact with daily, but they can give us hints to begin to understand the conditions that cause a model to stabilize on one strategy or another. This theoretical knowledge could hopefully be used in the future to make the use of neural networks more efficient and safer."
The research by Hugo Cui, Freya Behrens, Florent Krzakala, and Lenka Zdeborová, titled "A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention", is published in JSTAT as part of the Machine Learning 2025 special issue and is included in the proceedings of the NeurIPS 2024 conference.