Imagine a self-driving car navigating downtown traffic. To avoid a collision, it must judge whether the pedestrian at the corner is about to cross. Or consider an investment algorithm trading stocks. It needs to anticipate how human investors will react to news before making a move.
In both cases, machines must do more than compute; they must understand human behavior. But today's general-purpose AI models, like GPT or Llama, aren't built for that.
Enter Be.FM, short for Behavioral Foundation Model, a new AI system developed by researchers at the University of Michigan, Stanford University and MobLab. Be.FM is one of the first AI systems designed specifically to predict, simulate and reason about human actions.
Unlike traditional models that rely on generic text corpora, Be.FM is trained on data specific to behavioral science, ranging from controlled experiments to surveys and academic studies.

"We're not feeding it Wikipedia," said Yutong Xie, a doctoral student in information science at U-M and the study's lead author. "We built a behavioral dataset (more than 68,000 subjects from experimental data, approximately 20,000 survey respondents and thousands of scientific studies) to help the model reason about why people act the way they do."
That specialized training gives Be.FM an edge over general-purpose AIs, which often overlook minority behaviors or misread complex social cues. For instance, the team's prior work, published in the Proceedings of the National Academy of Sciences, shows that off-the-shelf AIs tend to imitate average human behavior but fail to cover the diversity of human behavioral distributions. More importantly, Be.FM demonstrates a range of emerging capabilities (skills that researchers did not explicitly program) that fall into four key application areas.
The first and most visible strength of Be.FM is its ability to predict human behavior in real-life situations. For example, Xie described a scenario where a banker offers a few investment options to a group. Be.FM can be used to predict which choices people are likely to prefer and how many will cooperate or take risks. This behavioral forecasting could support economic modeling, product testing or public policy analysis, offering a way to simulate group behavior before launching costly real-world trials.
Be.FM can also deduce psychological traits and demographic information from behavior or background data. In practice, this might mean inferring whether a person is extroverted or agreeable from their age, gender and other demographic data, or estimating someone's age from their personality traits. This capability could help researchers segment users more effectively, guide personalized interventions or inform product design.
Human behavior often shifts in response to context, such as changes in timing, social norms or environmental signals. Be.FM can help detect and reason about these drivers.
For instance, when user behavior in an app changes from January to February, Be.FM can help identify what contextual factors might be influencing the shift, such as a design update, a seasonal trend or changes in how information is framed. By analyzing patterns across scenarios, the model can surface insights about the environmental cues shaping decision-making.
This makes it a potentially valuable tool for researchers, designers and policy analysts seeking to understand why behaviors change and how to respond effectively.
Finally, Be.FM can organize and apply behavioral science knowledge to support research workflows. Built on a large language model architecture, it can generate new research ideas, summarize literature or solve applied behavioral economics problems.
For scholars and practitioners, it could become a tool to brainstorm hypotheses, plan studies or even simulate scenarios before field testing.
Across these four categories, Be.FM consistently outperformed commercial and open-source models like GPT-4o and Llama in matching human behavior, particularly in tasks such as personality prediction and scenario simulation. Its predictions more closely reflected real-world patterns, especially at the population level.
Still, the model has limits: its performance beyond these four areas remains untested. It is not yet designed to forecast large-scale political events or predict outcomes like elections or peace deals.

The research team is already working to expand Be.FM's domain coverage.
"Behavior in health, education, even geopolitics: the goal is to make Be.FM useful wherever people make decisions," said Qiaozhu Mei, U-M professor of information and the corresponding author of the study.
The Be.FM models are available upon request. The team invites researchers and practitioners to use the model and share their feedback.