Artificial intelligence is increasingly able to simulate human behavior and answer online surveys and political polls, putting the reliability of survey-based research at risk. The consequences could be serious not only for science - online surveys are a cornerstone of modern social-science research - but also for policy and for people's participation in democratic processes, since surveys underpin political polling. This is the warning raised in a comment in Nature by three researchers working at the IMT School for Advanced Studies Lucca and the University of Cambridge. The researchers propose possible solutions to a problem that is becoming a "cat-and-mouse" game for survey designers, as AI systems rapidly adapt to existing detection methods.
Studies suggest that a non-negligible share of survey responses, from 4 to 90 percent in certain populations, may be false or fraudulent. At the same time, a parallel industry of autonomous AI agents has rapidly emerged, capable of completing polls and questionnaires with little to no human supervision. Platforms such as Amazon Mechanical Turk, Prolific, and Lucid have long allowed researchers to collect data quickly and cheaply. But the same systems are now vulnerable to widespread manipulation. "The problem is that the tools used until now to distinguish humans from non-humans are no longer effective. We can't tell whether the responder is a person or not. And the problem is that all the data is potentially contaminated", explains Folco Panizza, a researcher at the IMT School and one of the authors of the commentary. Even small amounts of polluted data can seriously distort results: when the effects under study are small, as little as 3–7 percent of fraudulent responses can invalidate statistical conclusions. "With AI agents, the scale of the problem has fundamentally changed", the authors write.
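To see why even a small share of automated responses matters, consider a toy simulation: if bots answer a two-group survey experiment without regard to the experimental condition, they dilute any true difference between the groups. The Python sketch below uses an invented 5 percent contamination rate and a small illustrative effect size; the numbers are assumptions chosen for illustration, not figures from the Nature comment.

```python
# Toy simulation: a small share of "bot" responses that ignore the experimental
# condition dilutes a small true effect. Sample size, effect size and the 5%
# contamination rate are illustrative assumptions only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_per_group = 1000      # respondents per condition
true_effect = 0.12      # small standardized effect (Cohen's d)
contamination = 0.05    # 5% of responses come from bots

def simulate(contaminated: bool):
    """Run one simulated survey; return (p-value, estimated group difference)."""
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)
    if contaminated:
        # Bots answer identically regardless of condition (noise centred on 0),
        # so part of the treated group no longer carries the true effect.
        k = int(contamination * n_per_group)
        control[:k] = rng.normal(0.0, 1.0, k)
        treated[:k] = rng.normal(0.0, 1.0, k)
    return stats.ttest_ind(treated, control).pvalue, treated.mean() - control.mean()

runs = 500
clean = [simulate(False) for _ in range(runs)]
dirty = [simulate(True) for _ in range(runs)]

print("mean estimated effect, clean data:      ", round(float(np.mean([d for _, d in clean])), 3))
print("mean estimated effect, 5% bot responses:", round(float(np.mean([d for _, d in dirty])), 3))
print("share of runs reaching p < 0.05, clean:      ", round(float(np.mean([p < 0.05 for p, _ in clean])), 2))
print("share of runs reaching p < 0.05, contaminated:", round(float(np.mean([p < 0.05 for p, _ in dirty])), 2))
```

The estimated effect shrinks toward zero and fewer runs reach statistical significance; for studies that are already marginal, that shift can be the difference between a publishable finding and a null result.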
Unlike earlier bots, today's AI systems can generate fluent, coherent, and context-aware responses, often outperforming humans on tasks designed to detect low-quality or automated participation. Traditional safeguards include CAPTCHAs (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart"), the image- and text-recognition tests frequently encountered on the web, and attention checks, questions or instructions inserted into online questionnaires to verify that participants are reading carefully rather than answering at random.
But these methods are increasingly ineffective against advanced models. The authors propose a multi-pronged shift in strategy. One approach involves analyzing response patterns and behavioural "paradata", such as typing speed, keystrokes and copy–paste behaviour, to identify responses that are statistically unlikely to be human. Another is a greater reliance on vetted, probability-based survey panels that verify participants' identities.
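As a rough sketch of what such paradata screening might look like in practice, the snippet below flags responses whose timing and input behaviour seem unlikely to be human. The data fields, thresholds and flagging rules are hypothetical; a real screen would be calibrated against pilot data from known human respondents.

```python
# Sketch of a paradata screen, assuming the survey platform exports per-response
# timing and input-event logs. Field names and thresholds are hypothetical.
from dataclasses import dataclass, field
from statistics import pstdev

@dataclass
class Response:
    respondent_id: str
    completion_seconds: float                                     # total time on the questionnaire
    keystroke_gaps_ms: list[float] = field(default_factory=list)  # intervals between keystrokes
    pasted_open_ended: bool = False                               # copy-paste detected in a free-text answer

def paradata_flags(r: Response) -> list[str]:
    """Return human-readable reasons a response looks unlikely to be human."""
    flags = []
    if r.completion_seconds < 60:                        # far below plausible reading time
        flags.append("implausibly fast completion")
    if len(r.keystroke_gaps_ms) >= 10 and pstdev(r.keystroke_gaps_ms) < 5:
        flags.append("near-constant typing rhythm")      # humans type irregularly
    if r.pasted_open_ended:
        flags.append("pasted free-text answer")
    return flags

suspect = Response("r_0193", completion_seconds=48,
                   keystroke_gaps_ms=[102, 101, 103, 102, 101, 102, 103, 101, 102, 102],
                   pasted_open_ended=True)
print(suspect.respondent_id, "->", paradata_flags(suspect))
```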
But, most provocatively, the authors suggest flipping the logic of bot detection entirely: designing tasks that exploit the limits of human reasoning rather than the weaknesses of AI. "Machines are very good at imitating the behavior of human beings, but they are much less good at imitating the mistakes humans make," says Panizza. New methods to unmask bots could therefore include classic probability puzzles, rapid estimation problems and perceptual tasks that humans typically struggle with under time pressure, but which AI systems solve easily and confidently. "If an agent answers too well, that itself can become a signal," explains Panizza. Forcing AI systems to deliberately 'fail like a human' could prove far harder for them than simply answering correctly.
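A minimal sketch of this "too good to be human" idea, assuming a small battery of classic trap questions (cognitive-reflection-style items) with known normative answers: the specific items, the timing cut-off and the human accuracy figures in the comments are illustrative assumptions, not a validated instrument.

```python
# Sketch of the "too good" heuristic: count how often a respondent gives the
# normatively correct answer, quickly, on items that humans under time pressure
# usually get wrong. Items, answer keys and cut-offs are illustrative only.

# item_id -> (correct answer, rough share of humans answering correctly under time pressure)
TRAP_ITEMS = {
    "bat_and_ball": ("0.05", 0.2),   # classic CRT item: the ball costs 5 cents, not 10
    "lily_pads":    ("47", 0.3),     # doubling pond: half covered on day 47, not day 24
    "widgets":      ("5", 0.4),      # 100 machines make 100 widgets in 5 minutes
}

def too_good_score(answers: dict[str, str], seconds: dict[str, float]) -> float:
    """Fraction of trap items answered correctly in under 10 seconds each."""
    hits = 0
    for item, (correct, _human_rate) in TRAP_ITEMS.items():
        if answers.get(item) == correct and seconds.get(item, 999) < 10:
            hits += 1
    return hits / len(TRAP_ITEMS)

# A respondent who nails every trap item almost instantly is worth a second look;
# most humans miss at least one even without a time limit.
answers = {"bat_and_ball": "0.05", "lily_pads": "47", "widgets": "5"}
seconds = {"bat_and_ball": 3.1, "lily_pads": 2.8, "widgets": 3.4}
if too_good_score(answers, seconds) >= 0.9:
    print("flag: answers too fast and too accurate for a typical human")
```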
"This is an extremely tricky problem because the machines learn from our efforts to detect them. Another issue is that people can get paid to fill in surveys online, which can create the incentive of completing them as quickly and easily as possible. So it's not just the technology, but the combination of easily accessible advanced AI and financial incentives. No one solution is likely to be sufficient, and "what works" will probably keep changing as AI models and survey pollution methods advance".
According to the authors, researchers, survey platforms and funders should urgently rethink standards for data integrity, combining higher-quality sampling, smarter detection tools and greater transparency. As AI capabilities continue to advance, protecting the credibility of social-science research will require constant adaptation and a willingness to rethink some of our most basic assumptions about what it means to collect data from humans.