Flinders University experts are warning that artificial intelligence (AI) must be carefully evaluated and governed before it is adopted widely in healthcare, saying rapid advances do not automatically translate into safe use for patients.
In an expert commentary titled 'AI can reason like a physician; what comes next?', published in Science, Flinders researchers caution that while new AI systems show impressive capabilities, strong results in controlled studies do not mean they are ready for routine use in hospitals or clinics.
The authors say there is an urgent need to understand how emerging AI tools can be safely integrated into everyday clinical practice, with patient outcomes remaining the central focus.
Despite these warnings, the researchers acknowledge that recent advances in AI create genuine opportunities to support doctors, particularly in busy and high-pressure care settings.
The commentary reviews new research showing that advanced reasoning-based AI systems can work through diagnostic scenarios step by step and, in some cases, closely match or even exceed the diagnostic performance of experienced doctors.
Erik Cornelisse, a PhD candidate at Flinders University and co-author of the commentary, says this shift marks a move from simple question-answering tools towards algorithms capable of seemingly human-like clinical reasoning on text-based tasks.
However, the Flinders team stresses that real world medical care involves far more than text-based reasoning or test performance.
They say clinical practice depends on physical examination, listening to patients, understanding medical and social context, and taking responsibility for outcomes, elements that current AI systems cannot safely provide on their own.
"Health care decisions are complex, high stakes, and deeply human, and accuracy alone, particularly on just text-based cases, does not make a system safe for patients," says Mr Cornelisse from the College of Medicine and Public Health.
Senior author Associate Professor Ash Hopkins, an NHMRC Investigator and leader of Flinders' Clinical Cancer Epidemiology Lab, says modern healthcare relies on judgement, accountability, and ethical oversight.
"AI systems have demonstrated that they can reason through clinical problems with similar performance to doctors, notably on the same scenarios used to train clinicians themselves. This presents genuine opportunities to support clinicians in the future," says Associate Professor Hopkins.
"Multiple stakeholders are currently working on frameworks for the legal, professional, and moral responsibility for AI's decisions, and presently there is a critical need for deliberate and controlled integration into clinical care."
The commentary highlights known risks linked to poorly evaluated systems, including bias, inequitable care, and unintended patient harm.
"History shows that algorithms can worsen outcomes when deployed without sufficient safeguards and can amplify problems as easily as they solve them, particularly when systems are trained on incomplete or unrepresentative data," says Mr Cornelisse.
Looking ahead, the Flinders researchers argue that enthusiasm for medical AI must be matched by strong governance and clearer standards for evaluation.
"We do not allow doctors to practise without supervision and evaluation, and AI should be held to comparable standards," says Mr Cornelisse.
The researchers stress that improvement in real patient outcomes, not exam scores, benchmarks, or demonstrations, must be the true measure of success.
Associate Professor Hopkins says AI holds enormous promise but must be applied responsibly.
"Patients deserve technology that improves care in the real world, not systems that only look impressive in studies," he says.
"With careful design, strong oversight, and rigorous evaluation, AI could become a powerful tool to deliver safer, fairer, and more effective care across health systems worldwide," concludes Associate Professor Hopkins.
The paper, 'AI can reason like a physician; what comes next?', by Ashley M. Hopkins and Erik Cornelisse, is published in Science (DOI: 10.1126/science.aeg8766; link live after embargo lifts).