A University of Maine study compared how well artificial intelligence models and human clinicians handled complex or sensitive medical cases.
The study, published in the Journal of Health Organization and Management in May, evaluated more than 7,000 anonymized medical queries from the United States and Australia. The findings outline where the technology shows promise and what limitations need to be addressed before AI is unleashed on patients, and they may inform the future development of AI tools, clinical procedures and public policy. The study also informs efforts to use AI to support healthcare professionals at a time when workforce shortages are growing and clinician burnout is increasing.
The results showed that most AI-generated responses aligned with expert standards for accuracy, especially on factual and procedural queries, but that the models often struggled with "why" and "how" questions.
The study also found that while responses were consistent within a given session, inconsistencies appeared when users posed the same questions in later tests. These discrepancies raise concerns, particularly when a patient's health is at stake. The findings add to a growing body of evidence that will define AI's role in healthcare.
"This isn't about replacing doctors and nurses," said C. Matt Graham, author of the study and associate professor of information systems and security management at the Maine Business School. "It's about augmenting their abilities. AI can be a second set of eyes; it can help clinicians sift through mountains of data, recognize patterns and offer evidence-based recommendations in real time."
The study also compared health metrics, including patient satisfaction, cost and treatment efficacy, across both countries. In Australia, which has a universal healthcare model, patients reported higher satisfaction and paid roughly one-quarter of the cost compared to those in the U.S., where patients also waited twice as long to see providers. Graham notes in the study that health system, regulatory and cultural differences like these will ultimately influence how AI is received and used, and that models should be trained to account for these variations.
Artificial emotional intelligence
While the accuracy of a diagnosis matters, so does the way it is delivered. In the study, AI responses frequently lacked the emotional engagement and empathetic nuance often conveyed by human clinicians.
The length of AI responses was strikingly consistent, with most falling between 400 and 475 words. Responses from human clinicians showed far more variation, with more concise answers written in response to simpler questions.
Vocabulary analysis revealed that AI regularly used clinical terms in its responses, which some patients may find hard to understand or insensitive. In situations involving topics such as mental health or terminal illness, AI struggled to convey the compassion that is critical to effective patient-provider relationships.
"Healthcare professionals offer healing that is grounded in human connection, through sight, touch, presence and communication — experiences that AI cannot replicate," said Kelley Strout, associate professor of UMaine's School of Nursing, who was not involved in the study. "The synergy between AI and clinicians' judgment, compassion and application of evidence-based practice has the potential to transform healthcare systems but only if accompanied by rigorous standards, ethical frameworks and safeguards to monitor for errors and unintended consequences."
A stretched health system
The study arrives amid widespread and growing shortages in the U.S. healthcare workforce. Across the country, patients face long wait times, high costs and a shortage of primary care and specialty providers. These barriers are particularly acute in rural regions, where limited access often leads to delayed diagnoses and worsening health outcomes.
A report published by the Health Resources and Services Administration in 2024 projected that nonmetro areas will face a 42% shortage of primary care physicians by 2037. While a growing number of nurse practitioners and physician assistants are stepping in to fill the gap, demand for care is growing faster. Between 2022 and 2037, the population of people 65 and older in the U.S. is projected to increase 54%, a trend with significant implications for the demand for health services.
Strout said that while AI could help improve patient access and alleviate challenges — such as burnout, which affects more than half of primary care physicians in the U.S. — its use must be carefully approached.
Prioritizing providers and patients
AI-powered tools could support round-the-clock virtual assistance and complement provider-to-patient communication through channels like online patient portals, which have skyrocketed in popularity since 2020. The technology, however, also raises fears of job displacement, and experts warn that rapid implementation without ethical guardrails may exacerbate disparities and compromise care quality.
"Technology is only one part of the solution," said Graham. "We need regulatory standards, human oversight and inclusive datasets. Right now, most AI tools are trained on limited populations. If we're not careful, we risk building systems that reflect and even magnify existing inequalities."
Strout added that as healthcare systems integrate AI into clinical practice, administrators must ensure that these tools are designed with patients and providers in mind. Lessons from past technology integrations, which at times failed to enhance care delivery, offer valuable guidance for AI developers.
"We must learn from past missteps. The electronic health record (EHR), for example, was largely developed around billing models rather than patient outcomes or provider workflows," Strout said. "As a result, EHR systems have often contributed to frustration among providers and diminished patient satisfaction. We cannot afford to repeat that history with AI."
Other factors, such as accountability for mistakes and patient privacy, are top of mind for medical ethicists, policymakers and AI researchers. Solutions to these ethical questions may vary depending on where they are adopted, to account for different cultural and regulatory environments.
As AI continues to develop, many experts believe it will improve the efficiency of the services providers offer patients and support their decision-making. The study's findings support the growing consensus that AI's limited ethical and emotional adaptability means human clinicians remain indispensable. Graham says that, in addition to improving the performance of AI tools, future research should focus on managing ethical risks and adapting AI to diverse healthcare contexts to ensure the technology augments rather than undermines human care.
"Technology should enhance the humanity of medicine, not diminish it," Graham said. "That means designing systems that support clinicians in delivering care, not replacing them altogether."