Research Evaluates ChatGPT’s Performance on USMLE

JMIR Publications

In a recent interview posted on JMIR TV, JMIR Publications’ CEO Dr Gunther Eysenbach speaks with Dr Andrew Taylor from Yale University School of Medicine about their paper titled “How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment,” published in JMIR Medical Education.

The study examined how ChatGPT performed on the United States Medical Licensing Examination (USMLE) compared to other AI language models such as InstructGPT and GPT-3. The researchers found that ChatGPT’s performance on the exam was comparable to that of a third-year medical student in terms of medical knowledge assessment, but more importantly, it outperformed the other two models because its dialogic component enabled it to provide clear rationales for its answers.

ChatGPT’s responses were coherent and provided justifiable context. Its ability to give accurate, dialogic responses similar to those of human learners may help create an interactive learning environment for students, supporting problem-solving and reflective practice.

The interview also discusses the limitations of using ChatGPT, such as the need for structured prompts. In their conversation, Dr Eysenbach remarked that the rapid growth of ChatGPT, which has made AI accessible to end consumers, could be a major disruption and technological shift in the field of medical education. They also cited concerns about ChatGPT’s accuracy in retrieving information, such as its lack of source identification (a problem related to the phenomenon called AI hallucination), and the need for additional training or “grounding” in information sources to improve reliability.

The video interview with Dr Eysenbach and Dr Taylor is available on JMIR TV.

Dr Eysenbach commented, “There’s certainly more work to be done in specifically training ChatGPT on peer-reviewed literature, and perhaps in connecting ChatGPT with more structured databases, which are out there, like PubMed and CrossRef.”

In conclusion, there is interest in exploring how tools like ChatGPT can be used to improve health care delivery, and Dr Taylor sees potential in using such AI technology in medical education to create a more dynamic learning process for students and practitioners.

“My interest is… how we could potentially use tools like this in the health care system to deliver better and more effective care. And I think we’re going to explore potential avenues for that, and I would love to see further development of this in the medical kind of education space, and I think we will… from a student kind of learning standpoint, it becomes much more dynamic that kind of learning process,” added Dr Taylor.

JMIR Publications plans to publish a special e-collection on this topic and, through a call for papers, is inviting authors to submit new research on the use of ChatGPT and generative AI in medical education.

About JMIR Medical Education

JMIR Medical Education (JME) is an open access, PubMed-indexed, peer-reviewed journal focusing on technology, innovation, and openness in medical education. This includes e-learning and virtual training, which have gained critical relevance in the (post-)COVID world. Another focus is on how to train health professionals to use digital tools. We publish original research, reviews, viewpoints, and policy papers on innovation and technology in medical education. As an open access journal, we have a special interest in open and free tools and digital learning objects for medical education and urge authors to make their tools and learning objects freely available (we may also publish them as a Multimedia Appendix). We also invite submissions of non-conventional articles (e.g., open medical education material and software resources that are not yet evaluated but free for others to use/implement).

In our “Students’ Corner,” we invite students and trainees from various health professions to submit short essays and viewpoints on all aspects of medical education, particularly suggestions on improving medical education and suggestions for new technologies, applications, and approaches.

The journal is indexed in PubMed, PubMed Central, Scopus, DOAJ, and the Emerging Sources Citation Index (Clarivate).
