Susana Manso García

Position

General practitioner, member of the Artificial Intelligence and Digital Health Working Group of the Spanish Society of Family and Community Medicine (semFYC)

Spanish Society of Family and Community Medicine (semFYC)

Topics

artificial intelligence, public health

AI models are still not reliable for unsupervised medical diagnosis

SMC Spain

A team from the United States has analysed the performance of 21 AI-based large language models, including ChatGPT, Gemini and Grok, in clinical diagnosis. They conclude that, despite advances in these models, their reasoning capabilities remain limited for initial diagnosis and that they should not be relied upon without the supervision of a medical professional. According to the authors, who published their findings in JAMA Network Open and aimed to “help distinguish reality from hype in the use of these tools”, the results “reinforce the idea that language models in healthcare still require human intervention and very rigorous supervision”.