
ChatGPT for Health Advice — What It Gets Right and Dangerously Wrong

Artificial intelligence (AI) has taken the world by storm with its promise of a transformative technology that will benefit humankind. Once limited to research labs and tech firms, AI is now commonplace in the home: virtual assistants like Siri and the recommendation systems behind Netflix and Spotify are everyday examples. AI promises to help us complete tasks more efficiently, from writing emails to choosing products to making daily decisions.

A Generative Pre-trained Transformer (GPT) is a type of AI that can follow instructions and generate text, images, music, or other content that looks as though a human made it. ChatGPT is an AI application that can “chat” with a user, which is why it is referred to as a “chatbot.” Besides generating content, it can answer questions, reason through problems, and summarize information it finds online. Unsurprisingly, many people now use ChatGPT to research medical questions and seek health advice. But is it wise to rely on ChatGPT for your health advice?

ChatGPT Logo
ChatGPT is an AI chatbot developed by OpenAI and released in 2022. Source: Open source.

How People Use ChatGPT for Health Advice

Rather than sifting through information on multiple websites, people are turning to ChatGPT for health and medical information. They can ask conversational questions about symptoms, conditions, or medications, or even upload a picture of a rash or other concern, and receive summarized, easy-to-understand explanations. ChatGPT can also interpret and explain the medical terminology on lab or imaging reports.

However, when we turn to a computer algorithm for health and medical advice, we are placing our well-being in its hands, so we must be cautious and discriminating about which advice we follow. Of course, the same applies to advice from human healthcare practitioners, who are not infallible either. So let’s look at where ChatGPT is relatively reliable, and where it is not.

What ChatGPT Gets Right

Studies published in the New England Journal of Medicine found that a GPT app correctly answered questions from US medical licensing exams and correctly diagnosed 57% of clinical cases. It outperformed randomly polled journal readers of unknown medical training and skill, who correctly diagnosed only 36% of cases.

However, those numbers would almost certainly shift when a trained physician sees a patient face-to-face. In that setting, the physician can ask follow-up questions and gather information that a written case summary does not provide.

Here is when ChatGPT may be most reliable:

  • Explaining medical terms and concepts
  • Helping formulate questions before a doctor’s appointment
  • Summarizing general information from the internet about conditions, medications, or health tips
  • Providing emotional support (according to Harvard Health, ChatGPT may be more empathetic than doctors)

What ChatGPT Gets Dangerously Wrong

A study published in Nature Medicine found that only about one-third of people who entered their symptoms received a correct diagnosis. Further, only about 43% received correct advice about what to do next, such as whether to go to the emergency room or stay home.

Patients waiting at Hospital
GPT does very poorly at triaging people’s illnesses and may give unsafe advice.

Part of the problem lies in the information and word choices users type into ChatGPT. A layperson may describe a symptom with a term that doesn’t map onto the correct medical one. For example, “dizzy” can describe two vastly different symptoms with very different causes: vertigo (a sensation of spinning or swaying) and presyncope (lightheadedness). Similarly, describing a headache as “bad” versus “the worst headache I’ve ever had” can yield two different diagnoses and two different recommendations for next steps.

Healthcare providers are trained to ask questions relevant to your condition and to identify symptoms that may not occur to the untrained layperson. They also benefit from physical examination findings and test results that are not available at home.

A large study published in Nature Medicine found that ChatGPT “undertriaged” (did not assign the proper urgency for seeking care) in 52% of medical emergencies.

A 2026 study of ChatGPT Health, a version of ChatGPT specially designed for health information, described the software’s real-life performance as riddled with “considerable and potentially dangerous flaws.”

Here is when ChatGPT may be most unreliable:

  • Making a self-diagnosis
  • Situations requiring emergency or urgent care
  • Complex health cases
  • Any case in which ChatGPT generates false or fabricated information (GPT “hallucinations”), which happens unpredictably

Bottom Line

AI and GPT are rapidly evolving technologies, but they remain prone to error. More work is needed to develop the technology further and confirm its effectiveness.

Patient talking to his Doctor
AI is far from being able to replace a face-to-face visit with a trained health professional.

While ChatGPT can be useful for education and initial guidance, it should not replace professional medical advice. This is especially true for ongoing or worrisome symptoms or conditions. Some doctors suggest that the best time to use ChatGPT is before and after a doctor’s visit, to help you prepare for the appointment and better understand the outcome.

References:

  • NPR (news website)
  • Nature Medicine (medical journal)
  • New England Journal of Medicine (medical journal)
  • Nature Medicine (medical journal)
  • British Medical Journal (BMJ)
  • Harvard Health (website)
  • Medical News Today (website)

By Andrew Proulx

Andrew completed a BSc in Chemistry at Brandon University in 1997, and went on to graduate from medical school at Queen’s University in 2001. He completed an internship and residency at the University of British Columbia in 2003. He practiced as a physician in the ER, hospital, and office settings until 2016. Since then he has gone back to school for his Ph.D. in Psychology, and has worked as a medical writer. He has seven books in print about addictions and mental health, two of which are best-sellers. Andrew enjoys making medical science accessible to people of any educational level.