The impressive diagnostic skills of GPT-4 were demonstrated

By healthtost | April 21, 2024

In a recent study published in the journal PLOS Digital Health, researchers assessed the clinical knowledge and diagnostic reasoning capabilities of large language models (LLMs) and compared them with those of human ophthalmology experts.

Study: Large language models approximate expert-level clinical knowledge and reasoning in ophthalmology: A cross-sectional study. Image credit: ozrimoz / Shutterstock

Background

Generative Pre-trained Transformers (GPTs) such as GPT-3.5 and GPT-4 are advanced language models trained on massive Internet-based datasets. They power ChatGPT, a conversational artificial intelligence (AI) application noted for its success in medical applications. Although earlier models struggled in specialized medical tests, GPT-4 shows significant advances. Concerns remain about data “contamination” and the clinical relevance of test scores. Further research is needed to validate the clinical applicability and safety of language models in real medical settings and to address existing limitations in their expert knowledge and reasoning abilities.

About the study

Questions for the Part 2 examination of the Fellowship of the Royal College of Ophthalmologists (FRCOphth) were taken from a specialist textbook that is not widely available online, minimizing the likelihood that they appeared in LLM training data. A total of 360 multiple-choice questions spanning six chapters were extracted, and a pool of 90 questions was set aside as a mock examination used to compare the performance of LLMs and doctors. Two researchers aligned these questions with the categories established by the Royal College of Ophthalmologists and classified each question according to Bloom’s levels of cognitive processes. Questions with non-text elements that were unsuitable for LLM input were excluded.

Exam questions were entered into versions of ChatGPT (GPT-3.5 and GPT-4) to collect responses, repeating the process up to three times per question where necessary. Once other models such as Bard and HuggingChat became available, similar tests were conducted. Correct answers, as defined by the textbook, were noted for comparison.

Five expert ophthalmologists, three trainee ophthalmologists and two unspecialized junior doctors independently completed the mock examination so that the models could be benchmarked against practicing clinicians. Their answers were then compared with the answers of the LLMs. After the examination, the ophthalmologists scored the LLMs’ responses for accuracy and relevance on a Likert scale, without knowing which model had produced which response.

The study was statistically powered to detect significant performance differences between LLMs and human doctors, testing the null hypothesis that both would perform similarly. Statistical tests, including chi-squared and paired t-tests, were applied to compare performance and to assess the consistency and reliability of LLM responses relative to human responses.

Study results

Of the 360 questions contained in the FRCOphth Part 2 exam textbook, 347 were selected for use, including 87 from the mock exam chapter. The excluded questions were mostly those with pictures or tables, which were unsuitable for input into LLM interfaces.

Performance comparisons revealed that GPT-4 significantly outperformed GPT-3.5, with a correct response rate of 61.7% versus 48.41%. This improvement was consistent across the question types and subjects defined by the Royal College of Ophthalmologists. Detailed results and statistical analyses further confirmed GPT-4's strong performance, making it competitive with other LLMs and with human doctors, especially junior doctors and trainees.
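To make the GPT-4 versus GPT-3.5 comparison concrete, the reported percentages can be turned back into approximate counts and fed into a chi-squared test. The sketch below is illustrative only: it assumes both rates refer to the 347 usable questions, the function name `chi_squared_2x2` is hypothetical, and the article does not say which test the authors applied to this particular comparison. Because both models answered the same questions, a paired test such as McNemar's would be more appropriate if per-question results were available.

```python
import math


def chi_squared_2x2(correct_a, correct_b, total):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table
    of correct/incorrect counts for two models that each answered `total`
    questions. Returns (statistic, p_value) for 1 degree of freedom."""
    observed = [
        [correct_a, total - correct_a],
        [correct_b, total - correct_b],
    ]
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]
    grand_total = sum(row_sums)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the upper tail of the chi-squared
    # distribution has a closed form via the complementary error function.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Assumption: both reported rates are fractions of the 347 usable questions.
TOTAL_QUESTIONS = 347
gpt4_correct = round(0.617 * TOTAL_QUESTIONS)
gpt35_correct = round(0.4841 * TOTAL_QUESTIONS)

statistic, p_value = chi_squared_2x2(gpt4_correct, gpt35_correct, TOTAL_QUESTIONS)
print(f"GPT-4 correct: {gpt4_correct}, GPT-3.5 correct: {gpt35_correct}")
print(f"chi-squared = {statistic:.2f}, p = {p_value:.5f}")
```

Treating the two sets of answers as independent samples is a simplification, so the resulting p-value should be read as a rough check on the reported difference rather than a reproduction of the study's analysis.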

Exam features and detailed performance data. Topic and question type distributions are presented along with the scores achieved by LLMs (GPT-3.5, GPT-4, LLaMA and PaLM 2), expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3) and unspecialized junior doctors (J1-J2). Median scores do not necessarily add up to the overall median score, as fractional scores are impossible.

In the specially adapted 87-question mock exam, GPT-4 not only led the LLMs but also scored comparably to expert ophthalmologists and significantly better than the junior doctors and trainees. Across the human groups, expert ophthalmologists achieved the highest accuracy; trainees approached these levels and far outperformed the unspecialized junior doctors.

Statistical tests also highlighted that agreement between responses given by different LLMs and human participants was generally low to moderate, indicating variation in reasoning and application of knowledge between groups. This was particularly evident when the differences in knowledge between the models and human doctors were compared.
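Agreement between two respondents on multiple-choice questions is often summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article does not state which agreement statistic the study used, so the following is a minimal sketch under that assumption; the function name `cohen_kappa` and the answer lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(answers_a, answers_b):
    """Cohen's kappa for two respondents answering the same questions.
    Each argument is a sequence of chosen options, one per question."""
    if len(answers_a) != len(answers_b) or len(answers_a) == 0:
        raise ValueError("answer lists must be non-empty and of equal length")

    n = len(answers_a)
    # Proportion of questions on which both respondents chose the same option.
    observed = sum(a == b for a, b in zip(answers_a, answers_b)) / n

    # Agreement expected by chance, from each respondent's option frequencies.
    freq_a = Counter(answers_a)
    freq_b = Counter(answers_b)
    expected = sum(freq_a[option] * freq_b[option] for option in freq_a) / (n * n)

    if expected == 1.0:
        # Both respondents gave the same single option for every question.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical answers to ten questions (not taken from the study).
model_answers = list("ABCDABCDAB")
doctor_answers = list("ABCDBADCAB")

kappa = cohen_kappa(model_answers, doctor_answers)
print(f"kappa = {kappa:.3f}")
```

By a commonly cited convention, kappa values of roughly 0.41 to 0.60 are described as moderate agreement and lower values as fair or slight, which is the kind of range implied by "low to moderate".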

A detailed examination of the mock questions against the actual exam standards showed that the mock setup closely mirrored the actual FRCOphth Part 2 Written Exam in difficulty and structure, as agreed upon by the ophthalmologists involved. This alignment ensured that the assessment of LLMs and human responses was based on a realistic and clinically relevant framework.

In addition, qualitative feedback from ophthalmologists confirmed a strong preference for GPT-4 over GPT-3.5, consistent with the quantitative performance data. The higher accuracy and relevance scores for GPT-4 highlighted its potential utility in clinical settings, particularly in ophthalmology.

Finally, an analysis of the cases where all LLMs failed to provide the correct answer showed no consistent patterns related to the complexity or subject matter of the questions.

