Healthtost

Study demonstrates the impressive diagnostic skills of GPT-4

By healthtost, April 21, 2024

In a recent study published in the journal PLOS Digital Health, researchers assessed the clinical knowledge and diagnostic reasoning capabilities of large language models (LLMs) and compared them with those of human ophthalmology experts.

Study: Large language models approximate expert-level clinical knowledge and reasoning in ophthalmology: A cross-sectional study. Image credit: ozrimoz / Shutterstock

Background

GPT-3.5 and GPT-4 are Generative Pre-trained Transformer (GPT) models, advanced language models trained on massive Internet-based datasets. They power ChatGPT, a conversational artificial intelligence (AI) application noted for its success on medical tasks. Although earlier models struggled in specialized medical tests, GPT-4 shows significant advances. Concerns remain about data “contamination” and the clinical relevance of test scores. Further research is needed to validate the clinical applicability and safety of language models in real medical settings and to address existing limitations in their expert knowledge and reasoning abilities.

About the study

Practice questions for the Fellowship of the Royal College of Ophthalmologists (FRCOphth) Part 2 examination were drawn from a specialist textbook that is not widely available online, minimizing the likelihood that they appear in LLM training data. A total of 360 multiple-choice questions spanning six chapters were extracted, and a pool of 90 questions was set aside as a mock examination used to compare the performance of LLMs and doctors. Two researchers aligned the questions with the categories established by the Royal College of Ophthalmologists and classified each question according to Bloom’s levels of cognitive processes. Questions with non-text elements that were unsuitable for LLM input were excluded.

Exam questions were entered into versions of ChatGPT (GPT-3.5 and GPT-4) to collect responses, repeating the process up to three times per question where necessary. Once other models such as Bard and HuggingChat became available, similar tests were conducted. Correct answers, as defined by the textbook, were noted for comparison.

Five expert ophthalmologists, three ophthalmology trainees and two non-specialist junior doctors independently completed the mock examination to provide a practical benchmark for the models. Their answers were then compared with those of the LLMs. After testing, the ophthalmologists rated the LLMs’ responses for accuracy and relevance on a Likert scale, without knowing which model had produced which response.

The study was statistically powered to detect significant performance differences between LLMs and human doctors, testing the null hypothesis that both would perform similarly. Statistical tests, including chi-squared and paired t-tests, were applied to compare performance and to assess the consistency and reliability of LLM responses relative to human responses.

Study results

Of the 360 questions contained in the FRCOphth Part 2 textbook, 347 were selected for use, including 87 from the mock exam chapter. The excluded questions were mostly those with pictures or tables, which were unsuitable for input into LLM interfaces.

Performance comparisons revealed that GPT-4 significantly outperformed GPT-3.5, with a correct response rate of 61.7% versus 48.41%. This advantage was consistent across the question types and subjects defined by the Royal College of Ophthalmologists. Detailed results and statistical analyses further confirmed the strong performance of GPT-4, making it competitive with other LLMs and with human doctors, especially junior doctors and trainees.
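The GPT-4 versus GPT-3.5 comparison can be illustrated with a chi-squared test on a 2x2 table of correct and incorrect answers. The sketch below is not the authors’ analysis: the counts of 214 and 168 correct answers are assumptions back-calculated from the reported 61.7% and 48.41% of 347 questions, and the test treats the two sets of answers as independent even though both models answered the same questions. The function name `chi_squared_2x2` is hypothetical.

```python
import math


def chi_squared_2x2(a_correct, a_total, b_correct, b_total):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table."""
    observed = [
        [a_correct, a_total - a_correct],
        [b_correct, b_total - b_correct],
    ]
    grand_total = a_total + b_total
    row_totals = [a_total, b_total]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


TOTAL_QUESTIONS = 347  # questions used, as reported in the article
GPT4_CORRECT = 214     # assumption: 214 / 347 is about 61.7%
GPT35_CORRECT = 168    # assumption: 168 / 347 is about 48.41%

chi2, p_value = chi_squared_2x2(
    GPT4_CORRECT, TOTAL_QUESTIONS, GPT35_CORRECT, TOTAL_QUESTIONS
)
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

Under these assumed counts the statistic is roughly 12.3 with a p-value below 0.001, which is consistent with the article’s description of a significant difference. A paired method such as McNemar’s test would be more appropriate if per-question results were available.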

Exam characteristics and detailed performance data. Topic and question type distributions are presented along with the scores achieved by LLMs (GPT-3.5, GPT-4, LLaMA and PaLM 2), expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3) and non-specialist junior doctors (J1-J2). Median scores do not necessarily add up to the overall median score, as fractional scores are impossible.

In the adapted 87-question mock exam, GPT-4 not only achieved the highest score among the LLMs but also scored comparably to expert ophthalmologists and significantly better than junior doctors and trainees. Among the human participants, expert ophthalmologists maintained the highest accuracy, and trainees approached these levels, far outperforming the non-specialist junior doctors.

Statistical tests also highlighted that agreement between responses given by different LLMs and human participants was generally low to moderate, indicating variation in reasoning and application of knowledge between groups. This was particularly evident when the differences in knowledge between the models and human doctors were compared.
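Agreement between two respondents on multiple-choice answers is often summarized with Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. The article does not name the statistic used, so kappa is an assumption here, and the answer lists below are invented purely for illustration.

```python
from collections import Counter


def cohens_kappa(answers_a, answers_b):
    """Cohen's kappa for two respondents answering the same questions."""
    if len(answers_a) != len(answers_b) or not answers_a:
        raise ValueError("answer lists must be non-empty and of equal length")
    n = len(answers_a)

    # Observed agreement: fraction of questions with identical answers.
    observed = sum(a == b for a, b in zip(answers_a, answers_b)) / n

    # Chance agreement: from each respondent's answer-option frequencies.
    freq_a = Counter(answers_a)
    freq_b = Counter(answers_b)
    expected = sum(freq_a[option] * freq_b[option] for option in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both respondents always gave the same single option
    return (observed - expected) / (1 - expected)


# Hypothetical answers to ten four-option questions (not study data).
model_answers = list("ABCDABCDAB")
doctor_answers = list("ABCDBACDCA")

kappa = cohens_kappa(model_answers, doctor_answers)
print(f"kappa = {kappa:.3f}")
```

With these made-up answers, the two respondents agree on 6 of 10 questions and chance agreement is 0.25, giving a kappa of about 0.47. On commonly used interpretation scales that falls in the moderate range, the kind of value the paragraph above describes.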

A detailed examination of the mock questions against the actual exam standards showed that the mock setup closely mirrored the actual FRCOphth Part 2 Written Exam in difficulty and structure, as agreed upon by the ophthalmologists involved. This alignment ensured that the assessment of LLMs and human responses was based on a realistic and clinically relevant framework.

In addition, qualitative feedback from the ophthalmologists showed a strong preference for GPT-4 over GPT-3.5, in line with the quantitative performance data. The higher accuracy and relevance scores for GPT-4 highlighted its potential utility in clinical settings, particularly in ophthalmology.

Finally, an analysis of the cases where all LLMs failed to provide the correct answer showed no consistent patterns related to the complexity or subject matter of the questions.
