The impressive diagnostic skills of GPT-4 were demonstrated

By healthtost, April 21, 2024

In a recent study published in the journal PLOS Digital Health, researchers assessed and compared the clinical knowledge and diagnostic reasoning capabilities of large language models (LLMs) with those of human ophthalmology experts.

Study: Large language models approximate expert-level clinical knowledge and reasoning in ophthalmology: A cross-sectional study. Image credit: ozrimoz / Shutterstock

Background

Generative Pre-trained Transformers (GPTs) such as GPT-3.5 and GPT-4 are advanced language models trained on massive internet-based datasets. They power ChatGPT, a conversational artificial intelligence (AI) system noted for its success in medical applications. Although earlier models struggled in specialized medical tests, GPT-4 shows significant advances. Concerns remain about data “contamination” and the clinical relevance of test scores. Further research is needed to validate the clinical applicability and safety of language models in real medical settings and to address existing limitations in their expert knowledge and reasoning abilities.

About the study

The questions for the Part 2 examination of the Fellowship of the Royal College of Ophthalmologists (FRCOphth) were extracted from a specialist textbook that is not widely available online, minimizing the likelihood of these questions appearing in LLM training data. A total of 360 multiple-choice questions spanning six chapters were extracted, and a pool of 90 questions was isolated for a mock examination used to compare the performance of LLMs and doctors. Two researchers aligned these questions with the categories established by the Royal College of Ophthalmologists and classified each question according to Bloom’s levels of cognitive processes. Questions with non-text elements that were unsuitable for LLM input were excluded.

Exam questions were entered into versions of ChatGPT (GPT-3.5 and GPT-4) to collect responses, repeating the process up to three times per question where necessary. Once other models such as Bard and HuggingChat became available, similar tests were conducted. Correct answers, as defined by the textbook, were noted for comparison.

Five specialist ophthalmologists, three ophthalmology trainees and two unspecialised junior doctors independently completed the mock examination to provide a human benchmark for the models. Their answers were then compared with the answers of the LLMs. After the exam, the ophthalmologists rated the LLMs’ responses for accuracy and relevance on a Likert scale, without knowing which model provided which response.

The study was designed with enough statistical power to detect significant performance differences between LLMs and human doctors, testing the null hypothesis that both would perform similarly. Various statistical tests, including chi-squared and paired t-tests, were applied to compare performance and to assess the consistency and reliability of LLM responses versus human responses.

Study results

Of the 360 questions contained in the FRCOphth Part 2 exam textbook, 347 were selected for use, including 87 from the mock exam chapter. The excluded questions were mostly those with pictures or tables, which were unsuitable for input into LLM interfaces.

Performance comparisons revealed that GPT-4 significantly outperformed GPT-3.5, with a correct response rate of 61.7% versus 48.41%. This advantage was consistent across the different question types and subjects outlined by the Royal College of Ophthalmologists. Detailed results and statistical analyses further confirmed the strong performance of GPT-4, making it competitive with other LLMs and with human doctors, especially junior doctors and trainees.
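To make the size of that gap concrete, here is a minimal sketch of a chi-squared comparison of the two correct-response rates. It rests on several assumptions that are not in the article: the counts of correct answers are back-calculated from the reported percentages and the 347 usable questions, the two models' answers are treated as independent samples (which ignores that both answered the same questions), and no continuity correction is applied. The function name `chi_squared_2x2` is illustrative, and the study's own analysis may have differed.

```python
import math


def chi_squared_2x2(a_correct, a_total, b_correct, b_total):
    """Pearson chi-squared statistic and p-value for a 2x2 table.

    Uses 1 degree of freedom and no continuity correction.
    """
    observed = [
        [a_correct, a_total - a_correct],
        [b_correct, b_total - b_correct],
    ]
    grand_total = a_total + b_total
    row_totals = [a_total, b_total]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper tail probability is erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Assumption: counts reconstructed from the reported percentages.
N_QUESTIONS = 347
gpt4_correct = round(0.617 * N_QUESTIONS)
gpt35_correct = round(0.4841 * N_QUESTIONS)

chi2, p = chi_squared_2x2(gpt4_correct, N_QUESTIONS, gpt35_correct, N_QUESTIONS)
print(f"chi-squared = {chi2:.2f}, p = {p:.5f}")
```

Because both models answered the same questions, a paired test such as McNemar's would be the stricter choice if per-question results were available.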

Exam features and detailed performance data. Topic and question type distributions are presented along with the scores achieved by LLMs (GPT-3.5, GPT-4, LLaMA and PaLM 2), expert ophthalmologists (E1-E5), ophthalmology trainees (T1-T3) and unspecialised junior doctors (J1-J2). Median scores do not necessarily add up to the overall median score, as fractional scores are impossible.

In the specially adapted 87-question mock exam, GPT-4 not only outperformed the other LLMs but also scored comparably to specialist ophthalmologists and significantly better than junior doctors and trainees. Performance across the groups of participants showed that while specialist ophthalmologists maintained the highest accuracy, trainees approached these levels, far outperforming the unspecialised junior doctors.

Statistical tests also highlighted that agreement between responses given by different LLMs and human participants was generally low to moderate, indicating variation in reasoning and application of knowledge between groups. This was particularly evident when the differences in knowledge between the models and human doctors were compared.
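One common way to quantify this kind of agreement between two respondents is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article does not say which statistic the study used, so the sketch below is only an illustration. The answer lists are made-up toy data, not results from the study, and `cohen_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohen_kappa(answers_a, answers_b):
    """Cohen's kappa for two equal-length sequences of categorical answers."""
    if len(answers_a) != len(answers_b) or not answers_a:
        raise ValueError("answer lists must be non-empty and the same length")

    n = len(answers_a)
    # Observed agreement: share of questions where both chose the same option.
    observed = sum(a == b for a, b in zip(answers_a, answers_b)) / n

    # Chance agreement: based on how often each respondent picked each option.
    counts_a = Counter(answers_a)
    counts_b = Counter(answers_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both respondents always gave the same single answer
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): answers to ten multiple-choice questions.
model_answers = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B"]
doctor_answers = ["A", "B", "C", "A", "A", "C", "C", "D", "B", "B"]

kappa = cohen_kappa(model_answers, doctor_answers)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance.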

A detailed examination of the mock questions against the actual exam standards showed that the mock setup closely mirrored the actual FRCOphth Part 2 Written Exam in difficulty and structure, as agreed upon by the ophthalmologists involved. This alignment ensured that the assessment of LLMs and human responses was based on a realistic and clinically relevant framework.

In addition, qualitative feedback from ophthalmologists confirmed a strong preference for GPT-4 over GPT-3.5, in line with the quantitative performance data. The higher accuracy and relevance scores for GPT-4 highlighted its potential utility in clinical settings, particularly in ophthalmology.

Finally, an analysis of the cases where all LLMs failed to provide the correct answer showed no consistent patterns related to the complexity or subject matter of the questions.
