Study evaluates the safety and accuracy of LLM-generated handoff notes in emergency medicine

By healthtost, December 7, 2024

Study evaluates large language model for emergency medicine handover notes, finding high utility and safety comparable to physicians

Study: Developing and Evaluating Large Language Model-Generated Emergency Medicine Handoff Notes.

In a recent study published in JAMA Network Open, researchers developed large language model (LLM)-generated emergency medicine (EM) handoff notes and evaluated their accuracy, safety, and utility, with the aim of reducing physician documentation burden without compromising patient safety.

The critical role of handoffs in health care

Handoffs are critical points of communication in healthcare and a known source of medical errors. As a result, many organizations, such as The Joint Commission and the Accreditation Council for Graduate Medical Education (ACGME), have advocated standardized procedures to improve safety.

EM-to-inpatient (IP) handoffs are associated with unique challenges, such as medical complexity, time constraints, and diagnostic uncertainty. However, they remain poorly standardized and inconsistently implemented. Electronic health record (EHR)-based tools have attempted to overcome these limitations, but they remain underexplored in emergency settings.

LLMs have emerged as potential solutions for streamlining clinical documentation. However, concerns about factual inconsistencies require further research to ensure safety and reliability in critical workflows.

About the study

The present study was conducted in an 840-bed urban tertiary care academic hospital in New York City. EHR data from 1,600 EM patient encounters resulting in acute hospital admissions between April and September 2023 were analyzed. Only encounters after April 2023 were included due to the implementation of an updated EM-to-IP handover system.

Retrospective data were used under a waiver of informed consent, given the minimal risk to patients. Handoff notes were created using a combination of a fine-tuned LLM and rule-based heuristics while adhering to standard reporting guidelines.

The handoff note template closely resembled the structure of the existing manual note, incorporating rule-based elements such as laboratory tests and vital signs and LLM-generated elements such as history of present illness and differential diagnoses. Informatics experts and EM physicians curated the data used to fine-tune the LLM in order to improve its quality, while excluding race-based characteristics to avoid bias.

Two models, the Robustly Optimized BERT Pretraining Approach (RoBERTa) and Large Language Model Meta AI (Llama-2), were used for salient content selection and abstractive summarization, respectively. Data processing included heuristic prioritization and saliency modeling to address potential limitations of the models.

The researchers evaluated automated metrics, such as the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and the Bidirectional Encoder Representations from Transformers Score (BERTScore), alongside a novel framework focused on patient safety. A clinical review of 50 handoff notes assessed their completeness, readability, and safety to ensure rigorous validation.
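The split between rule-based and LLM-generated template elements can be sketched roughly as follows. This is only an illustration: the function name `build_handoff_note`, the field labels, and the `summarize` callable (a stand-in for whatever model produces the narrative) are all hypothetical and are not taken from the study.

```python
from typing import Callable, Dict


def build_handoff_note(
    vitals: Dict[str, str],
    labs: Dict[str, str],
    source_text: str,
    summarize: Callable[[str], str],
) -> str:
    """Assemble a note from structured fields plus a generated narrative.

    Hypothetical sketch: vitals and labs are copied verbatim from structured
    data (the rule-based part), while the narrative comes from `summarize`,
    a placeholder for a language model call.
    """
    # Rule-based elements: formatted directly from structured values.
    vitals_line = ", ".join(f"{name}: {value}" for name, value in vitals.items())
    labs_line = ", ".join(f"{name}: {value}" for name, value in labs.items())

    # Generated element: delegated to the supplied summarizer.
    narrative = summarize(source_text)

    return "\n".join(
        [
            "History of present illness: " + narrative,
            "Vital signs: " + vitals_line,
            "Laboratory tests: " + labs_line,
        ]
    )


if __name__ == "__main__":
    # A trivial stand-in summarizer that keeps only the first sentence.
    first_sentence = lambda text: text.split(".")[0].strip() + "."
    print(
        build_handoff_note(
            vitals={"HR": "88", "BP": "120/80"},
            labs={"Troponin": "negative"},
            source_text="Chest pain for two hours. No prior cardiac history.",
            summarize=first_sentence,
        )
    )
```

Keeping the structured values out of the model's hands means that numbers such as vital signs cannot be altered by the generation step in this sketch.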

Study findings

Among the 1,600 patient cases included in the analysis, the mean age was 59.8 years with a standard deviation of 18.9 years, and 52% of patients were female. Automated evaluation metrics revealed that LLM-generated summaries outperformed those written by physicians in several respects.

ROUGE-2 scores were significantly higher for LLM-generated summaries than for physician-written summaries, at 0.322 and 0.088, respectively. Similarly, BERTScore values were higher at 0.859 compared to 0.796 for physician summaries. Likewise, the Source Chunking Approach for Large-scale Inconsistency Evaluation (SCALE) produced a score of 0.691 for LLM-generated summaries compared to 0.456 for physician-written summaries. These results indicate that LLM-generated summaries demonstrated greater lexical similarity and higher fidelity to source notes, and provided more detailed content than their human-generated counterparts.

In clinical evaluations, the quality of LLM-generated summaries was comparable to that of physician-written summaries but slightly inferior on several dimensions. On a Likert scale of one to five, LLM-generated summaries scored lower on usefulness, completeness, curation, readability, correctness, and patient safety. Despite these differences, the automated summaries were generally considered acceptable for clinical use, with none of the identified issues deemed life-threatening.

When assessing worst-case scenarios, clinicians identified potential level-two safety risks in LLM-generated summaries, namely incompleteness in 8.7% and faulty logic in 7.3%; physician-written summaries were not associated with these risks. Hallucinations were rare in LLM-generated summaries: the five identified cases all received safety scores between four and five, indicating mild to negligible safety risk. Overall, LLM-generated notes had a higher inaccuracy rate at 9.6%, compared to 2% for physician-written notes, although these inaccuracies rarely had significant safety implications.

Interrater reliability was calculated using intraclass correlation coefficients (ICC). The ICCs showed good agreement among the three expert raters for completeness, curation, correctness, and usefulness at 0.79, 0.70, 0.76, and 0.74, respectively. Readability achieved fair reliability with an ICC of 0.59.
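For readers unfamiliar with ROUGE-2, the metric counts overlapping word pairs (bigrams) between a summary and a reference text. The sketch below is a simplified illustration, assuming whitespace tokenization and lowercase matching; the function names and example sentences are made up, and the study's actual tooling and preprocessing are not described here.

```python
from collections import Counter
from typing import Tuple


def bigram_counts(text: str) -> Counter:
    """Count adjacent word pairs after lowercasing and whitespace splitting."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of bigram overlap.

    Simplified sketch: no stemming or punctuation handling is applied.
    """
    cand = bigram_counts(candidate)
    ref = bigram_counts(reference)
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical sentences, not data from the study.
    p, r, f = rouge2(
        "patient admitted with chest pain",
        "patient admitted with acute chest pain",
    )
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy example three of the candidate's four bigrams also appear among the reference's five, giving a precision of 0.75 and a recall of 0.60.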
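The article does not say which form of the ICC was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), from a table of ratings with one row per note and one column per rater. The function name and the ratings are hypothetical and are not data from the study.

```python
from typing import List, Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a complete table: rows are subjects, columns are raters.

    Assumption: this form is chosen for illustration only; the study's
    exact ICC variant is not stated in the article.
    """
    n = len(ratings)        # number of rated subjects (e.g. notes)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical Likert scores from three raters on five notes.
    example: List[List[float]] = [
        [4, 5, 4],
        [3, 3, 4],
        [5, 5, 5],
        [2, 3, 2],
        [4, 4, 5],
    ]
    print(f"ICC(2,1) = {icc_2_1(example):.2f}")
```

Values closer to 1 indicate that raters assign similar scores to the same note, which is the sense in which the reported coefficients are read as good or fair agreement.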

Conclusions

The current study successfully generated EM-to-IP handoff notes using a fine-tuned LLM and rule-based approach within a user-developed template.

Traditional automated assessments favored the LLM-generated notes. However, manual clinical assessments revealed that although most LLM-generated notes achieved promising quality scores between four and five, they were generally inferior to physician-written notes. Detected errors, including incompleteness and faulty logic, occasionally posed moderate safety risks, with less than 10% causing significant problems compared to physician-written notes.

Journal Reference:

  • Hartman, V., Zhang, X., Poddar, R., et al. (2024). Developing and Evaluating Large Language Model-Generated Emergency Medicine Handoff Notes. JAMA Network Open. doi:10.1001/jamanetworkopen.2024.48723