Mass General Brigham research identifies pitfalls and opportunities for generative artificial intelligence in patient messaging systems

April 24, 2024

A new study by investigators from Mass General Brigham demonstrates that large language models (LLMs), a type of generative AI, may help reduce physician workload and improve patient education when used to draft replies to patient messages. The study also found limitations to LLMs that may affect patient safety, suggesting that vigilant oversight of LLM-generated communications is essential for safe usage. Findings, published in Lancet Digital Health, emphasize the need for a measured approach to LLM implementation.

Danielle Bitterman, MD

Credit: Mass General Brigham

  • Study found GPT-4-generated messages to patients were acceptable without any additional physician editing 58% of the time and provided more detailed educational information than those written by physicians
  • AI-generated messages had shortcomings, including 7% of responses being deemed unsafe if left unedited
  • Generative AI may promote efficiency and patient education, but it requires a “doctor in the loop” and a cautious approach as hospitals integrate algorithms into electronic health records

Rising administrative and documentation responsibilities have contributed to increases in physician burnout. To help streamline and automate physician workflows, electronic health record (EHR) vendors have adopted generative AI algorithms to aid clinicians in drafting messages to patients; however, the efficiency, safety and clinical impact of their use had been unknown.

“Generative AI has the potential to provide a ‘best of both worlds’ scenario of reducing burden on the clinician and better educating the patient in the process,” said corresponding author Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham and a physician in the Department of Radiation Oncology at Brigham and Women’s Hospital. “However, based on our team’s experience working with LLMs, we have concerns about the potential risks associated with integrating LLMs into messaging systems. With LLM-integration into EHRs becoming increasingly common, our goal in this study was to identify relevant benefits and shortcomings.”

For the study, the researchers used OpenAI’s GPT-4, a foundational LLM, to generate 100 scenarios about patients with cancer, each paired with an accompanying patient question. No questions from actual patients were used for the study. Six radiation oncologists manually responded to the queries; then GPT-4 generated responses to the same questions. Finally, the same radiation oncologists reviewed and edited the LLM-generated responses. The radiation oncologists did not know whether GPT-4 or a human had written each response, and in 31% of cases they believed that an LLM-generated response had been written by a human.
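
As a rough illustration, the blinded review workflow described above can be sketched in Python. The function names, the example question, and the placeholder rating below are hypothetical stand-ins for exposition only; this is not the study’s actual protocol or code.

```python
import random

def draft_with_llm(question: str) -> str:
    # Hypothetical stand-in for a GPT-4 call that drafts a patient reply.
    return f"Thank you for your message. Regarding '{question}': ..."

def draft_by_physician(question: str) -> str:
    # Hypothetical stand-in for a physician-written reply.
    return f"Re '{question}': please contact the clinic if symptoms worsen."

def blinded_review(questions, seed=0):
    """Pair a physician draft and an LLM draft for each question, shuffle
    the pair so the reviewer cannot tell who wrote which, and collect
    a rating for every draft."""
    rng = random.Random(seed)
    reviews = []
    for q in questions:
        drafts = [("physician", draft_by_physician(q)),
                  ("llm", draft_with_llm(q))]
        rng.shuffle(drafts)  # authorship is hidden from the reviewer
        for author, text in drafts:
            # In the study, clinicians rated safety and acceptability;
            # here we record a placeholder rating for every draft.
            reviews.append({"question": q, "author": author,
                            "text": text, "rating": "needs review"})
    return reviews

reviews = blinded_review(["Is fatigue normal after radiation therapy?"])
```

Shuffling the drafts before presentation is what keeps the review blinded; only the stored `author` field lets the analysis unblind the ratings afterward.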

On average, physician-drafted responses were shorter than the LLM-generated responses. GPT-4 tended to include more educational background for patients but was less directive in its instructions. The physicians reported that LLM assistance improved their perceived efficiency, and they deemed the LLM-generated responses safe in 82.1 percent of cases and acceptable to send to a patient without any further editing in 58.3 percent of cases. The researchers also identified some shortcomings: if left unedited, 7.1 percent of LLM-generated responses could pose a risk to the patient and 0.6 percent could pose a risk of death, most often because GPT-4’s response failed to urgently instruct the patient to seek immediate medical care.

Notably, LLM-generated/physician-edited responses were more similar in length and content to the LLM-generated responses than to the manual responses. In many cases, physicians retained LLM-generated educational content, suggesting that they perceived it to be valuable. While this may promote patient education, the researchers emphasize that overreliance on LLMs may also pose risks, given their demonstrated shortcomings.

The emergence of AI tools in health care has the potential to positively reshape the continuum of care, and it is imperative to balance their innovative potential with a commitment to safety and quality. Mass General Brigham is leading the way in responsible use of AI, conducting rigorous research on new and emerging technologies to inform the incorporation of AI into care delivery, workforce support and administrative processes. Mass General Brigham is currently leading a pilot integrating generative AI into the electronic health record to draft replies to patient portal messages, testing the technology in a set of ambulatory practices across the health system.

Going forward, the study’s authors are investigating how patients perceive LLM-based communications and how patients’ racial and demographic characteristics influence LLM-generated responses, based on known algorithmic biases in LLMs.

“Keeping a human in the loop is an essential safety step when it comes to using AI in medicine, but it isn’t a single solution,” Bitterman said. “As providers rely more on LLMs, we could miss errors that could lead to patient harm. This study demonstrates the need for systems to monitor the quality of LLMs, training for clinicians to appropriately supervise LLM output, more AI literacy for both patients and clinicians, and on a fundamental level, a better understanding of how to address the errors that LLMs make.”

 

Authorship: Mass General Brigham co-authors include first author Shan Chen, MS, and Marco Guevara, Frank Hoebers, Benjamin Kann, Hugo Aerts and Raymond Mak of the AIM Program at Mass General Brigham and the Department of Radiation Oncology at Brigham and Women’s Hospital/Dana-Farber Cancer Institute, and Shalini Moningi, Hesham Elhalawani, Fallon Chipidza, and Jonathan Leeman (Brigham and Women’s Hospital). Additional co-authors include Timothy Miller, Guergana Savova, Jack Gallifant, Leo Celi, Maryam Lustberg, and Majid Afshar.

Disclosures: Bitterman is an Associate Editor of Radiation Oncology, HemOnc.org and receives funding from the American Association for Cancer Research. A complete list of disclosures is included in the paper.
Funding: Bitterman received financial support for this work from the National Institutes of Health (U54CA274516-01A1). Bitterman also received financial support from the Woods Foundation. A complete list of funding sources is included in the paper.

Paper cited: Chen, S et al. “The effect of using a large language model to respond to patient messages” Lancet Digital Health DOI: 10.1016/S2589-7500(24)00060-8

###

About Mass General Brigham

Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.



Journal

The Lancet Digital Health

DOI

10.1016/S2589-7500(24)00060-8

Method of Research

Computational simulation/modeling

Subject of Research

People

Article Title

The effect of using a large language model to respond to patient messages

Article Publication Date

24-Apr-2024

COI Statement

Dr. Bitterman is an Associate Editor of Radiation Oncology, HemOnc.org and receives funding from the American Association for Cancer Research. A complete list of disclosures is included in the paper.
