


In this episode of Imaging Informatics Unplugged, Dr. Mana Moassefi (incoming Radiology resident at Mayo Clinic) shares insights from a groundbreaking multi-institutional study that evaluated the use of large language models (LLMs) to annotate radiology reports without any model training.
Mana and her collaborators from institutions including Mayo Clinic, UCSF, Moffitt Cancer Center, Harvard Medical School, UC Irvine, and Emory used prompt engineering to guide commercial LLMs in extracting diagnostic labels from 3,000 radiology reports. The study found:
LLMs outperformed traditional NLP due to better contextual understanding
Prompt engineering was critical, often outperforming chat-based approaches
Structured reporting improved accuracy, but radiologist variability still introduced challenges
Hallucinations occurred, especially when LLMs were uncertain, leading to verbose, off-topic answers
Consistency matters: AI may not be perfect, but it's reliably imperfect, a surprising edge over human variability
We also explore future use cases, from structured patient-facing summaries to multi-agent AI workflows that could quantify uncertainty and improve communication across clinical systems.
Don't forget to like, subscribe, and hit the bell to stay updated on future episodes exploring the intersection of imaging, informatics, and AI.
Learn more at:
https://www.nagelsconsulting.com
https://learn.nagelsconsulting.com
https://www.linkedin.com/company/nagels-consulting
By Nagels Consulting