While Electronic Health Records (EHRs) promise comprehensive documentation of patient care, in practice they pose significant challenges for data reliability and utilization. EHRs contain vast amounts of unstructured clinical narratives that, despite holding critical medical information, remain difficult to systematically extract and verify. Recent advances in large language models (LLMs) offer steadily improving capabilities for extracting structured information from clinical notes, yet these approaches raise fundamental questions about output reliability and over-confident token predictions, and they provide no guarantees (statistical or otherwise) for downstream clinical applications. In this work, we present a conformal verification framework for unstructured EHR data extraction using generative AI. Although LLMs have increasingly impressive capabilities, they are notoriously miscalibrated and overconfident in their predictions, necessitating rigorous verification methods that remove the need to blindly trust model outputs. Our approach (i) employs LLMs to extract medical entities and concepts from clinical narratives with LLM-as-a-judge verification, (ii) implements probabilistic calibration to quantify extraction confidence, and (iii) applies conformal prediction to provide finite-sample guarantees on error rates for accepted extractions. We evaluate our framework on 10,000 clinical visits across 898 clinical practices that use three different EHR systems. Our conformal verification approach can provide assurances that the expected proportion of accepted but incorrect extractions remains below a pre-specified risk level, backed by rigorous statistical guarantees. It also maintains formal guarantees on clinical data quality and reveals miscalibration in state-of-the-art LLMs, underscoring the need for additional validation before automated extraction systems are safely deployed.
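The abstract's core idea of accepting an extraction only when a calibrated confidence score clears a threshold chosen on held-out data can be illustrated with a minimal split-conformal-style sketch. This is an assumption-laden toy, not the authors' actual method: the function `calibrate_threshold`, the add-one finite-sample correction, and the data layout (per-extraction confidence scores paired with correctness labels) are all hypothetical choices for illustration.

```python
import numpy as np

def calibrate_threshold(scores, correct, alpha=0.05):
    """Pick the most permissive confidence threshold t such that, among
    calibration extractions with score >= t, a finite-sample-adjusted
    estimate of the error rate stays at or below alpha.

    scores  : array of calibrated confidence scores, one per extraction
    correct : array of 0/1 flags (1 = extraction verified correct)
    alpha   : pre-specified risk level on accepted-but-incorrect extractions
    """
    order = np.argsort(scores)[::-1]            # most confident first
    s = np.asarray(scores, dtype=float)[order]
    c = np.asarray(correct)[order].astype(bool)
    errors = np.cumsum(~c)                      # incorrect among top-k accepted
    n_acc = np.arange(1, len(s) + 1)            # number accepted so far
    # conservative add-one correction on the empirical error rate
    risk = (errors + 1) / (n_acc + 1)
    ok = np.where(risk <= alpha)[0]
    if len(ok) == 0:
        return np.inf                           # accept nothing at this alpha
    return s[ok.max()]                          # accept everything scoring >= t

# toy calibration set: high-confidence extractions happen to be correct
scores = [0.95, 0.9, 0.85, 0.8, 0.2, 0.1]
correct = [1, 1, 1, 1, 0, 0]
t = calibrate_threshold(scores, correct, alpha=0.3)   # -> 0.8
```

At deployment time, only extractions scoring at or above `t` would be auto-accepted; the rest would be routed to human review. Tightening `alpha` shrinks the accepted set (here, `alpha=0.05` yields an infinite threshold, i.e. nothing is auto-accepted from this tiny calibration set).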
Published in: Proceedings of the AAAI Symposium Series
Volume 7, Issue 1, pp. 539-546