Methodological insights on building and evaluating models for early warning of hypotension during surgery

2026 · 0 citations · Journal Article

Authors

Bob Aubouin–Pairault · Centre National de la Recherche Scientifique

Kaouther Moussa · Centre National de la Recherche Scientifique

Mazen Alamir · Institut polytechnique de Grenoble

Benjamin Meyer · Université Grenoble Alpes

Abstract

Background: Hypotension prediction has attracted considerable attention in the medical community, leading to numerous publications on the topic. Several data-driven models have been proposed, but problem framing, data selection, and evaluation metrics differ widely across the literature.

Methods: Using datasets from non-cardiac and cardiac surgery and a forward framing, we assess how data selection affects model performance. We compare models trained and tested with or without segments that contain ongoing hypotension at prediction time, or interventions that could alter the classification of those hemodynamic segments. Model performance is evaluated using the area under the precision-recall curve (AUPRC), the area under the receiver operating characteristic curve (AUROC), and a dedicated metric that better reflects the clinician's question.

Results: The non-cardiac cohort contained 1,017 patients and the cardiac cohort 563. Across both datasets, model performance depended strongly on whether ongoing hypotension or classification-altering interventions were present in the evaluation data. Removing classification-altering interventions from the training data improved AUPRC (mean difference, 0.01; 95% CI, 0.007 to 0.012; bootstrap p<0.01), whereas excluding ongoing hypotension from the training data did not change AUPRC (mean difference, 0.000; 95% CI, -0.003 to 0.004). In the cardiac set, which is used only for evaluation, filtering out classification-altering interventions increased the AUPRC of the trained models by 15.5% on average (0.54 (95% CI, 0.53 to 0.55) vs. 0.47 (95% CI, 0.45 to 0.48)). Including ongoing hypotension in the evaluation data increased the AUPRC by 72.2% on average (from 0.47 (95% CI, 0.45 to 0.48) to 0.80 (95% CI, 0.76 to 0.84)).

Conclusion: Data selection is critical when building and evaluating hypotension prediction models. For an evaluation that matches the clinical requirement of a hypotension early warning, we recommend training models on datasets that exclude classification-altering interventions, and testing on datasets that exclude both classification-altering interventions and ongoing hypotension.
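
The effect of evaluation-set filtering on AUPRC can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the segment fields (`label`, `score`, `ongoing_hypotension`, `intervention`), the function names, and the segment-level bootstrap are all assumptions made for illustration, and the toy data are synthetic. AUPRC is approximated here by step-wise average precision.

```python
import random

def average_precision(labels, scores):
    """Step-wise AUPRC (average precision). Tied scores are ranked in input order."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive segment")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos

def filter_segments(segments, drop_ongoing=True, drop_interventions=True):
    """Drop segments with ongoing hypotension and/or a flagged intervention.

    The two boolean fields are hypothetical annotations, not the paper's schema.
    """
    return [
        s for s in segments
        if not (drop_ongoing and s["ongoing_hypotension"])
        and not (drop_interventions and s["intervention"])
    ]

def bootstrap_ci(segments, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUPRC, resampling segments (an assumption)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(segments) for _ in segments]
        labels = [s["label"] for s in sample]
        if sum(labels) == 0:
            continue  # skip resamples with no positive segment
        stats.append(average_precision(labels, [s["score"] for s in sample]))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi

if __name__ == "__main__":
    # Synthetic toy segments, for illustration only.
    rng = random.Random(1)
    segments = []
    for _ in range(300):
        label = int(rng.random() < 0.3)
        segments.append({
            "label": label,
            "score": min(1.0, max(0.0, 0.3 * label + rng.random() * 0.7)),
            "ongoing_hypotension": bool(label) and rng.random() < 0.4,
            "intervention": rng.random() < 0.1,
        })
    for name, subset in [("all segments", segments),
                         ("filtered", filter_segments(segments))]:
        ap = average_precision([s["label"] for s in subset],
                               [s["score"] for s in subset])
        print(name, len(subset), round(ap, 3), bootstrap_ci(subset))
```

Resampling whole patients rather than individual segments would respect within-patient correlation better; the segment-level version is used here only to keep the sketch short.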

Topics & Keywords

Hemodynamic Monitoring and Therapy · Sepsis Diagnosis and Treatment · Cardiac, Anesthesia and Surgical Outcomes

Publication Details

Published in: Anesthesiology

DOI: 10.1097/aln.0000000000006069

Field-Weighted Citation Impact: 0.00
