Deep learning framework for predicting EGFR mutation status from H&E whole slide images in lung adenocarcinoma

2026 · 0 citations · Journal Article · Gold Open Access

Authors

Weiwei Shao · Xuzhou Medical College

Wenyue Gu · Yancheng Third People's Hospital

Shlee Song · Shanghai Public Health Clinical Center

Danyi Chen · Xuzhou Medical College

Zhongming Shao · Xuzhou Medical College

Qingzeng Sun · Xuzhou Medical College

Hong Yu ·

Abstract

Epidermal growth factor receptor (EGFR) mutations are pivotal molecular drivers in lung adenocarcinoma (LUAD) with significant therapeutic implications, yet conventional molecular testing remains costly, time-consuming, and limited by tissue availability. This study aimed to develop and validate a pathology-based predictive model that integrates deep learning and machine learning to identify EGFR mutation status directly from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs). A total of 268 pathologically confirmed LUAD cases were retrospectively included and randomly divided into training and testing cohorts at a 7:3 ratio. WSIs were partitioned into tiles, stain-normalized, and then encoded using multiple deep learning backbones, including DenseNet201, ResNet50, MobileNetV3, VGG, and Vision Transformer. Tile-level features were aggregated into slide-level representations via an attention-based multiple instance learning (MIL) framework. After feature selection with Lasso regression, several machine learning classifiers were constructed. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration analysis, and decision curve analysis (DCA). In the independent testing set, the MIL-DenseNet201 model combined with logistic regression achieved the best performance (AUC = 0.885, 95% CI: 0.797–0.952; accuracy 82.7%; sensitivity 75.8%; specificity 87.5%), outperforming mean pooling and other backbone-based models. Calibration curves showed strong agreement between predicted and observed outcomes, while DCA demonstrated greater net clinical benefit compared with benchmark models. Moreover, attention heatmaps provided a qualitative visualization of regions contributing to EGFR mutation prediction.
An attention-based MIL framework applied to routine H&E-stained WSIs demonstrated robust performance in predicting EGFR mutation status in LUAD, suggesting its potential as a scalable adjunct to molecular testing. Further validation in larger, multicenter cohorts is warranted to confirm its clinical utility and facilitate translation into practice.
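The aggregation step described in the abstract, where tile-level features are combined into a slide-level representation by attention rather than by mean pooling, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one common formulation of attention pooling (a tanh hidden layer followed by a softmax over tiles), and the function names `attention_mil_pool` and `mean_pool`, the weights `V` and `w`, and the toy feature vectors are all hypothetical. In a real model the weights would be learned and the features would come from a backbone such as DenseNet201.

```python
import math


def attention_mil_pool(tiles, V, w):
    """Aggregate tile feature vectors into one slide-level vector.

    tiles: list of K tile feature vectors, each of length D
    V:     hypothetical hidden projection, H rows of length D
    w:     hypothetical attention weights, length H
    Returns (slide_vector, attention_weights).
    """
    scores = []
    for h in tiles:
        # Hidden representation of the tile: tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # Scalar attention score for the tile: w . tanh(V h)
        scores.append(sum(wj * zj for wj, zj in zip(w, hidden)))

    # Numerically stable softmax over tiles
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Slide vector is the attention-weighted sum of tile features
    dim = len(tiles[0])
    slide = [sum(a * h[d] for a, h in zip(attn, tiles)) for d in range(dim)]
    return slide, attn


def mean_pool(tiles):
    """Baseline: unweighted average of tile features."""
    dim = len(tiles[0])
    return [sum(h[d] for h in tiles) / len(tiles) for d in range(dim)]


# Toy example: three tiles with two features each (made-up values)
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [2.0, -1.0]

slide, attn = attention_mil_pool(tiles, V, w)
baseline = mean_pool(tiles)
```

The per-tile weights in `attn` are the kind of quantity that an attention heatmap visualizes: tiles with larger weights contribute more to the slide-level vector. When all attention scores are equal (for example, `w` set to zeros), the attention-weighted sum reduces to the mean-pooling baseline.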

Topics & Keywords

Radiomics and Machine Learning in Medical Imaging · AI in cancer detection · Cell Image Analysis Techniques

Publication Details

Published in: BMC Cancer

DOI: 10.1186/s12885-026-15957-9

Field-Weighted Citation Impact: 0.00
