Epidermal growth factor receptor (EGFR) mutations are pivotal molecular drivers in lung adenocarcinoma (LUAD) with significant therapeutic implications, yet conventional molecular testing remains costly, time-consuming, and limited by tissue availability. This study aimed to develop and validate a pathology-based predictive model that integrates deep learning and machine learning to identify EGFR mutation status directly from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs). A total of 268 pathologically confirmed LUAD cases were retrospectively included and randomly divided into training and testing cohorts at a 7:3 ratio. WSIs were partitioned into tiles, stain-normalized, and then encoded using multiple deep learning backbones, including DenseNet201, ResNet50, MobileNetV3, VGG, and Vision Transformer. Tile-level features were aggregated into slide-level representations via an attention-based multiple instance learning (MIL) framework. After feature selection with Lasso regression, several machine learning classifiers were constructed. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration analysis, and decision curve analysis (DCA). In the independent testing set, the MIL-DenseNet201 model combined with logistic regression achieved the best performance (AUC = 0.885, 95% CI: 0.797–0.952; accuracy 82.7%; sensitivity 75.8%; specificity 87.5%), outperforming mean pooling and other backbone-based models. Calibration curves showed strong agreement between predicted and observed outcomes, while DCA demonstrated greater net clinical benefit compared with benchmark models. Moreover, attention heatmaps provided a qualitative visualization of regions contributing to EGFR mutation prediction.
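To illustrate how attention-based MIL aggregation differs from mean pooling, the following is a minimal sketch in plain Python. It is not the study's implementation: the function name `attention_mil_pool`, the parameters `V` and `w`, the tanh scoring form, and the random toy data are all assumptions made for illustration. In practice, tile features would come from a backbone such as DenseNet201, and the attention parameters would be learned during training.

```python
import math
import random


def attention_mil_pool(tile_features, V, w):
    """Aggregate tile features into one slide-level vector.

    tile_features: list of n tiles, each a list of d floats.
    V: d x k projection matrix (hypothetical attention parameter).
    w: length-k scoring vector (hypothetical attention parameter).
    Returns (slide_vector, attention_weights).
    """
    scores = []
    for h in tile_features:
        # Project the tile feature, apply tanh, then score with w.
        hidden = [
            math.tanh(sum(h[i] * V[i][j] for i in range(len(h))))
            for j in range(len(w))
        ]
        scores.append(sum(wj * hj for wj, hj in zip(w, hidden)))

    # Numerically stable softmax over tiles gives the attention weights.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Slide-level representation: attention-weighted sum of tile features.
    dim = len(tile_features[0])
    slide_vector = [
        sum(a * h[i] for a, h in zip(weights, tile_features))
        for i in range(dim)
    ]
    return slide_vector, weights


# Toy data (assumed shapes): 50 tiles, 16-dim features, 8-dim attention space.
random.seed(0)
tiles = [[random.gauss(0, 1) for _ in range(16)] for _ in range(50)]
V = [[random.gauss(0, 1) for _ in range(8)] for _ in range(16)]
w = [random.gauss(0, 1) for _ in range(8)]

slide_vec, attn = attention_mil_pool(tiles, V, w)

# Mean pooling is the special case in which every tile gets weight 1/n.
mean_vec = [sum(h[i] for h in tiles) / len(tiles) for i in range(16)]
```

The per-tile weights in `attn` are the kind of quantity that can be mapped back onto tile coordinates to draw an attention heatmap. Setting `w` to all zeros makes every score equal, so the weights become uniform and the result reduces to mean pooling.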
An attention-based MIL framework applied to routine H&E-stained WSIs demonstrated robust performance in predicting EGFR mutation status in LUAD, suggesting its potential as a scalable adjunct to molecular testing. Further validation in larger, multicenter cohorts is warranted to confirm its clinical utility and facilitate translation into practice.