Estradiol level assessment is critical for the management of gynecological endocrine disorders. The current gold standard, serum hormone testing, is limited by its invasiveness, variability due to physiological fluctuations, and limited accessibility in primary care settings. This study investigates the feasibility of using Multiple Instance Learning (MIL) algorithms to classify estradiol levels from routinely collected ThinPrep cytologic test (TCT) whole slide images (WSIs), an approach that requires no additional procedure when TCT is already indicated. This retrospective study included 1,171 patient samples with paired TCT images and serum estradiol data. Samples were categorized into positive (n = 713) and negative (n = 458) groups using a threshold of 40 pg/mL. Each WSI was treated as a “bag,” with extracted image patches constituting “instances.” Within this weakly supervised MIL framework, we systematically evaluated the binary classification performance of four models (AB-MIL, DS-MIL, Trans-MIL, and DTFD-MIL) for estradiol levels. Nested cross-validation with a fixed test set was used. We further conducted confounder analyses, multi-threshold validation, decision curve analysis, and a cross-batch generalizability test. AB-MIL achieved the highest performance: test accuracy 0.706 ± 0.048, macro-AUC 0.747 ± 0.024, and macro-F1 0.692 ± 0.043, with a PR-AUC of 0.812. At the 0.5 operating point, sensitivity was 83.9% and specificity was 64.1%. The combined clinical-image model (age + FSH + AB-MIL) achieved an AUC of 0.787 (NRI = 0.123, IDI = 0.0976). AUCs at alternative thresholds (60, 80, and 100 pg/mL) remained stable (0.679–0.710). Cross-batch testing revealed the expected performance decay (AUC 0.611–0.631). This exploratory study demonstrates the technical feasibility of using an MIL-based model to infer estradiol status from cervical cell morphology in TCT WSIs. The findings provide preliminary evidence that AI analysis of routine cytology images could serve as a potential adjunct screening tool.
This work presents a methodological framework for developing endocrine assessment from routinely collected TCT samples. Pending further validation and refinement, such assessment could potentially be integrated into cervical cancer screening workflows for concurrent evaluation.
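To make the bag-and-instance formulation concrete, the following is a minimal sketch of attention-based MIL pooling in plain Python. It is not the study's implementation: the function names (`attention_pool`, `classify_bag`), the toy feature dimensions, and the randomly initialised weights `V`, `w`, and `c` are all assumptions made for illustration. It shows only the general idea of scoring each patch embedding, normalising the scores with a softmax over the bag, and classifying the attention-weighted average at a 0.5 operating point.

```python
import math
import random


def attention_pool(instances, V, w):
    """Aggregate patch embeddings (instances) into one slide-level embedding.

    instances: list of K feature vectors of length D, one per image patch.
    V: L x D projection matrix, w: length-L scoring vector (hypothetical weights).
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Hidden representation: tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # Scalar attention score: w^T tanh(V h)
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Softmax over the instances in the bag (shifted by the max for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding: attention-weighted sum of the instance embeddings.
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


def classify_bag(instances, V, w, c, b=0.0, cutoff=0.5):
    """Return (probability, predicted_positive, attention_weights) for one bag."""
    bag, weights = attention_pool(instances, V, w)
    logit = sum(ci * zi for ci, zi in zip(c, bag)) + b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, prob >= cutoff, weights


# Toy example: one "slide" with 5 patches, 4-dimensional patch features.
rng = random.Random(0)
K, D, L = 5, 4, 3
patches = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(L)]
w = [rng.uniform(-1, 1) for _ in range(L)]
c = [rng.uniform(-1, 1) for _ in range(D)]

prob, is_positive, attn = classify_bag(patches, V, w, c)
print(f"P(positive) = {prob:.3f}, predicted positive: {is_positive}")
```

In a trained model the weights would be learned from slide-level labels alone, and the attention weights give a rough indication of which patches contributed most to the slide-level prediction.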