Cervical cancer is one of the leading causes of death in women, especially in low- and middle-income countries. Early detection is crucial for improving survival, but conventional methods are inefficient, costly, and error-prone. In this study, we present a hybrid machine learning framework that estimates cervical cancer risk from clinical and behavioral records by analyzing 36 patient attributes such as age, sexual history, smoking habits, hormonal contraceptive use, and history of sexually transmitted diseases. Missing values were imputed using a Generative Adversarial Imputation Network (GAIN). The Boruta algorithm was then used to select the most influential diagnostic features; it identified the Schiller test, the Hinselmann test, cytology, age, number of sexual partners, and smoking as the principal determinants. Random Oversampling (ROS) was applied to correct class imbalance, and dimensionality reduction techniques such as Independent Component Analysis (ICA) and Principal Component Analysis (PCA) were used. The diagnostic prediction is generated by a Bayesian Fusion Ensemble (XBFE) that combines the outputs of Decision Tree and Random Forest models to estimate each patient’s likelihood of cervical cancer and to systematically evaluate the contributing risk factors. Under K-fold cross-validation, the proposed model achieved an accuracy of 99.88%, a recall of 1.00, and an AUC-ROC score of 1.00. To make the predictions interpretable for clinicians, Explainable AI (XAI) tools such as SHAP and LIME were used. We also developed a web-based application for real-time risk estimation. The proposed system provides a reliable and interpretable solution for predicting cervical cancer risk, helping doctors make better decisions, especially in resource-limited settings.
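Two of the steps named above, random oversampling and Bayesian fusion of two model outputs, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fusion rule, so the sketch assumes the models are conditionally independent given the class and combines their posterior probabilities in log-odds space. The function names `random_oversample` and `bayesian_fuse`, the `prior` parameter, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def bayesian_fuse(model_probs, prior=0.5, eps=1e-9):
    """Fuse per-model estimates of P(positive | x) into one probability.

    Assumption (not from the source): the models are conditionally
    independent given the class, so the fused log-odds is the sum of
    the per-model log-odds minus (M - 1) times the prior log-odds.
    """
    def logit(p):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0) for hard 0/1 outputs
        return math.log(p / (1.0 - p))

    log_odds = sum(logit(p) for p in model_probs)
    log_odds -= (len(model_probs) - 1) * logit(prior)
    # Numerically stable sigmoid.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Toy data: four negative records and one positive record.
rows = [[25, 0], [31, 1], [40, 0], [52, 1], [47, 1]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))  # both classes now have 4 rows

# Hypothetical Decision Tree and Random Forest probabilities for one patient.
print(bayesian_fuse([0.9, 0.8]))  # 36/37, about 0.973
```

With a uniform prior the fused odds are simply the product of the individual odds, so two models that agree reinforce each other. When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that duplicated records do not also appear in the held-out fold.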