A hybrid SMOTE and Gaussian mixture model based optimized XGBoost framework for bipolar disorder detection

2026 · 0 citations · Journal Article · Gold Open Access

Authors

Santosh Kumar · Galgotias University

Deeksha Kumari · Galgotias University

Arvind Panwar · Galgotias University

Shrddha Sagar · Galgotias University

Lukas Herout · Škoda Auto University

Hamidreza Namazi · Monash University Malaysia

Nitesh Singh Bhati · Gautam Buddha University

Abstract

The identification of bipolar disorder (BD), a severe psychiatric condition characterized by recurrent mood fluctuations, remains challenging due to substantial inter-individual variability, symptom overlap with other mental disorders, and imbalanced clinical data. Delayed or inaccurate diagnosis often leads to inappropriate treatment strategies and adverse clinical outcomes, highlighting the need for reliable, data-driven decision-support tools. In this study, we propose a robust hybrid machine learning framework that integrates class balancing, latent subgroup discovery, and ensemble learning to improve the accuracy and consistency of BD identification from tabular clinical data. The framework applies the Synthetic Minority Over-sampling Technique (SMOTE) exclusively to the training data to address class imbalance, followed by Gaussian Mixture Model (GMM) based clustering to uncover latent patient subgroups and generate informative probabilistic features. These enriched features are subsequently used to train an optimized Extreme Gradient Boosting (XGBoost) classifier. Experimental evaluation on an independent test set demonstrates that the proposed model achieves 93% accuracy, 97% sensitivity (recall), 93% precision, 95% F1-score, and 79% specificity. When evaluated under identical experimental conditions, the proposed framework consistently outperforms baseline classifiers, including Support Vector Machine, Decision Tree, Logistic Regression, and Random Forest, with performance improvements ranging from 6 to 12%, depending on the comparator. The results indicate that combining SMOTE-based data balancing, GMM-driven latent feature enrichment, and gradient-boosted decision trees yields a scalable, interpretable, and clinically relevant decision-support system. This study supports the adoption of hybrid, data-driven approaches for early BD screening and personalized treatment planning in psychiatric healthcare settings.
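The metrics reported in the abstract follow the standard binary confusion-matrix definitions. The sketch below is illustrative only and is not the authors' evaluation code: the names `ConfusionCounts`, `metrics`, and `f1_score` are hypothetical, the positive class is assumed to be BD, and the example counts are arbitrary rather than taken from the study. It also checks that the reported precision (93%) and sensitivity (97%) are consistent with the reported F1-score (95%).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (assumption: positive class = BD)."""
    tp: int  # BD cases correctly flagged
    fp: int  # non-BD cases incorrectly flagged
    tn: int  # non-BD cases correctly cleared
    fn: int  # BD cases missed


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # recall, true-positive rate
    specificity = c.tn / (c.tn + c.fp)   # true-negative rate
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score(precision, sensitivity),
    }


if __name__ == "__main__":
    # Arbitrary illustrative counts, NOT data from the paper.
    example = ConfusionCounts(tp=8, fp=1, tn=9, fn=2)
    for name, value in metrics(example).items():
        print(f"{name}: {value:.3f}")

    # Consistency check using only figures stated in the abstract.
    print(f"F1 from reported precision/recall: {f1_score(0.93, 0.97):.4f}")
```

The last line evaluates 2 × 0.93 × 0.97 / (0.93 + 0.97), which rounds to the 95% F1-score given in the abstract.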

Topics & Keywords

Digital Mental Health Interventions · Bipolar Disorder and Treatment · Face recognition and analysis

Publication Details

Published in: Scientific Reports

DOI: 10.1038/s41598-026-39104-3

Field-Weighted Citation Impact: 0.00
