Thalassemia is the most prevalent single-gene disorder in Sri Lanka, imposing a significant socioeconomic and healthcare burden. Early carrier detection is essential for genetic counselling and the prevention of thalassemia major births. Current screening programs rely on red cell indices, manual examination of blood smears by expert haematologists, and expensive confirmatory tests such as High-Performance Liquid Chromatography (HPLC) or genetic analysis, which limit large-scale applicability in low-resource settings. This study aims to develop a machine learning–based automated screening tool that integrates Red Blood Cell (RBC) indices and blood smear image analysis to support cost-effective and scalable β-thalassemia trait detection. In a cross-sectional study of 152 individuals (54 confirmed β-thalassemia trait, 98 negative), 30% of the dataset was allocated for independent testing. The remaining 70% was used for model training, with 5-fold cross-validation for the RBC analysis model and 10-fold cross-validation for the image analysis model. RBC features were classified using a Multi-Layer Perceptron (MLP; scikit-learn), while blood smear images, captured through a conventional microscope using a smartphone camera, were analysed using a transfer learning–based VGG-16 CNN (TensorFlow/Keras). Data balancing and image augmentation (rotation, flipping, brightness variation) were applied to address class imbalance and overfitting. A two-step screening pipeline was proposed, applying RBC analysis first, followed by smear image analysis for RBC-negative cases. The RBC indices analysis model achieved 88.2% sensitivity (95% CI: 63.6–98.5%) and 92.9% specificity (95% CI: 76.5–99.1%), while the image analysis model reached 88.2% sensitivity (95% CI: 63.6–98.5%) and 64.3% specificity (95% CI: 44.1–81.4%). 
When integrated sequentially, the combined pipeline achieved 100% sensitivity (95% CI: 80.5–100%) and 60.7% specificity (95% CI: 40.6–78.5%), reducing the overall need for smear preparation by 37.7%. This enhanced dual-modal screening system provides a highly sensitive, potentially cost-effective, and practical solution for β-thalassemia carrier detection, enabling mass screening and supporting sustainable prevention strategies in resource-limited settings. With larger, independent, multi-centre validation, it could be integrated into laboratory workflows to expand screening coverage, while ongoing evaluation improves generalisability and informs policy adoption.
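The sequential decision rule and the binomial confidence intervals reported above can be illustrated with a minimal Python sketch. It assumes the 95% CIs are exact (Clopper-Pearson) intervals, which the abstract does not state. It also uses test-set counts back-calculated from the reported percentages (15 of 17 carriers detected by each single model, 17 of 17 by the combined pipeline); these counts are an inference, not figures given in the text. The function names `two_step_screen` and `clopper_pearson` are hypothetical.

```python
from math import comb


def two_step_screen(rbc_positive, smear_positive):
    """Sequential rule: a positive RBC-indices result flags the case at once;
    only RBC-negative cases proceed to smear image analysis.

    `smear_positive` is a zero-argument callable so the smear step is
    evaluated only when it is actually needed.
    """
    if rbc_positive:
        return True
    return bool(smear_positive())


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""

    def solve(f, target):
        # f decreases monotonically in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2, i.e. P(X <= k - 1 | p) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Counts below are assumptions inferred from the reported percentages.
    for label, k, n in [("single-model sensitivity", 15, 17),
                        ("combined sensitivity", 17, 17)]:
        lo, hi = clopper_pearson(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

Under these assumptions, 15 of 17 and 17 of 17 give intervals that match the reported 63.6–98.5% and 80.5–100% after rounding. The rule is an OR of the two tests: a case is called negative only if both steps are negative. This is why sensitivity rises while specificity falls relative to the RBC model alone.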