Multiple Sclerosis (MS) diagnosis and progression prediction remain challenging due to the scarcity of labeled medical data. This paper presents a feature engineering pipeline that extracts lesion features from MRI scans and uses machine learning (ML) techniques to predict Expanded Disability Status Scale (EDSS) scores. The pipeline is designed to maximize the predictive value extracted from standard MRI scans, particularly in low-data scenarios. Our primary contribution is a novel integration of existing methods: we first generate a 3D brain mesh from lesion masks using the Marching Cubes algorithm, then apply Density-Based Spatial Clustering of Applications with Noise (DBSCAN) directly to the mesh vertices to identify distinct lesion clusters, and finally quantify these clusters by calculating their Lesion Volume to Region Volume Ratio against a standardized brain atlas. This approach captures fine-grained spatial lesion characteristics often missed by simpler, global metrics. To address the inherent class imbalance in these small datasets, we applied the Synthetic Minority Over-sampling Technique (SMOTE) during model training. We assessed several ML models, including Support Vector Machines (SVM), Decision Trees, a Multilayer Perceptron (MLP), and ensemble methods (Random Forest, XGBoost, AdaBoost). On a public dataset of only 60 patients, a simple MLP trained on our extracted features achieved an accuracy of 87%, demonstrating the effectiveness of our pipeline in low-data scenarios. In conclusion, our study demonstrates the potential of this feature pipeline; however, its clinical applicability must be established through future validation on larger, multi-center datasets.
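The clustering and ratio steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the Marching Cubes step is skipped, integer voxel coordinates stand in for mesh vertices, the atlas is a hypothetical dictionary from voxel to region name, voxels are assumed to have uniform size (so voxel counts are proportional to volume), and the function names and the `eps` and `min_samples` values are chosen purely for illustration.

```python
from collections import Counter, deque
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Minimal brute-force DBSCAN; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Radius query over all points; the result includes point i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


def lesion_to_region_ratio(lesion_voxels, atlas):
    """Per atlas region: lesion voxel count divided by region voxel count."""
    region_size = Counter(atlas.values())
    lesion_size = Counter(atlas[v] for v in set(lesion_voxels) if v in atlas)
    return {region: lesion_size[region] / n for region, n in region_size.items()}


# Toy atlas (assumption): a 10-voxel strip split into two regions, "A" and "B".
atlas = {(x, 0, 0): ("A" if x < 5 else "B") for x in range(10)}

# Toy lesion voxels: two compact groups and one isolated voxel.
lesion = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 0, 0), (7, 0, 0), (8, 0, 0)]

labels = dbscan(lesion, eps=1.0, min_samples=2)

# One ratio dictionary per lesion cluster; noise points are left out.
ratios = {
    c: lesion_to_region_ratio([p for p, l in zip(lesion, labels) if l == c], atlas)
    for c in sorted(set(labels))
    if c != NOISE
}
```

In this toy case the isolated voxel is labeled as noise, and each remaining cluster yields one ratio per atlas region, which is the kind of per-cluster, per-region feature vector the abstract describes feeding to the downstream models. A real pipeline would likely replace the brute-force neighbor search with a spatial index and derive volumes from the scan's actual voxel spacing.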
Published in: International Journal of Computational Intelligence Systems