High-dimensional structured datasets are common in fields such as semiconductor manufacturing, healthcare, and finance, where redundant and irrelevant features often increase computational cost and reduce predictive accuracy. Feature selection mitigates these issues by identifying a compact, informative subset of features, which improves model efficiency, performance, and interpretability. This study proposes Maximal Information Coefficient–Simplified Swarm Optimization (MIC-SSO), a two-stage hybrid feature selection method that combines MIC as a filter with SSO as a wrapper. In Stage 1, MIC ranks features by relevance and removes low-contribution features; in Stage 2, SSO searches the reduced feature space for an optimal subset, using a fitness function that integrates the Matthews Correlation Coefficient (MCC) and the feature reduction rate to balance accuracy and compactness. Experiments on five public datasets compare MIC-SSO with multiple hybrid, heuristic, and literature-reported methods; MIC-SSO achieves higher predictive accuracy and greater feature compression, and statistical tests confirm that the improvements over competing approaches are significant. These results indicate that the method effectively integrates the efficiency of filters with the precision of wrappers for high-dimensional tabular data analysis, making it applicable to domains such as healthcare, finance, and semiconductor manufacturing.
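
To make the Stage 2 objective concrete, the following is a minimal sketch of a fitness function that combines MCC with the feature reduction rate. The abstract states only that the two terms are integrated; the weighted-sum form, the weight `alpha`, and the function names `mcc`, `reduction_rate`, and `fitness` are assumptions made for illustration, not the authors' exact formulation.

```python
import math
from typing import Sequence


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews Correlation Coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def reduction_rate(n_selected: int, n_total: int) -> float:
    """Fraction of the original features that were discarded."""
    return 1.0 - n_selected / n_total


def fitness(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_selected: int,
    n_total: int,
    alpha: float = 0.9,  # hypothetical weight; not taken from the paper
) -> float:
    """Assumed weighted sum: alpha * MCC + (1 - alpha) * reduction rate."""
    return alpha * mcc(y_true, y_pred) + (1.0 - alpha) * reduction_rate(
        n_selected, n_total
    )


if __name__ == "__main__":
    # Toy predictions from a classifier trained on 5 of 20 features.
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 0]
    print(fitness(y_true, y_pred, n_selected=5, n_total=20))
```

In a wrapper search, each candidate subset would be scored by training a classifier on the selected columns and passing its validation predictions to `fitness`. Under this assumed form, a larger `alpha` favors predictive quality and a smaller one favors compact subsets.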