Feature selection is essential for improving classification performance and reducing overfitting in high-dimensional learning tasks. However, conventional importance-based methods often suffer from instability, model bias, and sensitivity to threshold settings. To address these limitations, we propose EFSHB (Ensemble Feature Selection using Hierarchical Binning), a hybrid ensemble framework that integrates importance-based sorting, bin-level greedy evaluation, iterative hierarchical refinement, and union-based integration of model-wise selected features. At each iteration, five tree-based models independently perform bin-wise greedy selection, and their selected subsets are merged through a union operation to form the feature set for the next iteration. This iterative process progressively refines the feature space while mitigating model-specific bias and promoting robust predictive performance across heterogeneous models. EFSHB was evaluated on nine high-dimensional benchmark datasets, including biomedical gene-expression, synthetic, proteomics, and speech-feature data. Across all datasets, EFSHB achieved the highest or near-highest classification accuracy, outperforming traditional Greedy Feature Selection (GFS), binning-based GFS (GFSB), and hierarchical binning GFS (GFSHB). On average, EFSHB improved accuracy for all classifiers, achieving mean gains of 14.0% over GFS and 13.3% over GFSHB. EFSHB also provided balanced feature reduction by avoiding excessive feature retention while preserving complementary informative features identified across models. In terms of computational efficiency, EFSHB reduced average feature selection time from 266 min (GFS) to 11 min, corresponding to a 24-fold speed-up. These results demonstrate that EFSHB achieves robust predictive performance and high computational efficiency, making it suitable for diverse high-dimensional applications.
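The per-iteration procedure described above (importance-based sorting, bin-level greedy evaluation, and union-based integration across models) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the bin count, the toy scoring function, and all function names are assumptions introduced for illustration.

```python
def split_into_bins(ranked_features, n_bins):
    """Partition an importance-ranked feature list into roughly equal bins."""
    size = max(1, len(ranked_features) // n_bins)
    return [ranked_features[i:i + size] for i in range(0, len(ranked_features), size)]

def greedy_bin_selection(bins, score_fn):
    """Bin-level greedy step: keep a bin only if adding it improves the score."""
    selected, best = [], score_fn([])
    for b in bins:
        candidate = selected + b
        s = score_fn(candidate)
        if s > best:
            selected, best = candidate, s
    return selected

def efshb_iteration(features, importance_fns, score_fns, n_bins=4):
    """One EFSHB-style round: per-model bin-wise greedy selection, then union merge.

    Each (importance_fn, score_fn) pair stands in for one tree-based model;
    the union of the per-model subsets becomes the feature set for the next round.
    """
    union = set()
    for imp, score in zip(importance_fns, score_fns):
        ranked = sorted(features, key=imp, reverse=True)  # importance-based sorting
        bins = split_into_bins(ranked, n_bins)
        union |= set(greedy_bin_selection(bins, score))   # union-based integration
    return sorted(union)

# Toy usage: 12 features, of which {0, 1, 2} are informative (an assumed setup).
informative = {0, 1, 2}

def toy_score(subset):
    # Hypothetical scorer: rewards informative features, lightly penalizes noise.
    s = set(subset)
    return len(s & informative) - 0.1 * len(s - informative)

toy_importance = lambda f: 1.0 if f in informative else 0.0
selected = efshb_iteration(list(range(12)),
                           [toy_importance, toy_importance],  # two "models"
                           [toy_score, toy_score],
                           n_bins=4)
# → [0, 1, 2]: the first bin is kept, the noise-only bins are rejected.
```

In the full method this round is applied iteratively, with the union feeding the next iteration until the feature set stabilizes; real importance scores and accuracy estimates would come from the five tree-based models rather than the toy functions used here.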