Abstract The class imbalance problem significantly hinders the ability of software defect prediction models to distinguish defective (minority-class) instances from non-defective (majority-class) ones. Undersampling techniques used in traditional defect prediction models have well-known drawbacks: discarding majority-class samples loses information, and altering the data distribution biases the model. Recently, researchers have sought to address the imbalance problem with techniques that weight the characteristics of instances; using these weights, samples can be ranked by their relevance so that those deemed less critical to the prediction task can be removed. However, current feature weighting approaches often focus on the significance of individual features and overlook the overall contribution of samples with high feature sums, which may carry critical information for classification and risk being disproportionately removed, degrading performance. This limitation reflects a feature-centric perspective that treats feature importance independently and ignores the collective contribution of instances to the predictive task. To address this challenge, this paper proposes a Learning-to-Rank Feature Weighting (LTRFW) framework, which shifts the focus from individual features to instances by integrating a Learning-to-Rank model with Differential Evolution (DE) for feature weight optimization. The DE algorithm optimizes the weight vector, guided by grid search and Pearson's correlation analysis, to rank majority-class instances and perform selective undersampling, enabling the model to retain informative samples while effectively mitigating class imbalance.
Experimental results on four classifiers and 21 datasets demonstrate that LTRFW consistently outperforms existing undersampling methods, improving the F1-score by 14.8% and the AUC by 21.9% on average over state-of-the-art baselines such as LTRUS and COSTE. Although the average Cliff's $\delta$ is approximately 0.12, a small effect size, the improvements are consistently statistically significant ($p < 0.05$), indicating that the gains achieved by LTRFW are robust, reproducible, and practically meaningful.
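The core idea described above can be sketched in a few lines: learn a feature-weight vector with a simple DE loop, score each majority-class instance by its weighted feature sum, and keep only the highest-ranked majority instances. This is a minimal, self-contained illustration, not the paper's implementation; the synthetic data, the correlation-based fitness function, and the DE parameters (`F`, `CR`, population size) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 200 majority (label 0) vs. 20 minority (label 1)
# instances with 5 numeric features (stand-ins for software metrics).
X = np.vstack([rng.normal(0.0, 1.0, (200, 5)), rng.normal(1.0, 1.0, (20, 5))])
y = np.array([0] * 200 + [1] * 20)

def fitness(w):
    """Absolute Pearson correlation between weighted instance scores and labels."""
    return abs(np.corrcoef(X @ w, y)[0, 1])

def differential_evolution(fit, dim, pop_size=20, gens=50, F=0.8, CR=0.9):
    """Minimal DE/rand/1/bin loop that maximizes `fit` (illustrative settings)."""
    pop = rng.uniform(-1, 1, (pop_size, dim))
    fits = np.array([fit(p) for p in pop])
    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = a + F * (b - c)                      # mutation
            cross = rng.random(dim) < CR                  # binomial crossover
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = fit(trial)
            if f_trial > fits[i]:                         # greedy selection
                pop[i], fits[i] = trial, f_trial
    return pop[np.argmax(fits)]

w = differential_evolution(fitness, dim=X.shape[1])

# Rank majority instances by weighted score; keep the 20 highest-ranked
# (most informative under this score) for a balanced 20-vs-20 set.
maj_idx = np.where(y == 0)[0]
keep = maj_idx[np.argsort(X[maj_idx] @ w)[-20:]]
X_bal = np.vstack([X[keep], X[y == 1]])
y_bal = np.array([0] * 20 + [1] * 20)
print(X_bal.shape)  # (40, 5)
```

In this sketch the fitness function rewards weight vectors whose instance scores separate the classes, so the retained majority instances are those most relevant to the decision boundary rather than a random subset, which is the motivation the abstract gives for ranking-based undersampling.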
Published in: International Journal of Computational Intelligence Systems
Volume 19, Issue 1