As software systems evolve, test suites tend to grow in size and often contain redundant test cases. Such redundancy increases testing effort, time, and cost. Test suite minimization (TSM) aims to eliminate this redundancy while preserving key properties, such as requirement coverage and fault-detection capability. In this paper, we propose RTM (Requirement coverage-guided Test suite Minimization), a novel TSM approach designed for requirement-based testing (validation). RTM reduces test suite redundancy while ensuring full requirement coverage and a high fault detection rate (FDR) under a fixed minimization budget. Following common practice in critical systems where functional safety is important, we assume that test cases are specified in natural language and traced to requirements before implementation. RTM uses a text embedding technique to convert test cases into vector representations, and then applies a distance function to compute similarity values between pairs of test cases. Guided by these similarity values, a Genetic Algorithm (GA), whose population is initialized using a coverage-preserving strategy, searches for an optimal subset of diverse test cases that matches the budget. We investigate three preprocessing methods for test cases, seven text embedding techniques, three distance functions, and three initialization strategies for the GA. We evaluate RTM on an industrial automotive system dataset comprising 736 system test cases covering 54 requirements. Experimental results show that, while remaining scalable in terms of runtime, RTM outperforms all baseline techniques in terms of FDR for most minimization budgets while maintaining full requirement coverage. Furthermore, we investigate the impact of test suite redundancy levels on the effectiveness of TSM, providing new insights into optimizing requirement-based test suites under practical constraints.
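The pipeline described above (embed test cases, compute pairwise distances, then run a GA with coverage-preserving initialization under a fixed budget) can be illustrated with a minimal sketch. This is not the RTM implementation. It makes several assumptions: a bag-of-words count vector stands in for the learned text embeddings, cosine distance is the only distance function, fitness is the mean pairwise distance of the selected subset, and the function names (`embed`, `coverage_preserving_subset`, `minimize`), the GA operators, and the parameters are all hypothetical.

```python
import math
import random
from collections import Counter


def embed(text):
    """Stand-in embedding (assumption): bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def covers_all(subset, traces, requirements):
    """True if the selected test indices together cover every requirement."""
    covered = set()
    for i in subset:
        covered |= traces[i]
    return requirements <= covered


def coverage_preserving_subset(traces, requirements, budget, rng):
    """Pick a random covering test per uncovered requirement, then pad randomly to the budget."""
    chosen, covered = set(), set()
    for req in rng.sample(sorted(requirements), len(requirements)):
        if req in covered:
            continue
        pick = rng.choice([i for i, t in enumerate(traces) if req in t])
        chosen.add(pick)
        covered |= traces[pick]
    if len(chosen) > budget:
        raise ValueError("budget too small for this covering selection")
    rest = [i for i in range(len(traces)) if i not in chosen]
    chosen |= set(rng.sample(rest, budget - len(chosen)))
    return frozenset(chosen)


def diversity(subset, dist):
    """Fitness (assumption): mean pairwise distance among the selected tests."""
    items = sorted(subset)
    pairs = [(a, b) for k, a in enumerate(items) for b in items[k + 1:]]
    return sum(dist[a][b] for a, b in pairs) / len(pairs) if pairs else 0.0


def minimize(tests, traces, budget, generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    requirements = set().union(*traces)
    vectors = [embed(t) for t in tests]
    n = len(tests)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    population = [coverage_preserving_subset(traces, requirements, budget, rng)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the more diverse half as parents (elitist selection).
        population.sort(key=lambda s: diversity(s, dist), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            p1, p2 = rng.sample(parents, 2)
            # Crossover: draw a budget-sized child from the union of both parents.
            child = set(rng.sample(sorted(p1 | p2), budget))
            outside = [i for i in range(n) if i not in child]
            if outside and rng.random() < 0.3:
                # Mutation: swap one selected test for an unselected one.
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice(outside))
            # Discard children that break full requirement coverage.
            children.append(frozenset(child) if covers_all(child, traces, requirements) else p1)
        population = parents + children
    return sorted(max(population, key=lambda s: diversity(s, dist)))


# Toy data (invented for illustration): five test cases traced to three requirements.
tests = [
    "verify brake light turns on when pedal pressed",
    "verify brake light turns on when pedal is pressed",
    "check wiper speed increases with rain intensity",
    "check door lock engages above 10 km/h",
    "verify door lock engages when speed above 10 km/h",
]
traces = [{"R1"}, {"R1"}, {"R2"}, {"R3"}, {"R3"}]
selected = minimize(tests, traces, budget=3)
print(selected)
```

Because every individual in the initial population covers all requirements and non-covering children are rejected, the returned subset always preserves full coverage at exactly the requested budget. The diversity fitness then steers the search away from near-duplicate test cases.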