AI agents are increasingly integrated into software testing workflows, including test case generation, regression prioritization, execution analysis, and defect triage. While these capabilities can improve throughput and expand exploratory coverage, their adoption in release-relevant contexts remains methodologically undergoverned. Current studies predominantly evaluate the task-level utility of model outputs but provide limited guidance on how to define delegation boundaries, approval authority, evidence requirements, and escalation rules when AI-generated artifacts may affect merge or release decisions. The aim of this study is to develop a governance-oriented framework for the controlled use of AI agents in software testing. The research applies a design-oriented conceptual methodology that synthesizes evidence from the software engineering, software testing, and trustworthy artificial intelligence governance literature. As a result, two linked methodological artifacts are proposed: a human-in-the-loop assurance model and an activity-level risk/control matrix. The model distinguishes assistive, supervised, and conditionally autonomous modes, while the matrix relates testing activities, artifact criticality, and autonomy level to required human controls, traceability obligations, and escalation triggers. Risk is operationalized through a heuristic composite formulation (S = C + I + U + L + V), used to calibrate governance intensity rather than to support statistical prediction. The practical value of the study lies in providing a structured baseline for integrating AI-agent-supported testing into continuous integration and continuous delivery workflows. The main limitation is the absence of empirical cross-context validation of the proposed governance parameters.
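As a rough illustration of how the composite score S = C + I + U + L + V could calibrate governance intensity, the sketch below sums five factor scores and maps the total to one of the three modes named in the abstract. The 1-5 score range, the threshold values, and the score-to-mode mapping are assumptions introduced here for illustration only; the paper's risk/control matrix defines the actual factor semantics and calibration.

```python
# Illustrative sketch only. Factor meanings, the 1-5 ranges, and the
# thresholds below are assumptions, not values taken from the study.

from dataclasses import dataclass


@dataclass
class RiskAssessment:
    """Composite risk score S = C + I + U + L + V for one testing activity."""
    c: int  # factor scores, each assumed to lie in 1..5
    i: int
    u: int
    l: int
    v: int

    @property
    def score(self) -> int:
        # Additive composite, as in the heuristic formulation S = C + I + U + L + V.
        return self.c + self.i + self.u + self.l + self.v

    def governance_mode(self) -> str:
        """Map the composite score to a human-oversight mode.

        Higher risk implies more governance intensity and less agent autonomy.
        Threshold values are hypothetical placeholders.
        """
        s = self.score
        if s <= 10:
            return "conditionally autonomous"  # low risk: agent may act, with traceability
        if s <= 18:
            return "supervised"                # medium risk: human approves AI-generated artifacts
        return "assistive"                     # high risk: agent suggests only, human authors


if __name__ == "__main__":
    assessment = RiskAssessment(c=4, i=3, u=2, l=3, v=4)
    print(assessment.score, assessment.governance_mode())
```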
Published in: International Science Journal of Engineering & Agriculture
Volume 5, Issue 2, pp. 21-27