This repository is the replication package for Look Back Before You Bisect: A Risk-Aware Approach for Efficient Bisection. It contains the code and artifacts needed to reproduce the paper's Mozilla history-based workflow for risk-aware culprit localization. The package covers four connected parts of the study: construction of the MozillaJIT dataset from Mozilla Bugzilla and Autoland history, conversion of MozillaJIT into the structured commit representation used for learning, training and inference of commit-level risk models on MozillaJIT, and simulation of lookback-plus-bisection localization strategies using those risk scores. The standalone MozillaJIT dataset used by this package is archived separately at https://doi.org/10.5281/zenodo.18829451. For more details and instructions on how to replicate the research, see the README file in the root of the package.

Within the repository, the MozillaJIT extraction pipeline is implemented in data_extraction/bugzilla/, data_extraction/mercurial/, data_extraction/data_preparation.py, and data_extraction/utils.py. These scripts collect Bugzilla bugs, export ordered Autoland commit metadata, link regressions to landed code changes, compute one net diff per bug, and generate the structured XML-like diff representation used by the models.

The risk-model part of the paper is implemented in llama/train.py, llama/run_inference.py, and the MozillaJIT-specific config templates under llama/configs/templates/, which support the ModernBERT and LLaMA-3.1 8B experiments described in the paper.

The simulation framework used for the paper's end-to-end evaluation is implemented in analysis/git_bisect/. It includes the lookback strategies, the bisection strategies, the main simulator entrypoint, the precomputed risk prediction files for the evaluation and final-test splits, and frozen copies of the paper's final result JSON files and Pareto-front artifacts.
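To illustrate what simulating a lookback-plus-bisection strategy involves, the following is a minimal sketch rather than the code in analysis/git_bisect/. It assumes an ordered list of commits with one risk score each, a known-bad last commit, and a hypothetical `is_bad(i)` oracle that reports whether the build at commit `i` shows the regression. The function names and the rule of probing only the single highest-risk commit are illustrative assumptions, not the paper's strategies.

```python
from typing import Callable, Sequence, Tuple


def midpoint_bisect(lo: int, hi: int, is_bad: Callable[[int], bool]) -> Tuple[int, int]:
    """Standard midpoint bisection over commit indices lo..hi.

    Assumes the culprit lies in [lo, hi] and commit hi is bad.
    Returns (culprit_index, number_of_tests).
    """
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(mid):
            hi = mid        # regression already present at mid
        else:
            lo = mid + 1    # regression introduced after mid
    return lo, tests


def risk_lookback_then_bisect(
    risk: Sequence[float], is_bad: Callable[[int], bool]
) -> Tuple[int, int]:
    """Hypothetical lookback: probe around the highest-risk commit, then bisect."""
    lo, hi = 0, len(risk) - 1
    tests = 0
    top = max(range(len(risk)), key=risk.__getitem__)

    # Probe the parent of the top-risk commit to place the culprit on one side.
    if top > 0:
        tests += 1
        if is_bad(top - 1):
            hi = top - 1    # culprit is earlier than the top-risk commit
        else:
            lo = top        # culprit is the top-risk commit or later

    # If the parent was good, probe the top-risk commit itself.
    if lo == top and top < hi:
        tests += 1
        if is_bad(top):
            hi = top        # parent good, commit bad: culprit confirmed
        else:
            lo = top + 1

    culprit, extra = midpoint_bisect(lo, hi, is_bad)
    return culprit, tests + extra
```

With 16 commits and the culprit at index 11, plain bisection needs 4 tests. The sketch needs 2 tests when the risk peak sits on the culprit and 6 tests when the peak sits on index 3, which shows why a lookback step only pays off when the risk scores are informative.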
The repository name jit-dp-llm is historical, and several code paths use short internal names that differ from the terminology used in the paper. In particular, the paper's StandardMidpointBisection appears in the code as GitBisectBaseline with strategy code GB, and the paper's NoLookback + StandardMidpointBisection baseline appears in result files as NLB+GB. Likewise, the paper's localization-step metric appears in the simulator outputs as test-count metrics such as total_tests, mean_tests_per_search, and max_tests_per_search. The archived MozillaJIT snapshot bundled for experiments also uses historical filenames such as mozilla_jit_2022.jsonl and jit_llm_struc_2022.jsonl, while some scripts still default to the generic names mozilla_jit.jsonl and jit_llm_struc_2025.jsonl. This README documents those name differences so the replication workflow can be followed without ambiguity.
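The name differences listed above can be captured in a small lookup helper when post-processing results. This is only a sketch: the helper names are hypothetical, the NLB entry is inferred from the combined label NLB+GB, and pairing each default filename with its 2022 counterpart is an assumption about how the snapshot files correspond.

```python
from pathlib import Path
from typing import Optional

# Strategy codes used in result files, mapped to the paper's terminology.
PAPER_NAME_BY_CODE = {
    "GB": "StandardMidpointBisection",  # GitBisectBaseline in the code
    "NLB": "NoLookback",                # inferred from the label NLB+GB
}

# Generic default names some scripts use, mapped to the bundled snapshot names.
# The pairing is an assumption for illustration.
FILENAME_ALIASES = {
    "mozilla_jit.jsonl": "mozilla_jit_2022.jsonl",
    "jit_llm_struc_2025.jsonl": "jit_llm_struc_2022.jsonl",
}


def paper_label(result_label: str) -> str:
    """Translate a result-file label such as 'NLB+GB' into paper terms."""
    parts = result_label.split("+")
    return " + ".join(PAPER_NAME_BY_CODE.get(part, part) for part in parts)


def resolve_dataset(directory: Path, default_name: str) -> Optional[Path]:
    """Return the first existing file among the default name and its alias."""
    for name in (default_name, FILENAME_ALIASES.get(default_name)):
        if name and (directory / name).is_file():
            return directory / name
    return None
```

For example, `paper_label("NLB+GB")` yields `NoLookback + StandardMidpointBisection`, and unknown codes pass through unchanged so that other strategy labels are not mistranslated.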