Most organizations know their AI systems may be biased. What they lack is a systematic way to find where that bias lives — and what to do about it. This playbook provides that structure. It is the first in a series of four governance instruments developed within a doctoral research program in Information Systems Management, with regulatory alignment to major AI governance frameworks.

The framework sits at the intersection of two bodies of work that have remained largely separate: the mathematical foundations of algorithmic fairness (Barocas, Hardt & Narayanan, 2023; Chouldechova, 2017; Kleinberg et al., 2017) and high-level organizational governance standards, including the NIST AI RMF 1.0 and the EU AI Act. This playbook neither introduces new fairness theory nor serves as a policy document. It provides the procedural infrastructure that bridges the two, translating both into executable practice.

Framework Overview

Phase 1 — Contextual Discovery (HCAT)

The Historical Context Assessment Tool (HCAT) is a systematic pre-analysis protocol that surfaces proxy variables embedded in data through historical discrimination patterns — risks that algorithmic feature removal alone cannot detect. It runs before any technical work begins, bringing together data scientists and domain specialists to examine the data's origins and embedded assumptions. The conditions under which HCAT proxy detection is insufficient are explicitly noted within the tool itself.

Phase 2 — Fairness Definition Selection

The decision tree translates abstract ethical goals into mathematically rigorous metric selection. It proceeds in two steps. The first is empirical: are base rates equal across groups in the training data? This determines whether the impossibility constraints established independently by Chouldechova (2017) and Kleinberg et al. (2017) bind: when base rates differ, it is not possible to satisfy all common group fairness criteria simultaneously.
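The empirical step amounts to comparing positive-outcome rates per group. A minimal sketch in plain Python follows; the labels, group names, and the 0.05 tolerance are illustrative assumptions, not values taken from the playbook:

```python
from collections import defaultdict

def base_rates(labels, groups):
    """Per-group base rate: share of positive outcomes (label == 1)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for y, g in zip(labels, groups):
        counts[g][0] += y
        counts[g][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

# Illustrative training labels for two hypothetical groups A and B.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A"] * 5 + ["B"] * 5

rates = base_rates(labels, groups)
gap = max(rates.values()) - min(rates.values())
print(rates)    # {'A': 0.6, 'B': 0.4}
if gap > 0.05:  # illustrative tolerance for "equal" base rates
    print("Base rates differ: the impossibility constraints bind.")
```

When the gap exceeds whatever tolerance the team has agreed on, the decision tree proceeds under the assumption that the impossibility results apply and a single primary metric must be chosen.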
The second step is normative: which type of error causes more harm in this specific deployment context? These two determinations route practitioners to a single primary metric appropriate to the nature of the intervention — punitive or assistive — and the relative cost of false positives versus false negatives. Metric selection outputs are designed to require Legal and Compliance sign-off before proceeding — a requirement formalized in the governance framework detailed in the Implementation Guide.

Phase 3 — Bias Identification and Prioritization

This stage introduces a taxonomy of six bias sources: Measurement, Representation, Aggregation, Evaluation, Deployment, and Feedback Loop. The taxonomy identifies where bias originates; a prioritization matrix then scores each identified fairness bug by Severity × Persistence × Remediation Difficulty to determine what must be fixed before deployment. Fairness bugs exceeding a priority score of 50 are deployment-blocking. An exception escalation protocol is triggered when intersectional harm is identified below the 50-point threshold: in such cases the same sign-off chain (Product Manager approval, Legal review) is required regardless of the numeric score.

Phase 4 — Metrics and Validation

Quantitative measurement is anchored by six deployment gates: Adverse Impact Ratio (AIR ≥ 0.80, the four-fifths threshold codified in the EEOC Uniform Guidelines, 29 CFR § 1607), Equal Opportunity Difference (EOD), Equalized Odds Difference, Demographic Parity Difference, Bootstrap CI Width, and Intersectional AIR. Bootstrapped 95% confidence intervals separate statistically significant disparities from sampling noise, with sample-size thresholds clearly flagged where those intervals become unreliable. The playbook also includes a mandatory intersectional Min-Max scan and executable Python code implementing all fairness metrics and validation steps, requiring no external fairness libraries.
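The AIR gate and its bootstrap interval can be sketched with the standard library alone. All data, group names, and function names below are invented for illustration; the playbook's own implementation may differ:

```python
import random

def adverse_impact_ratio(outcomes, groups, protected, reference):
    """AIR = selection rate of the protected group / selection rate of the reference group."""
    def rate(g):
        sel = [o for o, grp in zip(outcomes, groups) if grp == g]
        return sum(sel) / len(sel)
    return rate(protected) / rate(reference)

def bootstrap_ci(outcomes, groups, protected, reference,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the AIR; a wide interval flags an unreliable sample."""
    rng = random.Random(seed)
    n = len(outcomes)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        o = [outcomes[i] for i in idx]
        g = [groups[i] for i in idx]
        try:
            stats.append(adverse_impact_ratio(o, g, protected, reference))
        except ZeroDivisionError:
            continue  # resample happened to contain no members of one group
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi

# Illustrative outcomes: 1 = approved. Reference group: 40/50; protected group: 25/50.
outcomes = [1] * 40 + [0] * 10 + [1] * 25 + [0] * 25
groups = ["ref"] * 50 + ["prot"] * 50

air = adverse_impact_ratio(outcomes, groups, "prot", "ref")
lo, hi = bootstrap_ci(outcomes, groups, "prot", "ref")
print(round(air, 3))  # 0.625 -> fails the 0.80 four-fifths gate
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
```

A fixed seed keeps the interval reproducible across audit runs; the percentile method is the simplest bootstrap variant and is one place where small group sizes would make the interval unreliable.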
The framework treats fairness auditing as a repeatable engineering discipline — moving from contextual data analysis through metric selection, bias taxonomy, and statistical validation to a signed deployment decision.

Included Case Study

The framework is applied to a complete audit of a constructed illustrative scenario: an internal employee bridge loan assistance program. An initial review concluded the model was unbiased because it excluded explicit protected attributes. Applying the HCAT revealed that Years_at_Current_Address and Voluntary_Overtime functioned as proxies for race and caregiver status respectively.

Aggregate analysis produced an 8% gender gap within common tolerance thresholds — a result that would ordinarily close the review. Intersectional analysis told a different story. On the Race × Gender axis, Black women faced a 22-percentage-point approval gap despite qualifications and loan amounts comparable to those of approved applicants in the same risk band — a disparity invisible at the aggregate level. A separate scan on the Race × Family Status axis identified a 14-percentage-point approval gap for single-parent households versus dual-income households in the same risk band; this secondary finding was flagged as a Fairness Bug with remediation deferred to the next retraining cycle.

Following remediation, the Adverse Impact Ratio rose from 0.66 to 0.85, clearing the Four-Fifths Rule threshold and meeting EU AI Act non-discrimination requirements for high-risk credit systems. The default rate shifted by less than 0.5%, confirming that the biased features carried no predictive value that neutral alternatives could not replace.

Scope and Limitations

This playbook addresses traditional supervised ML systems deployed in lending, hiring, healthcare, and fraud detection contexts. It does not cover generative AI or agentic systems, and its proxy-detection logic assumes a structured tabular data environment.
Each instrument includes explicit discussion of the conditions under which its assumptions break down — for instance, the HCAT's reliance on available demographic baselines, or the bootstrapping module's sensitivity to small group sizes.

Series Context

This is Playbook 1 of a four-part Algorithmic Fairness Framework:

Playbook 1 (this document): Fairness Audit Playbook — Bias Detection and Diagnosis
Playbook 2: Fairness Intervention Playbook — Mitigation Strategies and Causal Analysis
Playbook 3: Fairness Implementation Playbook — Organizational Governance and Agile Integration
Playbook 4: Fairness Monitoring Playbook — Prevention and Continuous Drift Detection

Each playbook is independently usable but designed to build on the previous one. Together, they form a full lifecycle framework — from initial audit through remediation, governance integration, and long-term monitoring — developed within a doctoral research program examining the organizational conditions required for responsible AI deployment in enterprise contexts.

References

Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities. MIT Press.
Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), 153–163.
Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. Proceedings of Innovations in Theoretical Computer Science (ITCS).
National Institute of Standards and Technology. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST.