Out-of-distribution (OOD) failure is usually framed as a problem of generalization: models degrade when deployment inputs differ in relevant respects from those represented in training. That framing is useful, but structurally incomplete. It captures failure relative to a fixed evaluative standard, typically formalized as a loss function, reward signal, or learned preference model, while leaving unaddressed a prior question: whether failure can function, for the system itself, as evidence against that evaluative standard. This paper argues that contemporary machine learning systems cannot internalize that second-order possibility. Where the evaluative function ℒ lies outside the system’s domain of operation, failure can only be processed as deviation under ℒ, not as evidence that ℒ is itself inadequate for the encountered domain. On this basis, the paper distinguishes between first-order failure, understood as degraded performance relative to ℒ, and second-order failure, understood as inadequacy of ℒ as an evaluative ordering over candidate outputs. It then introduces a minimal coherence-monitoring function, C, to distinguish performative failure from normative absence, and argues that current architectures reliably instantiate the former while lacking any system-internal mechanism for registering the latter. OOD failure is therefore reinterpreted not merely as a performance defect, but as a structural signal of epistemic enclosure: a condition in which evaluative standards govern behaviour without themselves being subject to system-level revision. The contribution is classificatory rather than architectural. It clarifies a structural limit of contemporary AI systems, identifies the threshold at which epistemic enclosure would be breached, and situates OOD behaviour within broader questions of evaluative control, epistemic agency, and responsibility.
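The first-order/second-order distinction can be made concrete with a toy sketch. The following is illustrative only, not the paper's formalism: the function names, the threshold, and the ratio-based check are all assumptions introduced for exposition. `first_order_failure` registers degraded performance under a fixed standard ℒ; `coherence_monitor` approximates, from the outside, what a system-internal C would have to do.

```python
# Illustrative toy (not the paper's formalism). All names and thresholds
# here are hypothetical, chosen only to exhibit the structural distinction.

from typing import Callable, Sequence

Loss = Callable[[float, float], float]  # a fixed evaluative standard, L

def first_order_failure(loss: Loss,
                        preds: Sequence[float],
                        targets: Sequence[float],
                        threshold: float) -> bool:
    """First-order failure: average loss under the fixed standard L
    exceeds a tolerance. L itself is never questioned."""
    avg = sum(loss(p, t) for p, t in zip(preds, targets)) / len(preds)
    return avg > threshold

def coherence_monitor(losses_in_domain: Sequence[float],
                      losses_ood: Sequence[float],
                      ratio: float = 10.0) -> bool:
    """A minimal, externally applied analogue of C: flags when L's verdicts
    on a new domain diverge so sharply from its in-domain profile that L
    itself, rather than the model, may be the inadequate component."""
    mean_in = sum(losses_in_domain) / len(losses_in_domain)
    mean_ood = sum(losses_ood) / len(losses_ood)
    return mean_ood > ratio * mean_in
```

Note the asymmetry the sketch makes visible: `first_order_failure` is computable inside the training loop of any current system, whereas `coherence_monitor` runs outside it. The paper's claim is precisely that contemporary architectures contain no internal counterpart to the second function, which is what "epistemic enclosure" names.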