Objective: The integration of artificial intelligence (AI) into computer-aided detection (CAD) is a major innovation in lung cancer diagnosis. However, its reliability in detecting the earliest radiographic sign—faint ground-glass opacities (GGOs) indicating pre-invasive adenocarcinoma—remains a critical, unquantified gap. This study aimed to perform a rigorous failure analysis to define the specific conditions under which commercial AI/CAD systems fail in a low-dose CT (LDCT) screening setting.

Methods: In this retrospective diagnostic accuracy study, a primary cohort of 100 patients and an external validation cohort of 50 patients with moderate/low-risk nodules on LDCT were included. An expert reference standard was established by a consensus panel of three thoracic radiologists. Two independent, commercially deployed AI/CAD systems from different vendors (Vendor A and Vendor B) processed all cases. Nodules confirmed by experts but missed by AI were analyzed: their morphology was categorized, and their mean CT attenuation (HU) was measured via manual region-of-interest placement.

Results: The AI systems demonstrated significant and comparable false negative rates in the combined cohort: 12.7% for Vendor A and 14.7% for Vendor B. The vast majority of missed nodules were GGOs (92.3% and 78.6%, respectively, in the primary cohort). Crucially, quantitative analysis revealed a consistent density threshold for AI failure: the mean CT value of missed GGOs was −737 ± 51.5 HU for Vendor A and −727 ± 70.1 HU for Vendor B. This algorithmic blind spot was fully corroborated by the external validation cohort (−741 ± 48.2 HU and −733 ± 62.5 HU, respectively). Anatomical complexity (juxta-pleural/endobronchial location) was a secondary failure factor.

Conclusions: This study identifies a quantifiable “−730 HU blind spot” as a common limitation of current commercial AI/CAD systems in diagnosing early lung adenocarcinoma.
This finding represents a pivotal advance in understanding AI’s role in diagnostics: these systems are not infallible. To preserve screening efficacy, radiologists should adopt a human–AI collaborative model with mandated manual verification of low-attenuation opacities, ensuring this diagnostic innovation fulfills its promise while mitigating the risk of missed early-stage disease.
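The two core measurements reported above — mean CT attenuation within a manually placed region of interest, and the false negative rate of an AI system against an expert reference standard — can be sketched as follows. This is a minimal illustration, not the study's actual analysis pipeline; the function names, the synthetic −740 HU ground-glass patch, and the nodule counts (150 confirmed, 131 detected) are hypothetical stand-ins chosen only to reproduce a rate near Vendor A's 12.7%.

```python
import numpy as np

def mean_hu(ct_slice: np.ndarray, roi_mask: np.ndarray) -> float:
    """Mean CT attenuation (Hounsfield units) inside a region of interest.

    ct_slice: 2-D array of HU values; roi_mask: boolean mask of the same
    shape marking the ROI (stand-in for manual ROI placement).
    """
    return float(ct_slice[roi_mask].mean())

def false_negative_rate(n_expert_confirmed: int, n_ai_detected: int) -> float:
    """Fraction of expert-confirmed nodules the AI system failed to flag."""
    missed = n_expert_confirmed - n_ai_detected
    return missed / n_expert_confirmed

# Toy example: a uniform −740 HU ground-glass patch in a −900 HU lung field.
ct = np.full((64, 64), -900.0)
ct[20:40, 20:40] = -740.0
mask = np.zeros_like(ct, dtype=bool)
mask[20:40, 20:40] = True

print(mean_hu(ct, mask))                # -740.0 — inside the blind-spot range
print(false_negative_rate(150, 131))    # ≈ 0.127, i.e. ~12.7% missed
```

In practice, HU values would come from DICOM pixel data after applying the scanner's rescale slope and intercept, and the ROI mask from the radiologist's annotation tool.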