Background Aortic dissection (AD) is a life-threatening cardiovascular emergency with high morbidity and mortality. Accurate risk prediction is essential for timely intervention, yet traditional statistical models often fail to capture the complex, nonlinear interactions inherent in AD pathophysiology. In recent years, machine learning (ML) has emerged as a promising approach to improve prognostic accuracy. However, the overall performance, methodological quality, and clinical applicability of ML-based prediction models for AD have not been comprehensively evaluated.

Objective This systematic review and meta-analysis followed PRISMA, CHARMS, and TRIPOD guidelines and was registered with PROSPERO (CRD420251154262). Six major databases (PubMed, Web of Science, Cochrane Library, Embase, CNKI, Wanfang) were searched from inception to September 30, 2025. Studies developing or validating ML models for predicting adverse outcomes in AD were included. Data extraction adhered to CHARMS, and risk of bias was assessed using PROBAST. Meta-analysis synthesized C-statistics (AUC) using fixed- or random-effects models depending on heterogeneity. Subgroup, sensitivity, and publication bias analyses were performed.

Results Forty studies were included, covering outcomes such as early mortality, long-term mortality, acute kidney injury (AKI), neurological complications, gastrointestinal bleeding, mesenteric malperfusion, and composite adverse events. ML algorithms included random forest, SVM, XGBoost, LightGBM, neural networks, and ensemble approaches. The pooled C-statistic demonstrated excellent discriminative performance for early mortality (0.891, 95% CI: 0.854–0.927) and long-term mortality (0.847, 95% CI: 0.794–0.900), and strong performance for AKI prediction (0.825, 95% CI: 0.756–0.894). Many complication-specific models achieved AUCs above 0.90. However, these estimates must be interpreted with extreme caution.
Significant heterogeneity was observed across analyses (I² = 61.3–78.8%), and the PROBAST assessment revealed that 100% (40/40) of studies were at high or unclear risk of bias, predominantly due to deficiencies in the analysis domain (e.g., inadequate events-per-variable ratios, lack of external validation). Adherence to TRIPOD reporting standards was suboptimal (average 78.7%), with critical shortcomings in reporting predictor definitions (62.5% unreported), sample size justification (82.5% unreported), and full model specifications (72.5% unreported). Other common limitations were a near-absence of robust external validation (only 5 of 40 studies) and inconsistent outcome definitions. Furthermore, over a quarter (27.5%) of models omitted calibration assessment, and decision-curve analysis was rarely performed, limiting insights into clinical utility.

Conclusion ML-based prediction models demonstrate strong potential for risk stratification in AD across multiple clinically relevant outcomes. However, current evidence does not justify their routine clinical implementation. The high reported performance metrics are likely optimistic estimates derived from methodologically weak studies. Future research should emphasize rigorous analytic frameworks, standardized outcome definitions, transparent reporting, and, most critically, multicenter external validation before these tools can be considered for real-world clinical use.

Systematic Review Registration https://www.crd.york.ac.uk/PROSPERO/view/CRD420251154262, identifier CRD420251154262.
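
The pooling step described above (C-statistics combined under a fixed- or random-effects model depending on heterogeneity) can be illustrated with a minimal sketch. It is not the review's actual analysis code. It assumes inverse-variance weighting on the raw AUC scale, the DerSimonian-Laird estimator for between-study variance, and a switch to random effects when I² exceeds 50%; the abstract does not state the estimator, the scale, or the threshold, so all three are assumptions. The function names `se_from_ci` and `pool_auc` are hypothetical, and the input values in the usage example are illustrative, not data from the included studies.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a symmetric 95% CI (assumed normal)."""
    return (upper - lower) / (2 * Z_95)


def pool_auc(aucs, ses, i2_threshold=50.0):
    """Pool AUCs by inverse-variance weighting.

    Uses a fixed-effect model unless I^2 exceeds `i2_threshold` (an assumed
    cut-off), in which case DerSimonian-Laird random effects are used.
    """
    if len(aucs) != len(ses) or len(aucs) < 2:
        raise ValueError("need at least two studies with matching AUCs and SEs")

    # Fixed-effect estimate: weights are 1 / within-study variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    fixed = sum(w * a for w, a in zip(weights, aucs)) / total_w

    # Cochran's Q and the I^2 statistic (percentage of variation due to heterogeneity).
    q = sum(w * (a - fixed) ** 2 for w, a in zip(weights, aucs))
    df = len(aucs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    if i2 > i2_threshold:
        # Random effects: add tau^2 to every study's variance before weighting.
        weights = [1.0 / (se ** 2 + tau2) for se in ses]
        total_w = sum(weights)
        pooled = sum(w * a for w, a in zip(weights, aucs)) / total_w
        model = "random"
    else:
        pooled, model = fixed, "fixed"

    se = math.sqrt(1.0 / total_w)
    return {
        "model": model,
        "pooled": pooled,
        "ci": (pooled - Z_95 * se, pooled + Z_95 * se),
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Illustrative inputs only: (AUC, CI lower, CI upper) for three hypothetical studies.
    studies = [(0.91, 0.86, 0.96), (0.84, 0.79, 0.89), (0.88, 0.82, 0.94)]
    aucs = [s[0] for s in studies]
    ses = [se_from_ci(s[1], s[2]) for s in studies]
    print(pool_auc(aucs, ses))
```

Pooling on the raw AUC scale keeps the sketch short; an analysis might instead pool logit-transformed AUCs, which keeps the confidence limits inside the 0 to 1 range when estimates are close to 1.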