Background: The global burden of mental disorders continues to escalate, necessitating scalable, evidence-based interventions. Artificial intelligence (AI)-delivered health promotion programs represent a promising approach to addressing treatment gaps by targeting the neuropsychological mechanisms that underlie mental health outcomes. This meta-analysis synthesizes evidence on the effectiveness of AI-delivered interventions in improving executive function, emotion regulation, and clinical outcomes across diverse populations. Methods: A systematic search identified 186 studies (n = 22,755 participants) published between 2020 and 2025. Random-effects meta-analyses estimated pooled effect sizes (Hedges’ g, calculated as between-group standardized mean differences with small-sample correction [J = 1 − 3/(4df − 1)]) for primary outcomes. Between-study heterogeneity was quantified using I² and τ² statistics. To address dependency among effect sizes from studies reporting multiple outcomes, robust variance estimation (RVE) was employed. Subgroup analyses examined intervention modalities, delivery formats, and clinical populations. Moderator analyses explored sources of heterogeneity, including publication year, sample size, intervention duration, control condition type, risk-of-bias rating, geographic region, and AI sophistication tier, and mediational models tested putative therapeutic mechanisms. Results: AI-delivered interventions demonstrated a significant overall effect on health outcomes (g = 0.68, 95% CI [0.58, 0.78]; τ² = 0.12; I² = 73.4%). Executive function outcomes showed moderate effects (g = 0.61, τ² = 0.08), with working memory improvements being strongest (g = 0.72). 
Emotion regulation outcomes demonstrated moderate-to-large effects (g = 0.61, 95% CI [0.51, 0.70], τ² = 0.006); formal subgroup pooled estimates by emotion regulation strategy were not calculated due to insufficient studies per strategy (k < 3 per category); individual study effect sizes ranged from g = 0.27 to g = 1.11. Among 41 studies examining neuropsychological mechanisms, convergent patterns suggested involvement of prefrontal neural circuits (DLPFC), enhanced alpha-band activity, and improved heart rate variability; however, formal mediation was tested in only 18 studies (9.7%). Among clinical populations, interventions for cognitive impairment yielded the largest effects (g = 1.02; this finding should be interpreted cautiously given the modest cumulative sample size [N = 482], potential small-study effects [Egger’s p = 0.08], and a trim-and-fill adjusted estimate of g = 0.85), followed by mental health conditions (g = 0.72), while other clinical populations showed smaller but significant improvements (g = 0.19). Mobile applications (g = 0.78) and chatbot-based interventions (g = 0.74) demonstrated the strongest effects among delivery formats. Among studies testing formal mediation, analyses suggested mindfulness (β = 0.42), decentering (β = 0.38), and cognitive reappraisal (β = 0.45) as processes associated with therapeutic outcomes. Conclusions: AI-delivered health promotion programs demonstrate significant effectiveness across executive function, emotion regulation, and clinical outcomes, though substantial heterogeneity (I² = 45–82%) indicates meaningful variability warranting attention to subgroup-specific effects. Given the diversity of intervention types included (chatbots, mobile apps, VR systems, neuromodulation), pooled estimates should be interpreted as characterizing the average effect across this heterogeneous landscape; subgroup-specific estimates provide more precise guidance for clinical decision-making regarding specific modalities. 
Effects are associated with convergent patterns of neuropsychological mechanisms, though mechanistic conclusions remain preliminary given that only 22% of studies (41/186) examined neuropsychological mechanisms, and formal mediation analyses were conducted in only 18 studies (9.7%); most of the mechanistic evidence is correlational rather than causal. Future research should establish standardized AI taxonomies, optimize adaptive algorithms, conduct adequately powered replication studies in populations with cognitive impairment, prioritize experimental mediation designs to establish causal pathways, and evaluate long-term maintenance effects with follow-up periods of at least 6–12 months.
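
The effect-size and pooling steps described in the Methods can be illustrated with a minimal sketch. It computes Hedges' g with the small-sample correction J = 1 − 3/(4df − 1) and then pools the estimates under a random-effects model, reporting τ² and I². Several parts are assumptions rather than the authors' procedure: the DerSimonian-Laird estimator for τ² (the abstract does not name an estimator), the standard large-sample variance formula for the standardized mean difference, and the function names `hedges_g` and `random_effects`, which are hypothetical. Robust variance estimation for dependent effect sizes is not implemented here.

```python
import math


def hedges_g(m1, s1, n1, m2, s2, n2):
    """Return (g, var_g) for a between-group standardized mean difference.

    m, s, n are the mean, standard deviation, and sample size of each group.
    """
    df = n1 + n2 - 2
    # Pooled within-group standard deviation
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    d = (m1 - m2) / s_pooled
    # Small-sample correction factor from the Methods: J = 1 - 3/(4df - 1)
    j = 1 - 3 / (4 * df - 1)
    # Assumed large-sample variance of d (common textbook approximation)
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects(gs, vs):
    """Pool effect sizes gs with sampling variances vs.

    Uses the DerSimonian-Laird tau^2 estimator (an assumption; the source
    does not state which estimator was used).
    """
    k = len(gs)
    w = [1 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, gs)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, gs))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # I^2: percentage of total variability attributed to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance
    w_star = [1 / (v + tau2) for v in vs]
    pooled = sum(wi * gi for wi, gi in zip(w_star, gs)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "g": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical summary statistics for three studies (not from the source)
    studies = [
        (12.0, 3.0, 40, 10.0, 3.2, 42),
        (25.5, 6.0, 30, 21.0, 5.5, 28),
        (8.1, 2.0, 60, 7.6, 2.1, 58),
    ]
    effects = [hedges_g(*s) for s in studies]
    result = random_effects([g for g, _ in effects], [v for _, v in effects])
    print(result)
```

Because I² is derived from Cochran's Q, it describes the proportion of observed variability that exceeds sampling error, whereas τ² is on the scale of the effect size itself. This is why a pooled estimate can show a high I² alongside a numerically small τ².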