Major Depressive Disorder (MDD) is frequently under-identified in educational contexts, including arts and design programs, where limited clinical resources and privacy concerns restrict routine screening. Studio-based curricula may introduce critique- and portfolio-driven stressors, motivating low-burden decision support for triage rather than diagnosis. Face-based machine learning can provide a non-intrusive signal when designed to be privacy-preserving and compute-efficient; however, prior work often relies on multimodal inputs, under-reports calibration/robustness, and is not optimized for on-device deployment. We present OCFA (Optimized CNN-Based Facial Analysis), a lightweight face-only pipeline that integrates (1) a RobFaceNet-style backbone with Adapt-Coordinate Attention, (2) multi-objective evolutionary model selection under explicit efficiency and reliability constraints, and (3) post-hoc temperature scaling to improve probabilistic calibration with zero inference-time FLOPs. Because supervision is available at the interview/session level rather than per frame, OCFA aggregates frame-level evidence into a session score (pooling/MIL variants) while enforcing strict subject/session-wise separation to prevent identity leakage. Experiments use the DAIC-WOZ depression subset with PHQ-8-derived labels (binary screening cutoff PHQ-8 ≥ 10) and official train/dev/test partitions; cross-domain generalization is assessed on E-DAIC as an external benchmark with a controlled shift in interview dynamics (WoZ-controlled vs AI-controlled sessions). On the official DAIC-WOZ test split, OCFA achieves 82.98% accuracy, 82.61% F1, AUROC = 0.886, and post-scaling ECE ≈ 0.040 at 0.065 GMac (112×112) with 3.80 M parameters. Under the same frozen operating point (no external-set re-tuning), OCFA attains 81.10% accuracy, 80.20% F1, and AUROC = 0.874 on E-DAIC.
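The post-hoc temperature-scaling step described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the grid-search range for T, and the equal-width-bin ECE estimator are all assumptions. A single scalar T is fit on held-out (dev) logits by minimizing negative log-likelihood, then divides logits at inference, which adds no meaningful compute.

```python
import numpy as np

def scaled_prob(logit, T=1.0):
    """Sigmoid of a temperature-scaled logit (binary screening case)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(logit, dtype=float) / T))

def nll(probs, labels, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probs."""
    p = np.clip(probs, eps, 1 - eps)
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

def fit_temperature(dev_logits, dev_labels, grid=np.linspace(0.25, 5.0, 200)):
    """Pick the scalar T that minimizes dev-set NLL (simple grid search;
    the grid bounds are illustrative, not from the paper)."""
    losses = [nll(scaled_prob(dev_logits, T), dev_labels) for T in grid]
    return float(grid[int(np.argmin(losses))])

def ece(probs, labels, n_bins=10):
    """Expected Calibration Error with equal-width confidence bins."""
    preds = (probs >= 0.5).astype(int)
    conf = np.where(preds == 1, probs, 1 - probs)   # confidence in the prediction
    acc = (preds == labels).astype(float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    err, total = 0.0, len(probs)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            err += mask.sum() / total * abs(acc[mask].mean() - conf[mask].mean())
    return err
```

Fitting T on the dev partition and then freezing it matches the "same frozen operating point" protocol the abstract describes for the E-DAIC transfer.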
For privacy-aligned interpretability, we report SHAP-based global feature importance without exposing identifiable face imagery. OCFA is designed as a calibrated, on-device compatible risk estimator for human-in-the-loop screening workflows (including art and design schools). Prospective validation under real educational capture conditions remains necessary before operational deployment.
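The frame-to-session aggregation mentioned in the abstract (pooling/MIL variants under session-level supervision) can be sketched as below. This is an illustrative reading, not the paper's exact aggregator: the mode names, the top-k value, and the 0.5 decision threshold are assumptions; the labels the threshold targets come from the PHQ-8 ≥ 10 screening cutoff.

```python
import numpy as np

def session_score(frame_probs, mode="mean", k=16):
    """Aggregate per-frame depression probabilities into one session score.

    'mean' is plain average pooling over frames; 'topk' is a simple
    MIL-style variant that averages the k most confident frames.
    Both mode names and the default k are illustrative assumptions.
    """
    p = np.asarray(frame_probs, dtype=float)
    if mode == "mean":
        return float(p.mean())
    if mode == "topk":
        k = min(k, p.size)
        return float(np.sort(p)[-k:].mean())
    raise ValueError(f"unknown mode: {mode}")

def screen(frame_probs, threshold=0.5, mode="mean"):
    """Session-level binary screening decision (label regime: PHQ-8 >= 10)."""
    return int(session_score(frame_probs, mode=mode) >= threshold)
```

Because supervision exists only per session, any split must group all frames (and sessions) of one subject on the same side of the train/test boundary; otherwise the model can match identities rather than depression cues, the identity-leakage failure the abstract guards against.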
Published in: Brain Research Bulletin
Volume 237, Article 111829