Mental disorders have emerged as a major contributor to global healthcare challenges. Deep learning methods based on fMRI and EEG have improved the efficiency and accuracy of detecting certain mental disorders, but they often entail substantial costs for equipment and trained staff. Furthermore, most models are designed for a specific mental disorder rather than serving as potential tools for widespread screening. This paper focuses on the emotional expression features of mental disorders and introduces a diagnosis model based on audio-visual signals. The proposed model incorporates a spatio-temporal (S-T) attention mechanism combined with Convolutional Neural Networks (CNNs) and employs Real-Time Gradient Modulation (RTGM). The model effectively captures audio-visual features while dynamically adjusting the contributions of the two modalities during training to optimize performance for two mental disorders. Additionally, we introduce dynamically varying Gaussian noise to prevent the potential degradation of generalization ability caused by modulation. The effectiveness and feasibility of the proposed model are validated through comparative analyses of various networks, fusion strategies, and modulation methods across three datasets focused on the diagnosis and analysis of two mental disorders: ADHD and depression. The proposed model demonstrates state-of-the-art performance, achieving over 90% accuracy for ADHD classification and improving depression score estimation on AVEC 2013 and AVEC 2014.

• Unified audio-visual deep learning for scalable, low-cost ADHD and depression pre-screening.
• Real-Time Gradient Modulation balances modalities and preserves interpretable cues.
• Co-developed with CNTW-NHS; validation shows robust results.
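The abstract does not give the exact formulation of RTGM, but the core idea, rebalancing per-modality gradients during training and injecting Gaussian noise to offset the generalization loss caused by down-scaling, can be sketched. Below is a minimal PyTorch sketch; the function `modulate_gradients`, the confidence scores `score_a`/`score_v`, and the tanh-based coefficient are illustrative assumptions, not the authors' published method.

```python
import math
import torch

def modulate_gradients(audio_params, visual_params, score_a, score_v,
                       alpha=1.0, noise_std=1e-4):
    """Illustrative modality-wise gradient modulation (assumed form).

    Rescales per-modality gradients between loss.backward() and
    optimizer.step(), then injects small Gaussian noise to compensate
    for the stochasticity lost by down-scaling.

    score_a, score_v: scalar confidence of each unimodal branch on the
    current batch (e.g. mean softmax probability of the true class).
    """
    ratio = score_a / (score_v + 1e-8)  # > 1 means audio dominates
    # Down-weight only the currently dominant modality; tanh keeps the
    # scaling coefficient inside (0, 1].
    coeff_a = 1.0 - math.tanh(alpha * (ratio - 1.0)) if ratio > 1.0 else 1.0
    coeff_v = 1.0 - math.tanh(alpha * (1.0 / ratio - 1.0)) if ratio < 1.0 else 1.0

    for params, coeff in ((audio_params, coeff_a), (visual_params, coeff_v)):
        for p in params:
            if p.grad is None:
                continue
            p.grad.mul_(coeff)
            # Zero-mean Gaussian noise scaled to the gradient magnitude.
            p.grad.add_(torch.randn_like(p.grad) * noise_std * p.grad.abs().mean())
```

In a training loop, this would be called after `loss.backward()` and before `optimizer.step()`, so the optimizer applies the rescaled, noise-perturbed gradients; the noise term plays the role the abstract attributes to dynamically varying Gaussian noise.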
Published in: Biomedical Signal Processing and Control
Volume 120, Article 110164