Abstract: Chronic psychological stress impairs cognitive performance, academic outcomes, and long-term well-being, yet most automated detection systems rely on a single sensing modality, limiting their robustness under real-world conditions. Unimodal approaches (whether vision-based, physiological, or acoustic) are individually vulnerable to noise, occlusion, and signal artifacts, motivating the need for integrated multimodal frameworks. This paper presents Neurolens, a real-time multimodal stress detection system that concurrently processes facial video through a fine-tuned You Only Look Once version 8 (YOLOv8) model trained on a publicly available facial emotion dataset; wearable physiological signals, including electrodermal activity (EDA), blood volume pulse (BVP), and skin temperature, through a hybrid convolutional neural network–long short-term memory (CNN-LSTM) architecture trained on the WESAD (Wearable Stress and Affect Detection) dataset; and speech audio through a Wav2Vec2 transformer-based speech encoder. A weighted late-fusion module integrates per-modality stress scores into a unified Stress Index rendered on an interactive real-time dashboard with adaptive push notifications and ambient brightness control. System demonstrations confirm correct identification of stress-indicative facial states such as anger and elevated physiological arousal from CSV-uploaded sensor data, alongside neutral baseline detection with appropriately reduced Stress Index values. These results establish Neurolens as a scalable, non-invasive, and reproducible framework for continuous passive stress monitoring in academic, clinical, and professional environments.

Keywords: multimodal fusion; Wav2Vec2; CNN-LSTM; facial emotion recognition; speech emotion recognition; wearable sensors.
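The weighted late-fusion step described in the abstract can be sketched as follows. This is a minimal illustration only: the modality names, weights, and score values are assumptions for demonstration, not parameters reported by the paper.

```python
# Minimal sketch of weighted late fusion over per-modality stress
# scores in [0, 1]. Weights and modality names are illustrative,
# not the values used by Neurolens.

def fuse_stress_scores(scores, weights):
    """Combine per-modality stress scores into a single Stress Index.

    scores:  dict mapping modality name -> stress score in [0, 1]
    weights: dict mapping modality name -> non-negative weight
    Returns a weighted average over the modalities present in `scores`,
    so a missing modality (e.g. an occluded camera) is simply skipped.
    """
    total_weight = sum(weights[m] for m in scores)
    if total_weight == 0:
        raise ValueError("no usable modalities")
    return sum(weights[m] * s for m, s in scores.items()) / total_weight

# Hypothetical facial, physiological, and speech scores with assumed weights.
weights = {"face": 0.4, "physio": 0.4, "speech": 0.2}
scores = {"face": 0.8, "physio": 0.6, "speech": 0.5}
index = fuse_stress_scores(scores, weights)
```

Normalizing by the sum of weights for the modalities actually present lets the fused index degrade gracefully when one sensing stream drops out, which matters for the noise and occlusion conditions the abstract highlights.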
Published in: INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
Volume 10, Issue 03, pp. 1-9
DOI: 10.55041/ijsrem58725