Recent advancements in speech emotion recognition (SER) have centered primarily on selecting effective features from acoustic data. This study introduces a novel SER algorithm that operates on raw speech data to enhance recognition accuracy, eliminating the need for manually selected acoustic features. Our approach integrates a Residual Convolutional Neural Network (R-CNN) that detects emotions directly from raw speech signals with a Conformer Transformer that captures long-range dependencies and temporal features in speech. The R-CNN processes the raw audio and extracts emotional cues for accurate classification, capturing subtle emotion-driven nuances that methods relying on pre-selected features may overlook. In parallel, the Conformer Transformer learns complex representations of the emotional content, and Long Short-Term Memory (LSTM) layers model the sequential nature of the speech signal, further enhancing the emotion recognition process. Evaluated on three public datasets spanning multiple languages, the proposed model demonstrates a notable improvement in accuracy and interpretability by leveraging both emotional and temporal information. These results highlight the benefits of a multi-model framework that combines deep learning architectures, pushing the boundaries of affective computing through a more holistic understanding of speech data.
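To make the described architecture concrete, the following is a minimal PyTorch sketch of such a multi-branch pipeline, assuming a residual 1-D CNN front-end over the raw waveform, a standard Transformer encoder standing in for the Conformer blocks, and an LSTM for sequential modelling. The class name `MultiBranchSER`, all layer sizes, and the four-emotion output are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class MultiBranchSER(nn.Module):
    """Hypothetical sketch: residual CNN over raw audio, a Transformer
    encoder as a stand-in for the Conformer, then an LSTM. Hyperparameters
    are illustrative, not taken from the paper."""
    def __init__(self, n_emotions: int = 4, d_model: int = 64):
        super().__init__()
        # CNN front-end on the raw waveform: (batch, 1, samples) -> frames
        self.conv1 = nn.Conv1d(1, d_model, kernel_size=80, stride=16)
        self.conv2 = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        # Transformer encoder approximating the Conformer's role of
        # capturing long-range dependencies across frames
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # LSTM branch for the sequential structure of speech
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, n_emotions)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) of raw audio, no hand-crafted features
        x = torch.relu(self.conv1(wav.unsqueeze(1)))
        x = x + torch.relu(self.conv2(x))       # residual connection
        x = x.transpose(1, 2)                   # (batch, frames, d_model)
        x = self.encoder(x)                     # long-range context
        x, _ = self.lstm(x)                     # temporal modelling
        return self.head(x.mean(dim=1))         # pooled emotion logits

model = MultiBranchSER()
logits = model(torch.randn(2, 16000))           # two 1-second clips at 16 kHz
print(logits.shape)                             # (2, 4): one logit per emotion
```

In a real training setup these logits would be passed through a cross-entropy loss against the dataset's emotion labels; mean-pooling over frames is the simplest aggregation choice and could be replaced by attention pooling.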
Published in: Technix International Journal for Engineering Research
Volume 13, Issue 3