Multimodal sentiment analysis (MSA) has emerged as one of the most dynamic and rapidly advancing areas within artificial intelligence. By combining audio, visual, and textual data, it yields a richer understanding of emotions expressed in online communication. Unlike unimodal sentiment analysis, which frequently misses cues such as sarcasm or cross-cultural emotional signals, MSA employs more sophisticated methods to address these shortcomings, including attention mechanisms, hierarchical fusion, and transformer-based architectures. This study presents a critical assessment of 58 studies published between 2010 and 2025, following the PRISMA methodology to limit the risk of biased literature selection. The main topics covered are fusion techniques (early, late, and hybrid), advanced feature extraction approaches, and benchmark datasets (e.g., CMU-MOSEI, MELD, MOSEAS). Problems discussed in detail include high computational complexity, poor cross-modal synchronization, dataset bias, and a lack of real-time applications. The review also notes a shortage of interdisciplinary work and of common ground between psychological theories and AI models. Applications in healthcare, education, human–computer interaction (HCI), and mood monitoring demonstrate the real-world applicability of MSA. Finally, the study identifies key research gaps and suggests future directions toward multimodal systems that are scalable, culturally sensitive, and ethically responsible, and that can operate in multilingual and dynamic environments.