With the rapid advancement of technology and Artificial Intelligence (AI), the misuse of AI-generated content has increased significantly. Since the onset of the COVID-19 pandemic, AI-generated fake (deepfake) videos, audio, and images have proliferated markedly. While considerable progress has been achieved on video and image deepfakes, audio deepfakes remain relatively underexplored. Moreover, with the growing popularity of Machine Learning (ML) and Deep Learning (DL) methods, classical signal processing approaches, which emphasize capturing essential features from speech signals, are increasingly overlooked in favor of large-scale data-driven models and transformer-based features. In this study, we address the Audio Deepfake Detection (ADD) task using classical wavelet-based signal processing techniques. Specifically, we extract discriminative features for the ADD task using six types of wavelets: Bump, Morlet, Morse, Shannon, Mexican Hat, and Derivative of Gaussian (DoG). Among these, the Bump wavelet combined with a Convolutional Neural Network (CNN) classifier achieved the highest detection accuracy of 94.15%. Furthermore, to assess the real-world applicability of the proposed approach, we also conducted a latency-based analysis.
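To illustrate the kind of wavelet-based feature extraction the abstract describes, the sketch below computes a continuous wavelet transform (CWT) magnitude scalogram of a toy signal using the Mexican Hat wavelet, one of the six wavelets listed. This is a minimal, self-contained illustration, not the paper's implementation: the scale range, wavelet support, and toy signal are all illustrative choices, and in practice the resulting scalogram would be fed to a CNN classifier as a 2-D input.

```python
import numpy as np

def mexican_hat(t, scale):
    # Mexican Hat (Ricker) wavelet: negative 2nd derivative of a Gaussian,
    # L2-normalized per scale (illustrative normalization).
    x = t / scale
    return (2.0 / (np.sqrt(3.0 * scale) * np.pi ** 0.25)) \
        * (1.0 - x ** 2) * np.exp(-x ** 2 / 2.0)

def cwt_scalogram(signal, scales):
    # CWT by direct convolution of the signal with the scaled wavelet
    # at each scale; rows = scales, columns = time samples.
    out = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)      # support of ~8 scale widths
        out[i] = np.convolve(signal, mexican_hat(t, s), mode="same")
    return np.abs(out)                        # magnitude scalogram

# Toy "speech" frame: two tones occupying different time intervals.
fs = 8000
t = np.arange(0, 0.1, 1.0 / fs)
sig = np.concatenate([np.sin(2 * np.pi * 200 * t),
                      np.sin(2 * np.pi * 800 * t)])

scalogram = cwt_scalogram(sig, scales=np.arange(1, 33))
print(scalogram.shape)  # (32, 1600): 32 scales x 1600 time samples
```

The time-scale image localizes each tone in both time and scale, which is the discriminative structure a CNN can exploit; the Bump and Morse wavelets used in the study are available in tools such as MATLAB's `cwt`, while Morlet, Shannon, Mexican Hat, and DoG variants are also available in PyWavelets.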