Business Email Compromise (BEC) is among the most financially damaging forms of cyber-enabled fraud, using social engineering to bypass conventional email security controls. Existing detection mechanisms, which are predominantly rule-based or static, adapt poorly to the dynamic, context-aware, and linguistically nuanced strategies of modern attackers. This study proposes an adversarially resilient hybrid detection framework that integrates Natural Language Processing (NLP), classical machine learning models (Support Vector Machines and Random Forest), and deep learning architectures, including Long Short-Term Memory (LSTM) networks and Bidirectional Encoder Representations from Transformers (BERT). To address the scarcity of labeled BEC datasets, a controlled synthetic data augmentation strategy was implemented using a fine-tuned Generative Pre-trained Transformer (GPT) to generate high-fidelity adversarial email samples. A hybrid feature engineering approach captures the linguistic, structural, metadata, stylometric, and contextual attributes of BEC emails. Models were trained and evaluated using stratified cross-validation, with performance assessed by accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). SHapley Additive exPlanations (SHAP) were used to make feature contributions interpretable. Among the reported results, the LSTM model achieved the highest accuracy at 98.5%, compared with 95.3% for Random Forest, 94.8% for Support Vector Machines, and 85.4% for baseline rule-based approaches.
The proposed framework demonstrates strong potential for real-world deployment within enterprise email security ecosystems. Future work will focus on multilingual detection, real-time system integration, and large-scale validation within operational Security Operations Center (SOC) environments.
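The evaluation protocol named in the abstract (stratified cross-validation scored by accuracy, precision, recall, F1-score, and AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`stratified_folds`, `binary_metrics`, `auc_roc`, `cross_validate`, `fit_midpoint`), the single numeric feature, and the toy midpoint classifier are all assumptions chosen to keep the example self-contained in the Python standard library. Labels are assumed to be binary, with 1 marking a BEC email.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split indices into k folds, keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_roc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(features, labels, k, fit, threshold=0.5):
    """Average each metric over k stratified folds.

    `fit(train_x, train_y)` must return a function mapping a feature to a score.
    """
    results = []
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        score = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        y_true = [labels[i] for i in held_out]
        scores = [score(features[i]) for i in held_out]
        fold = binary_metrics(y_true, [1 if s >= threshold else 0 for s in scores])
        fold["auc_roc"] = auc_roc(y_true, scores)
        results.append(fold)
    return {name: sum(r[name] for r in results) / k for name in results[0]}


def fit_midpoint(train_x, train_y):
    """Toy stand-in model: threshold halfway between the two class means."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1.0 if x >= midpoint else 0.0


if __name__ == "__main__":
    # Hypothetical per-email "suspicion" feature; not data from the study.
    features = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.2, 0.1, 0.3]
    labels = [1, 1, 1, 0, 0, 0, 0, 0, 0]
    print(cross_validate(features, labels, k=3, fit=fit_midpoint))
```

Stratification matters here because BEC emails are typically a small minority of mail: without it, a fold could contain no positive examples, leaving recall and AUC-ROC undefined for that fold. In practice the `fit` argument would wrap whichever model is being compared, such as an SVM, Random Forest, or LSTM.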
Published in: JOURNAL OF HIGH-FREQUENCY COMMUNICATION TECHNOLOGIES
Volume 04, Issue 02, pp. 466-480
DOI: 10.58399/ecfa6495