Phishing attacks continue to threaten information security by impersonating legitimate brands to deceive users. Accurate identification of the targeted brand name is the foundation of effective phishing detection, because it tells a system which legitimate entity an attacker is attempting to spoof. This identification serves multiple purposes: detecting phishing attempts, analyzing criminal targeting patterns, and protecting brand reputation. Traditional machine learning approaches to brand identification suffer from a critical limitation: they require complete retraining whenever new brands emerge or existing ones evolve, leading to detection gaps and high operational costs. Moreover, existing systems either depend on pre-collected brand-specific data, which limits coverage, or rely on real-time search queries, which introduce prohibitive latency for large-scale operations. In this study, we propose BrandSpotter, a framework that specializes large language models (LLMs) for brand name extraction and applies them to targeted brand name identification. While modern general-purpose LLMs can process extensive token sequences, they demand substantial computational resources; BrandSpotter achieves efficiency by limiting token capacity and applying task-specific fine-tuning to create a lightweight model optimized for brand extraction. The extraction task of BrandSpotter operates without requiring pre-collected brand-specific artifacts such as logos or screenshots. It relies only on simple brand name strings for final labeling, which eliminates the need for retraining when the brand list changes, and it processes each sample in approximately 10 milliseconds on average. To evaluate BrandSpotter, we trained and tested the model on datasets whose training and test samples cover different brands, and assessed its performance on targeted brand name identification.
The results demonstrate that BrandSpotter can identify targeted brand names with an accuracy of 94.0%, even when the brand list differs from the one used during training. Furthermore, the model successfully identifies brand names in samples containing brands unknown to the model.
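
As an illustration of why labeling against plain brand name strings requires no retraining, the following minimal sketch maps an extracted brand string onto an editable brand list. It is not the BrandSpotter implementation: the abstract does not specify how extracted strings are matched to the list, so the normalization rule, the fuzzy-matching step, the `cutoff` value, and the names `normalize` and `label_brand` are all assumptions made for this example.

```python
import difflib
import re
from typing import Optional, Sequence


def normalize(name: str) -> str:
    """Lowercase and keep only ASCII letters and digits.

    Assumption for this sketch: brand names are compared in a simplified
    ASCII form, so non-ASCII brand names would need a different rule.
    """
    return re.sub(r"[^a-z0-9]", "", name.lower())


def label_brand(
    extracted: str,
    brand_list: Sequence[str],
    cutoff: float = 0.85,  # hypothetical similarity threshold
) -> Optional[str]:
    """Map an extracted brand string to an entry of brand_list, or None.

    The brand list is ordinary data: adding or removing a brand changes
    the labeling result without touching the extraction model.
    """
    key = normalize(extracted)
    if not key:
        return None

    # Normalized form -> canonical brand name as written in the list.
    index = {normalize(brand): brand for brand in brand_list}

    # Exact match on the normalized form.
    if key in index:
        return index[key]

    # Otherwise fall back to the closest entry above the cutoff, if any.
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


if __name__ == "__main__":
    brands = ["PayPal", "Amazon"]
    print(label_brand("pay-pal", brands))
    print(label_brand("netflix", brands))
    # Extending the list is the only step needed to support a new brand.
    print(label_brand("netflix", brands + ["Netflix"]))
```

In this sketch, a string that matches no list entry yields `None`, which a caller could treat as a brand outside the current list rather than forcing it into a known label.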