This paper investigates the task of automatic word alignment in parallel texts, a fundamental step in training machine translation systems, conducting comparative linguistic studies, and creating linguistic resources. Given the scarcity of annotated data for many language pairs, Large Language Models (LLMs) are particularly attractive here because of their strong generalization capabilities and their ability to solve tasks without extensive fine-tuning on target datasets. This study presents a comparative analysis of the effectiveness of modern general-purpose LLMs versus specialized alignment algorithms on Russian-English parallel data. Ten state-of-the-art models (including Gemini 3 Pro, GPT-5.2, and Claude Sonnet 4.5) were tested under different prompting strategies (zero-shot, few-shot), alongside five baseline approaches ranging from statistical methods (fast-align, eflomal) to neural network architectures (AwesomeAlign, AccAlign, BinaryAlign). Performance was evaluated with Precision, Recall, F-measure, and Alignment Error Rate (AER) on annotated data from the Russian National Corpus. Experimental results indicate that the specialized BinaryAlign algorithm retains the lead in overall alignment quality (F-measure of 0.883, AER of 0.113). However, the leading LLMs, specifically Gemini 3 Pro Preview and GPT-5.2, surpassed most classical and early neural baselines. Notably, for the most effective models, including in-context examples often reduced performance compared to the zero-shot setting. Modern LLMs can thus serve as a reliable tool for high-quality alignment in the absence of training data, opening new perspectives for processing low-resource language pairs.
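The evaluation metrics named above (Precision, Recall, F-measure, AER) have a standard formulation for word alignment, due to Och and Ney, in terms of the predicted link set A and the gold "sure" (S) and "possible" (P) link sets with S ⊆ P. A minimal sketch of that computation, assuming the conventional definitions (the link sets in the example are illustrative, not taken from the paper's data):

```python
def alignment_metrics(predicted, sure, possible):
    """Standard word-alignment metrics (Och & Ney, 2003):
    Precision = |A ∩ P| / |A|, Recall = |A ∩ S| / |S|,
    AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|)."""
    A, S = set(predicted), set(sure)
    P = set(possible) | S  # sure links are also possible
    precision = len(A & P) / len(A) if A else 0.0
    recall = len(A & S) / len(S) if S else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    aer = 1.0 - (len(A & S) + len(A & P)) / (len(A) + len(S))
    return precision, recall, f1, aer

# Toy example: links are (source_index, target_index) pairs.
pred = [(0, 0), (1, 1), (2, 3)]
sure = [(0, 0), (1, 1), (2, 2)]
poss = [(2, 3)]
p, r, f, aer = alignment_metrics(pred, sure, poss)
print(p, round(r, 3), round(aer, 3))  # → 1.0 0.667 0.167
```

Note that lower AER is better, and because precision is measured against the permissive set P while recall is measured against the strict set S, AER is not simply 1 minus the F-measure.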
Published in: Modeling and Analysis of Information Systems
Volume 33, Issue 1, pp. 48-61