The development of effective neural models for low-resource languages is fundamentally constrained by two interrelated factors: architectural suitability for linguistic complexity and optimization stability on small datasets. This research addresses the critical yet under-explored challenge of optimization instability in character-level sequence modeling for Yoruba, a morphologically rich and tonal language. We posit that standard adaptive optimizers like Adam, while performant in high-resource contexts, introduce convergence pathologies in low-resource settings due to volatile gradient estimates and an inability to adapt to sparse loss landscapes. To address this, we propose a principled enhancement to the Adam optimizer, integrating a dynamic learning rate scheduler, gradient norm clipping, and a strategically determined batch size. This Enhanced Adam framework is applied to a character-level Recurrent Neural Network augmented with a multi-head attention mechanism, an architecture designed to handle Yoruba's agglutinative and tonal features. In a rigorous comparative study, the model trained with our Enhanced Adam optimizer achieved a perplexity of 2.07, a statistically significant 8.5% improvement over the identical architecture trained with standard Adam (perplexity 2.26). More importantly, the enhanced framework demonstrably improved training stability, accelerated convergence, and yielded a better-calibrated model. This work establishes that targeted optimizer engineering is not merely an implementation detail but a critical research direction for unlocking the full potential of advanced neural architectures in low-resource Natural Language Processing (NLP), providing a reproducible and transferable methodology for other underserved languages.
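The abstract's enhancements to Adam can be sketched in a single update step. The sketch below is illustrative only: the hyperparameters (`clip_norm`, `warmup_steps`, `decay`) and the warmup-then-decay schedule are assumptions for exposition, since the abstract does not publish the paper's exact scheduler or clipping values; the Adam moment updates themselves follow the standard formulation.

```python
import numpy as np

def enhanced_adam_step(param, grad, state, step,
                       base_lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                       clip_norm=1.0, warmup_steps=100, decay=0.999):
    """One parameter update combining the three enhancements described
    in the abstract. Hyperparameter values here are illustrative, not
    the paper's published settings."""
    # 1) Gradient norm clipping: rescale the gradient when its L2 norm
    #    exceeds clip_norm, damping volatile gradient estimates.
    norm = np.linalg.norm(grad)
    if norm > clip_norm:
        grad = grad * (clip_norm / norm)

    # 2) Dynamic learning rate: linear warmup, then exponential decay
    #    (one plausible schedule; the paper's exact schedule is not
    #    specified in the abstract).
    if step <= warmup_steps:
        lr = base_lr * step / warmup_steps
    else:
        lr = base_lr * decay ** (step - warmup_steps)

    # 3) Standard Adam first/second moment updates with bias correction.
    b1, b2 = betas
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** step)
    v_hat = state["v"] / (1 - b2 ** step)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

# Usage: minimize the toy objective ||p||^2 from a distant start.
state = {"m": np.zeros(2), "v": np.zeros(2)}
p = np.array([5.0, -3.0])
for t in range(1, 501):
    p = enhanced_adam_step(p, 2 * p, state, t, base_lr=0.1)
```

The clipped gradient bounds each step's magnitude regardless of how large the raw gradient is, which is the stabilizing property the abstract attributes to the enhanced framework on small, noisy datasets.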
Published in: International Journal of Research and Innovation in Applied Science
Volume 10, Issue 10, pp. 1940-1959