Tokenization Induced Morphological Degradation in Arabic Language Models and a Structured Embedding Alternative: Root-Pattern Tensor Subspaces
20260 citations | Preprint | green Open Access