Standard subword tokenization methods fragment numbers inconsistently, causing large language models (LLMs) to lose positional and decimal structure—a primary driver of errors in arithmetic and scientific reasoning. We introduce Triadic Suffix Tokenization (TST), a deterministic scheme that partitions digits into three-digit triads and annotates each triad with an explicit magnitude marker. Critically, the scheme defines a \emph{fixed, one-to-one mapping} between suffixes and orders of magnitude for the integer part (thousands, millions, billions, etc.) and a parallel system of replicated markers for fractional depth (tenths, thousandths, millionths, etc.). This contrasts with approaches that only group digits (e.g., commas), which leave magnitude to be inferred from position. The scheme adds at most 10,000 fixed tokens to an existing vocabulary, covers 33 orders of magnitude (\(10^{-15}\) to \(10^{18}\)), and preserves exact digits while making order-of-magnitude relationships transparent at the token level. TST is architecture-agnostic and can be integrated as a drop-in preprocessing step. Experimental validation is deferred to future work.
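To make the idea concrete, the following is a minimal Python sketch of triad-plus-marker tokenization. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the token strings, so the marker names (`U`, `K`, `M`, `B`, `T`, `Q`), the `_` separator, the replicated `f` marker for fractional depth, the separate sign token, and the function names `tst_tokenize` and `tst_detokenize` are all hypothetical. The limits of six integer triads and five fractional triads are one reading of the stated \(10^{-15}\) to \(10^{18}\) range.

```python
import re

# Hypothetical integer magnitude markers, one per triad level:
# units, thousands, millions, billions, trillions, quadrillions.
INT_MARKERS = ["U", "K", "M", "B", "T", "Q"]
# Hypothetical fractional marker, replicated once per triad of depth.
FRAC_MARK = "f"
MAX_FRAC_TRIADS = 5  # assumed limit: fractional digits down to 10^-15

_NUMBER = re.compile(r"(-?)([0-9]+)(?:\.([0-9]+))?")


def tst_tokenize(number: str) -> list[str]:
    """Split a plain decimal string into triad tokens with magnitude markers."""
    match = _NUMBER.fullmatch(number)
    if match is None:
        raise ValueError(f"not a plain decimal number: {number!r}")
    sign, int_digits, frac_digits = match.group(1), match.group(2), match.group(3) or ""

    # Integer part: group from the right, so the leading triad may be short.
    first = len(int_digits) % 3 or 3
    int_triads = [int_digits[:first]] + [
        int_digits[i:i + 3] for i in range(first, len(int_digits), 3)
    ]
    # Fractional part: group from the left, so the trailing triad may be short.
    frac_triads = [frac_digits[i:i + 3] for i in range(0, len(frac_digits), 3)]

    if len(int_triads) > len(INT_MARKERS) or len(frac_triads) > MAX_FRAC_TRIADS:
        raise ValueError(f"number outside the supported range: {number!r}")

    tokens = ["-"] if sign else []
    for i, triad in enumerate(int_triads):
        level = len(int_triads) - 1 - i  # 0 = units, 1 = thousands, ...
        tokens.append(f"{triad}_{INT_MARKERS[level]}")
    for depth, triad in enumerate(frac_triads, start=1):
        tokens.append(f"{triad}_{FRAC_MARK * depth}")
    return tokens


def tst_detokenize(tokens: list[str]) -> str:
    """Rebuild the original decimal string; digits are never padded or dropped."""
    sign, int_digits, frac_digits = "", "", ""
    for token in tokens:
        if token == "-":
            sign = "-"
            continue
        digits, _, marker = token.partition("_")
        if marker.startswith(FRAC_MARK):
            frac_digits += digits
        else:
            int_digits += digits
    return sign + int_digits + ("." + frac_digits if frac_digits else "")


print(tst_tokenize("1234567.891"))  # ['1_M', '234_K', '567_U', '891_f']
```

In this sketch every token carries its own magnitude, so `234_K` means the same thing regardless of how many tokens precede or follow it, and `tst_detokenize(tst_tokenize(s))` returns `s` unchanged for any supported input.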