Abstract
Background RNA regulation programs gene expression through sequence-encoded mechanisms, including RNA structure formation, protein binding, chemical modification, and RNA–RNA targeting, whose rules span nucleotide-scale motifs and longer-range context. RNA foundation models aim to learn transferable representations from large RNA corpora, but most rely on masked language modeling (MLM). In MLM, the loss is computed on only a small subset of positions, and training uses artificially corrupted inputs that never appear at downstream inference, which introduces a pretraining–downstream discrepancy.
Results We introduce RNAElectra, a single-nucleotide–resolution RNA foundation model pretrained on diverse non-coding RNAs from RNAcentral using ELECTRA-style replaced-token detection (RTD). RTD trains a discriminator with a loss defined over all input positions on realistically corrupted sequences, providing dense supervision that better aligns pretraining with sequence-to-function fine-tuning. RNAElectra combines nucleotide-resolution tokenization with an efficient attention design to capture local regulatory motifs and longer-range dependencies within a single reusable backbone. Using a unified, sequence-only fine-tuning pipeline without task-specific architectures or auxiliary inputs, RNAElectra demonstrates strong cross-task generalization across benchmarks and downstream evaluations spanning RNA structure and function, RNA–protein and RNA–RNA interactions, RNA modification landscapes, and quantitative regulatory readouts such as translation efficiency and mRNA stability. It outperforms widely used RNA foundation model baselines on the majority of evaluated tasks. In addition to predictive performance, RNAElectra supports interpretability by enabling analysis of learned representations and sequence determinants underlying model predictions.
Conclusion RNAElectra establishes RTD pretraining as a practical alternative to MLM for RNA foundation modeling, coupling dense position-wise supervision with an efficient architecture to deliver broadly transferable RNA representations. This framework provides a reusable backbone for RNA regulatory prediction and supports sequence-level RNA engineering and design.
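The contrast between sparse MLM supervision and dense RTD supervision can be illustrated with a minimal sketch. This is not the RNAElectra implementation. The 15% corruption rate, the `N` mask symbol, the function names, and the use of uniform random sampling in place of a trained generator network are all assumptions made for illustration.

```python
import random

NUCLEOTIDES = "ACGU"
MASK = "N"  # hypothetical mask symbol; the abstract does not specify one


def mlm_example(seq, mask_rate=0.15, rng=None):
    """Build an MLM training example (illustrative only).

    Returns the corrupted tokens and the positions that contribute to the
    loss. Only the masked positions are supervised.
    """
    rng = rng or random.Random(0)
    n_mask = max(1, round(len(seq) * mask_rate))  # assumed 15% rate
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    corrupted = list(seq)
    for i in positions:
        corrupted[i] = MASK  # artificial token, absent at inference time
    return corrupted, positions


def rtd_example(seq, replace_rate=0.15, rng=None):
    """Build an RTD training example (illustrative only).

    Returns the corrupted tokens and one binary label per position:
    1 if the token was replaced, 0 if it matches the original.
    """
    rng = rng or random.Random(0)
    n_rep = max(1, round(len(seq) * replace_rate))  # assumed 15% rate
    positions = rng.sample(range(len(seq)), n_rep)
    corrupted = list(seq)
    for i in positions:
        # Assumption: uniform sampling stands in for a generator network.
        # A sampled token may equal the original, in which case the
        # position is labeled 0 ("original"), as in ELECTRA.
        corrupted[i] = rng.choice(NUCLEOTIDES)
    labels = [int(c != o) for c, o in zip(corrupted, seq)]
    return corrupted, labels


if __name__ == "__main__":
    seq = "AUGGCUACGUAGCUAGCUAG"
    _, mlm_positions = mlm_example(seq)
    _, rtd_labels = rtd_example(seq)
    print("MLM supervised fraction:", len(mlm_positions) / len(seq))
    print("RTD supervised fraction:", len(rtd_labels) / len(seq))
```

In this sketch the MLM example supervises only the masked positions, while the RTD example carries a label at every position and its corrupted input still consists solely of real nucleotides.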