Trajectory planning is a central problem in autonomous driving, requiring long-horizon reasoning, strict safety guarantees, and robustness to rare but critical events. Recent learning-based planners increasingly formulate planning as autoregressive sequence generation, analogous to large language models: future motions are discretized into action tokens and predicted by Transformer-based sequence models. Despite promising empirical results, most existing approaches adopt time-domain action representations, in which consecutive actions are highly correlated. Combined with autoregressive decoding, this design induces degenerate generation behavior: the planner tends to locally continue its recent actions, accumulating error rapidly during closed-loop execution, particularly in safety-critical corner cases such as sudden pedestrian emergence. To address this limitation of time-domain autoregressive planning, we propose a unified trajectory planning framework built on three core ideas: (1) explicit action tokenization for long-horizon planning, (2) transformation of the action space from the time domain to the frequency domain, and (3) a hybrid learning paradigm that combines imitation learning with reinforcement learning. By representing future motion with compact frequency-domain action coefficients rather than per-timestep actions, the proposed planner is encouraged to reason about global motion intent before refining local details. This change of action representation fundamentally alters the inductive bias of learning-based autoregressive planning, mitigates exposure bias, and enables earlier and more decisive responses in complex and safety-critical environments. We present the model formulation, learning objectives, and training strategy, and outline a comprehensive experimental protocol.
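To make idea (2) concrete, a minimal sketch of a frequency-domain action representation follows. It uses an orthonormal discrete cosine transform (DCT-II) to map a time-domain action sequence to frequency coefficients, truncates to the leading low-frequency terms, and bins them into discrete tokens. The choice of DCT, the truncation length, the quantizer range, and the vocabulary size are illustrative assumptions, not the paper's exact tokenizer.

```python
import numpy as np


def dct_basis(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix; rows are frequency components."""
    k = np.arange(n)[:, None]   # frequency index
    t = np.arange(n)[None, :]   # timestep index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)    # rescale the DC row for orthonormality
    return basis


def encode(actions: np.ndarray, num_coeffs: int) -> np.ndarray:
    """Keep only the leading (low-frequency) DCT coefficients."""
    return (dct_basis(len(actions)) @ actions)[:num_coeffs]


def decode(coeffs: np.ndarray, horizon: int) -> np.ndarray:
    """Reconstruct a full-horizon action sequence from truncated coefficients."""
    full = np.zeros(horizon)
    full[: len(coeffs)] = coeffs
    return dct_basis(horizon).T @ full  # orthonormal basis: inverse = transpose


def quantize(coeffs: np.ndarray, vmin=-3.0, vmax=3.0, vocab=256) -> np.ndarray:
    """Uniformly bin coefficients into discrete action tokens (assumed scheme)."""
    norm = np.clip((coeffs - vmin) / (vmax - vmin), 0.0, 1.0)
    return np.round(norm * (vocab - 1)).astype(np.int64)


# A smooth 20-step longitudinal-acceleration profile (hypothetical values)
# compresses into a handful of coefficients: the first tokens carry global
# motion intent, later ones only refine local detail.
horizon = 20
time = np.linspace(0.0, 1.0, horizon)
actions = 0.5 * np.sin(np.pi * time)
coeffs = encode(actions, num_coeffs=4)
tokens = quantize(coeffs)
recon = decode(coeffs, horizon)
max_err = float(np.max(np.abs(actions - recon)))
```

Under this representation, autoregressive decoding emits coarse-to-fine tokens rather than per-timestep actions, which is the source of the altered inductive bias discussed above.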