Recent advances in legged robotics have highlighted deep reinforcement learning (DRL)-based controllers for their robust adaptability to diverse, unstructured environments. While position-based DRL controllers achieve high tracking accuracy, they offer limited disturbance rejection, which degrades walking stability; torque-based DRL controllers can mitigate this issue but typically require extensive training time and trial-and-error to converge. To address these challenges, we propose a Real-Time Motion Generator (RTMG). At each time step, RTMG kinematically synthesizes end-effector trajectories from the target translational velocity, angular (yaw) velocity, and step length, then maps them to joint angles via inverse kinematics to produce imitation data. The RL agent uses this imitation data as a torque bias that is gradually annealed during training, enabling fully autonomous behavior. We further combine the RTMG-generated imitation data with a decaying action-priors scheme to ensure both initial stability and motion diversity. The proposed training pipeline, implemented in NVIDIA Isaac Gym with Proximal Policy Optimization (PPO), reliably converges to the target gait pattern. The trained controller is optimized with TensorRT and runs at 50 Hz on a Jetson Nano; relative to a position-based baseline, torso oscillation is reduced by 24.88% in simulation and 21.24% on hardware, demonstrating the effectiveness of the approach.
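The annealed torque bias described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the linear schedule, and the anneal horizon are all assumptions; the paper only states that the imitation-derived torque bias is gradually reduced to zero during training.

```python
def annealed_torque(policy_torque, imitation_torque, step, anneal_steps=10_000):
    """Combine the RL policy's torque output with an imitation torque bias
    that decays over training (hypothetical linear schedule and horizon)."""
    # alpha goes 1 -> 0 over anneal_steps, then stays at 0 (pure policy).
    alpha = max(0.0, 1.0 - step / anneal_steps)
    return [p + alpha * i for p, i in zip(policy_torque, imitation_torque)]
```

Early in training the agent's actions are dominated by the imitation bias, which provides initial stability; once `alpha` reaches zero, the policy acts fully autonomously.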