Decision Transformers (DTs) reformulate reinforcement learning (RL) as a conditional sequence modeling problem and have demonstrated competitive performance in offline RL settings. However, their behavior in card games, specifically partially observable, imperfect-information trick-taking games, remains underexplored. In parallel, general-purpose card-game toolkits have shown the value of unified environments and standardized evaluation protocols for accelerating research in imperfect-information games. Motivated by the goal of a general card-game-playing framework, we present a unified RL pipeline for trick-taking card games using DTs. While classical learning methods have demonstrated strong performance in card games, transformer-based RL remains comparatively underexplored in this domain. This paper studies the applicability of DTs to the core play phase of trick-taking games and evaluates whether a single, reusable pipeline can transfer across multiple games in this class with minimal game-specific engineering. We propose a unified framework integrating offline pretraining, online selective expert iteration, and inference-time legal-action filtering. Our approach demonstrates two key advantages over standard implementations. First, the model internalizes complex game rules (e.g., follow-suit constraints) implicitly from the empirical data distribution, eliminating the need for explicit action masking during training. Second, we introduce a selective expert iteration mechanism with strict acceptance filtering, which prevents distribution collapse and enables safe, monotonic offline-to-online policy refinement. Ultimately, we show that this single, reusable transformer-based pipeline achieves competitive performance across multiple trick-taking domains (Hearts, Whist, and Spades).
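The inference-time legal-action filtering described above can be sketched as a mask applied to the model's action logits just before sampling, while training itself uses no mask. The following Python snippet is a minimal illustration under assumed interfaces; the function name, the flat action indexing, and the example logits are all hypothetical, not the paper's actual implementation.

```python
import numpy as np

def filter_illegal_actions(logits, legal_actions, num_actions):
    """Renormalize DT action logits over the legal-action set at
    inference time; illegal actions receive probability zero.

    logits        -- hypothetical per-action scores from the transformer
    legal_actions -- indices of moves permitted by the game rules
                     (e.g., cards that follow suit in a trick-taking game)
    """
    mask = np.full(num_actions, -np.inf)
    mask[list(legal_actions)] = 0.0
    masked = logits + mask  # illegal entries become -inf
    # Stable softmax over the masked logits.
    exp = np.exp(masked - masked[list(legal_actions)].max())
    return exp / exp.sum()

# Hypothetical example: 4 candidate cards, only indices {0, 2} follow suit.
probs = filter_illegal_actions(np.array([1.0, 3.0, 2.0, 0.5]),
                               legal_actions={0, 2}, num_actions=4)
```

Because the mask is applied only at inference, the model must still learn the follow-suit constraints implicitly during training, which is the first advantage claimed above.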