Introduction
Transformer-based deep learning has shown great potential in medical imaging, but its real-world applicability remains limited by the scarcity of annotated data. This study aims to develop a practical framework for the few-shot deployment of pretrained MRI transformers across diverse brain imaging tasks.

Methods
We employ a Masked Autoencoder (MAE) pretraining strategy on a large-scale, multi-cohort brain MRI dataset comprising over 31 million 2D slices to learn transferable representations. For classification tasks, a frozen MAE encoder with a lightweight linear head (MAE-classify) is used. For segmentation, we propose MAE-FUnet, a hybrid architecture that fuses pretrained MAE embeddings with multi-scale CNN features. Extensive evaluations are conducted on multiple datasets, including NACC, ADNI, OASIS, NFBS, SynthStrip, and MRBrainS18, under controlled few-shot settings.

Results
The proposed framework achieves state-of-the-art performance in MRI sequence classification, reaching an accuracy of 99.24% with only 6,152 trainable parameters. For segmentation, MAE-FUnet consistently outperforms strong baselines, achieving superior Dice and IoU scores on skull-stripping and multi-class anatomical segmentation benchmarks. The model also demonstrates enhanced robustness and stability under data-limited conditions, with lower performance variance than competing methods.

Discussion
These results highlight the effectiveness of pretrained MAE representations for few-shot medical imaging tasks. The proposed framework enables efficient, scalable, and adaptable deployment of transformer-based models in data-constrained clinical environments. The fusion of global transformer embeddings with local CNN features provides a generalizable design paradigm for a wide range of medical imaging applications.
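The MAE-classify setup described in the Methods (a frozen pretrained encoder with a lightweight linear head) can be sketched as below. This is a minimal illustration, not the authors' implementation: the embedding width (768, the standard ViT-Base dimension) and the class count (8) are assumptions chosen so that the head's trainable-parameter count matches the reported 6,152 (768 × 8 weights + 8 biases); the stand-in encoder is hypothetical, whereas the paper pretrains a transformer with MAE.

```python
# Hedged sketch of a frozen-encoder + linear-head classifier (MAE-classify style).
# Assumptions: pooled embedding dim = 768, num_classes = 8, dummy encoder.
import torch
import torch.nn as nn

class MAEClassify(nn.Module):
    def __init__(self, encoder: nn.Module, embed_dim: int = 768, num_classes: int = 8):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # freeze all pretrained weights
        self.head = nn.Linear(embed_dim, num_classes)  # the only trainable part

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(x)  # assumed to return (B, embed_dim) pooled features
        return self.head(feats)

# Hypothetical stand-in for the pretrained MAE encoder, for illustration only.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(224 * 224, 768))
model = MAEClassify(encoder)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 768 * 8 + 8 = 6152, matching the reported count
```

Under these assumed dimensions, only the linear head is updated during few-shot adaptation, which is what keeps the trainable-parameter budget in the low thousands.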