Transient-source detection without difference imaging still struggles to achieve high accuracy under practical space-based survey conditions, where data volumes are enormous, on-orbit transmission bandwidth is limited, and real-time response is required for rapid follow-up observations. To address these issues, this paper proposes a lightweight detection network that integrates multi-scale feature fusion with contextual feature extraction, enabling efficient real-time processing on resource-constrained edge devices. The proposed model is robust to point-spread-function variations across observing conditions and to complex backgrounds, while also improving detection accuracy. For a comprehensive evaluation, lightweight VGG and lightweight ResNet architectures, together with other models commonly used as baselines for transient-source detection, are adopted for comparison. Experimental results show that, with approximately the same number of parameters across models, the proposed network achieves the best accuracy, improving by nearly 1% over the strongest baseline. Building on this design, an ultra-lightweight version with only 7k parameters is developed by incorporating a compact multi-scale module, which improves accuracy by 1% over the same network without the multi-scale structure. Moreover, heterogeneous knowledge distillation with adaptive iterative training further raises the accuracy of the ultra-lightweight model from 93.3% to 94.0%. Finally, the model is deployed and validated on an AI hardware acceleration platform. The results demonstrate that the proposed method substantially improves inference throughput while maintaining high accuracy, providing a practical solution for real-time, low-latency, on-device transient-source detection under large data volumes and limited transmission bandwidth.
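To make the idea of a compact multi-scale module concrete, the following is a minimal PyTorch sketch of one way such a block could be built under a tight parameter budget. The class name, channel count, and kernel sizes are illustrative assumptions, not taken from the paper: parallel depthwise convolutions at several receptive fields (small kernels for PSF cores, larger ones for extended structure) are concatenated and fused by a pointwise convolution.

```python
# Hypothetical sketch of a compact multi-scale block; names and channel
# counts are illustrative, not the paper's actual architecture.
import torch
import torch.nn as nn


class MultiScaleBlock(nn.Module):
    def __init__(self, channels: int = 8):
        super().__init__()
        # Depthwise convs at kernel sizes 1/3/5 capture structure at
        # several scales with very few parameters.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in (1, 3, 5)
        ])
        # Pointwise conv fuses the concatenated scales back to `channels`.
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi = torch.cat([b(x) for b in self.branches], dim=1)
        return self.act(self.fuse(multi) + x)  # residual connection


block = MultiScaleBlock(channels=8)
out = block(torch.randn(1, 8, 32, 32))
print(out.shape)  # torch.Size([1, 8, 32, 32])
print(sum(p.numel() for p in block.parameters()))  # 504 parameters in total
```

At 8 channels this block costs only a few hundred parameters, which is why depthwise-plus-pointwise designs of this kind are a natural fit for a 7k-parameter model.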
Specifically, the proposed models are trained offline on a high-performance GPU and subsequently deployed on the Fudan Microelectronics 7100 AI board to evaluate their real-world inference efficiency on resource-constrained edge devices.
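The distillation stage used in that offline training can be illustrated with a generic temperature-scaled formulation. This is the standard Hinton-style loss, shown here only as a sketch; the paper's heterogeneous variant and its adaptive iterative schedule are not reproduced, and all function names and values below are illustrative.

```python
# Generic temperature-scaled knowledge distillation in NumPy.
# All names and hyperparameter values here are illustrative assumptions.
import numpy as np


def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of soft-target KL divergence and hard-label cross-entropy."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable to the hard term.
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return float(np.mean(alpha * (T ** 2) * kl + (1 - alpha) * hard))


s = np.array([[2.0, 0.5], [0.1, 1.5]])  # student logits (transient vs. not)
t = np.array([[3.0, 0.0], [0.0, 2.0]])  # teacher logits
y = np.array([0, 1])                    # hard labels
print(distillation_loss(s, t, y))
```

The soft-target term transfers the teacher's confidence structure to the small student, which is what allows a 7k-parameter network to recover accuracy it could not reach from hard labels alone.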