Accurate prediction of departure flight taxi-out time is critical for enhancing airport surface efficiency and reducing flight delays. However, existing methods often struggle with data sparsity, inadequate representation of complex spatio-temporal interactions among aircraft, and imbalanced sample distributions. To address these challenges, this paper proposes a multi-module fusion model named DDA-SIM-ATT-CatBoost. The model integrates three core modules: a Dynamic Data Augmentation (DDA) module that expands the training distribution through operationally consistent perturbations to mitigate data imbalance; a Similarity Theory (SIM) module that uses K-Prototypes clustering and Mahalanobis distance to match historical operational patterns; and an Attention Mechanism (ATT) module that dynamically recalibrates feature weights to emphasize critical influencing factors. Together, these modules provide a robust and discriminative input representation for the CatBoost regressor, which excels at handling categorical features and complex nonlinearities. Using real-world departure data from a major hub airport, the proposed model achieves prediction accuracies of 74.57%, 89.12%, and 97.76% within error margins of ±120 s, ±180 s, and ±300 s, respectively, with a Mean Absolute Percentage Error (MAPE) of 10.34%, a Mean Absolute Error (MAE) of 87.55 s, and a Root Mean Square Error (RMSE) of 125.61 s. Ablation studies validate the positive contribution and synergistic effect of each module, while comparative experiments demonstrate that the proposed model significantly outperforms baseline models such as XGBoost and Random Forest. The DDA-SIM-ATT framework provides a generalizable and high-precision solution for taxi-out time prediction, offering reliable decision support for airport surface operations.
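For readers who want to reproduce the reported evaluation measures, the following is a minimal sketch of how MAE, RMSE, MAPE, and accuracy within a tolerance band could be computed. It assumes that "accuracy within ±t s" means the percentage of flights whose absolute prediction error is at most t seconds (bounds inclusive), which the abstract does not state explicitly. The function name `evaluate` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def evaluate(actual, predicted, tolerances=(120, 180, 300)):
    """Return MAE (s), RMSE (s), MAPE (%) and the share (%) of predictions
    whose absolute error is within each tolerance in seconds."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined for zero actual values; taxi-out times are assumed positive.
    mape = 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
    # Assumption: a prediction counts as accurate when |error| <= tolerance.
    within = {t: 100.0 * sum(abs(e) <= t for e in errors) / n for t in tolerances}

    return {"mae": mae, "rmse": rmse, "mape": mape, "within": within}


# Hypothetical taxi-out times in seconds, for illustration only.
actual = [600, 900, 1200, 750]
predicted = [660, 700, 1230, 1100]
metrics = evaluate(actual, predicted)
```

On this toy input the absolute errors are 60, 200, 30, and 350 s, so two of the four predictions fall within ±120 s and three fall within ±300 s.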