<b>Background/Objectives</b>: Analysis of histopathological images is the gold standard for breast cancer diagnosis. However, modern CNN- and ViT-based architectures still struggle to capture discriminative local and global patterns jointly, and attempts to do so tend to make the architectures more complex, increasing the risk of overfitting and optimization problems. <b>Methods</b>: To address these problems, this paper proposes a four-phase hybrid framework that aims to enhance feature fusion and thereby improve the model's robustness and generalization ability. In Phase 1, the BreakHis dataset was split patient-wise in a 70-15-15 ratio to avoid data leakage, while extensive data augmentation, normalization, and a five-fold cross-validation protocol were applied to increase data variability and provide a reliable, unbiased evaluation. Phase 2 entailed training three CNNs (VGG16, ResNet50, and DenseNet121) and four ViTs (DeiT, CaiT, T2T-ViT, and Swin Transformer) independently to establish strict performance baselines. In Phase 3, the CNN-based features were fused and classified with a soft voting mechanism to allow more stable and representative learning. Phase 4 presents the proposed framework, which combines the two best-performing CNN and ViT models. Features were refined using Global Average Pooling and feature scaling, while a self-attention mechanism enabled accurate cross-modal feature fusion. The generalization capability of the fused representation was further enhanced by subsequent dense layers followed by dropout. <b>Results</b>: XGBoost exhibited the highest performance among the evaluated ML classifiers, achieving 98.7% accuracy and a 98.7% F1-score on BreakHis and 95.8% accuracy on the external BACH dataset, with Grad-CAM- and Grad-CAM++-based visualizations supporting interpretability.
<b>Conclusions</b>: By integrating CNNs and ViTs through self-attention, the proposed framework offers a robust and interpretable solution for automated breast cancer diagnosis.
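The patient-wise 70-15-15 split described in Phase 1 can be illustrated with a minimal sketch. This is not the authors' code: the function name `patient_wise_split`, the `patient_ids` input, and the choice to apply the ratios to the number of patients (rather than the number of images) are assumptions made for illustration. The point it shows is that every image from a given patient lands in exactly one subset, which is what prevents leakage between training and evaluation data.

```python
import random
from collections import defaultdict


def patient_wise_split(patient_ids, ratios=(0.70, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test so that no patient
    appears in more than one subset (hypothetical helper).

    patient_ids: one patient identifier per image, in dataset order.
    ratios: fractions of *patients* (an assumption) for each subset.
    """
    # Group image indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients, not images, with a fixed seed for reproducibility.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_train = round(ratios[0] * len(patients))
    n_val = round(ratios[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }

    # Expand each group of patients back into image indices.
    return {
        name: sorted(i for pid in pids for i in by_patient[pid])
        for name, pids in groups.items()
    }


if __name__ == "__main__":
    # Toy data: 200 images from 20 hypothetical patients.
    ids = [f"P{i % 20:02d}" for i in range(200)]
    split = patient_wise_split(ids)
    print({name: len(indices) for name, indices in split.items()})
```

Because patients contribute different numbers of images in practice, the resulting image counts only approximate 70-15-15; a stratified variant would additionally balance benign and malignant cases across the subsets.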