High-dimensional tabular data are common in biomedical and clinical research, yet conventional machine learning methods often struggle in such settings due to data scarcity, feature redundancy, and limited generalization. In this study, we systematically evaluate Synolitic Graph Neural Networks (SGNNs), a framework that transforms each high-dimensional sample into a sample-specific graph by training ensembles of low-dimensional pairwise classifiers, and then analyzes the resulting graph structure with Graph Neural Networks. We benchmark convolution-based (GCN) and attention-based (GATv2) models across 15 UCI datasets under two training regimes: a foundation setting that concatenates all datasets and a dataset-specific setting with macro-averaged evaluation. We further assess cross-dataset transfer, robustness to limited training data, feature redundancy, and computational efficiency, and extend the analysis to a real-world ovarian cancer proteomics dataset. The results show that topology-aware node feature augmentation provides the dominant performance gains across all regimes. In the foundation setting, GATv2 achieves an ROC-AUC of up to 92.22 (GCN: 91.22), substantially outperforming XGBoost (86.05) at a significance level of α = 0.001. In the dataset-specific regime, GATv2 combined with minimum-connectivity filtering achieves a macro ROC-AUC of 83.12, compared to 80.28 for XGBoost. Leave-one-dataset-out evaluation confirms cross-domain transfer, with an ROC-AUC of up to 81.99. SGNNs maintain an ROC-AUC of around 85 with as little as 10% of the training data and consistently outperform XGBoost in more extreme low-data regimes (α = 0.001). On ovarian cancer proteomics data, foundation training improves both predictive performance and stability. Efficiency analysis shows that graph filtering substantially reduces training time, inference latency, and memory usage without compromising accuracy.
Overall, these findings suggest that SGNNs provide a robust and scalable approach for learning from high-dimensional, heterogeneous tabular data, particularly in biomedical settings with limited sample sizes.
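To make the sample-to-graph transformation concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in the study. Nodes are features, and each edge carries the class-1 score that a classifier trained on that feature pair assigns to the sample. The nearest-centroid pairwise classifier, the function names `fit_pairwise_centroids` and `sample_graph`, and the simple weight `threshold` are all hypothetical stand-ins; in particular, the threshold is not the minimum-connectivity filtering mentioned above.

```python
import math
from itertools import combinations


def fit_pairwise_centroids(X, y):
    """For every feature pair (i, j), store both class centroids in that 2-D subspace.

    Assumption: a nearest-centroid rule stands in for the pairwise classifier.
    """
    models = {}
    n_features = len(X[0])
    for i, j in combinations(range(n_features), 2):
        centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            centroids[label] = (
                sum(r[i] for r in rows) / len(rows),
                sum(r[j] for r in rows) / len(rows),
            )
        models[(i, j)] = centroids
    return models


def sample_graph(x, models, threshold=0.0):
    """Build one sample-specific graph as {(i, j): weight}.

    The weight is a logistic score in (0, 1): values above 0.5 mean the sample
    lies closer to the class-1 centroid for that feature pair.
    Edges with weight below `threshold` are dropped (hypothetical filter).
    """
    edges = {}
    for (i, j), centroids in models.items():
        point = (x[i], x[j])
        d0 = math.dist(point, centroids[0])
        d1 = math.dist(point, centroids[1])
        weight = 1.0 / (1.0 + math.exp(d1 - d0))
        if weight >= threshold:
            edges[(i, j)] = weight
    return edges


# Toy data: 4 samples, 3 features, binary labels (illustrative only).
X_train = [[0.0, 0.0, 0.0], [0.1, 0.2, 0.1], [1.0, 1.0, 1.0], [0.9, 1.1, 1.0]]
y_train = [0, 0, 1, 1]

models = fit_pairwise_centroids(X_train, y_train)
graph_pos = sample_graph([1.0, 1.0, 1.0], models)
graph_neg = sample_graph([0.0, 0.0, 0.0], models)
```

In this sketch, each sample yields a weighted graph over the same node set, so a downstream graph classifier (for example a GCN or GATv2 model) would receive one graph per sample instead of one feature vector per sample.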