Next-generation wireless systems are expected to be artificial intelligence (AI)-native, in that they will embed machine learning (ML) and AI techniques from the application layer down to the physical layer. However, training and deploying ML models in wireless networks presents two key challenges: the limited computing and communication resources of wireless devices and systems, and the scarce and private nature of wireless data. First, ML models at the application layer, e.g., on-device AI, often require private data from distributed devices. One can resort to distributed ML algorithms such as federated learning (FL), which communicate only ML model parameters over wireless networks without sharing raw data. However, devices and communication networks have limited resources, in terms of computing, energy, bandwidth, and memory, to support complex distributed ML algorithms. Second, next-generation wireless networks can potentially leverage a broad range of sensing modalities, such as LiDAR, images, or GPS, to make situation-aware network decisions under dynamic environments. To perform such wireless multi-modal data fusion, it is natural to leverage ML models and frameworks. However, the amount of training data is often scarce in wireless networks. As such, trained ML models often fail to generalize under unseen wireless environments. Moreover, despite providing more information about the current environment, multi-modal data also increases the number of input features that must be processed by ML models. As such, multi-modal ML can suffer from large inference latency, thereby yielding obsolete network decisions in rapidly changing communication environments. The main contribution of this dissertation is, thus, to address these challenges by developing efficient and distributed ML frameworks that can be deployed over resource-constrained wireless networks with private and scarce multi-modal data.
From the perspective of distributed, resource-efficient ML models, this dissertation first investigates energy-efficient distributed ML algorithms that operate over realistic wireless networks through the co-design of computing, communication, and learning algorithms. In particular, a novel energy-efficient FL framework is proposed to reduce the energy cost of training and communication by quantizing neural network weights and activations. In this framework, every device trains a quantized neural network, in which weights and activations are quantized to a limited precision level. The results show that the proposed framework can reduce energy consumption by up to 70% compared to a baseline FL algorithm that does not use quantization, without jeopardizing the convergence rate. Subsequently, to optimize sparse model structures with low computational overhead, a communication-efficient FL framework, SpaFL, is proposed. In SpaFL, a trainable threshold is defined for each filter/neuron to prune all of its connected parameters, thereby leading to structured sparsity. The results showcase that SpaFL improves accuracy while requiring far less communication and computing resources than sparse baselines. This dissertation then investigates a large language model selection framework to optimize cost, latency, and response quality over 5G networks. In particular, a measurement-driven training framework is proposed for an AI-enabled router on a mobile device. The results show that the proposed framework can significantly reduce cost and latency with minimal loss in response quality. From the perspective of efficient ML models, this dissertation then designs efficient multi-modal learning frameworks to improve ML generalization with scarce data and to reduce inference latency in wireless networks.
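To make the quantized FL idea above concrete, the following is a minimal illustrative sketch, not the dissertation's actual algorithm: each device's weights are uniformly quantized to a limited precision level before being averaged FedAvg-style at the server. The function names, bit width, and clipping range are assumptions chosen for illustration.

```python
def quantize(x, bits=4, x_max=1.0):
    """Uniformly quantize a scalar weight/activation to `bits` bits in [-x_max, x_max].

    Fewer bits -> lower energy cost for computation and communication,
    at the price of quantization error (illustrative model only).
    """
    levels = 2 ** bits - 1          # number of quantization steps
    step = 2 * x_max / levels       # width of one step
    clipped = max(-x_max, min(x_max, x))
    # snap to the nearest quantization level
    return round((clipped + x_max) / step) * step - x_max


def fedavg(client_weights):
    """FedAvg aggregation: element-wise average of client weight vectors."""
    n = len(client_weights)
    return [sum(w) / n for w in zip(*client_weights)]


# Each device quantizes its local model before uploading it.
local_models = [[0.52, -0.9], [0.31, 0.7]]
uploads = [[quantize(w, bits=4) for w in model] for model in local_models]
global_model = fedavg(uploads)
```

In a real deployment, the precision level would be co-optimized with the wireless resource allocation rather than fixed, which is precisely the co-design the framework above targets.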
In particular, a novel and data-efficient two-phase learning framework is proposed to improve generalization in unseen and unfamiliar wireless environments with a minimal amount of multi-modal data. In the first stage, a physics-based loss function is employed to enable each base station (BS) to learn the physics underlying its wireless environment as captured by multi-modal data. In the second stage, collaborative domain adaptation is proposed to leverage the wireless environment knowledge of multiple BSs to guide under-performing BSs under domain shift. The results showcase that the proposed frameworks require a significantly smaller amount of data and computing resources to achieve convergence with better generalization. Next, a novel continual learning (CL) framework is proposed to achieve robust generalization to dynamic environments while retaining past knowledge. To this end, an agent estimates the distribution of risks over environmental changes so as to obtain predictors that are robust to unseen changes. The results show that the proposed algorithm outperforms traditional CL baselines across all environments while significantly improving the generalization performance on unseen target environments. Lastly, a fast multi-modal transformer inference framework is designed to practically support wireless communication tasks by processing only important tokens. To validate the feasibility of the proposed framework for real-world deployments, one of the first multi-modal handover datasets is developed using a real-world testbed. The results show that the proposed framework can reduce inference latency by 86% compared to baselines with negligible performance loss. Overall, this dissertation develops a suite of efficient distributed and multi-modal ML frameworks that can be deployed in practical, real-world, and resource-constrained wireless networks.
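The "processing only important tokens" idea behind the fast inference framework can be sketched as follows. This is a hypothetical illustration, not the dissertation's method: given per-token importance scores (which in practice might come from attention statistics or a lightweight scorer), only the top-scoring fraction of multi-modal tokens is kept for the transformer to process, shrinking the input and hence the inference latency.

```python
def prune_tokens(tokens, scores, keep_ratio=0.25):
    """Keep only the highest-scoring fraction of input tokens.

    tokens     : list of token embeddings/IDs from the multi-modal input
    scores     : per-token importance scores (assumed given; e.g., from
                 an attention-based or learned importance estimator)
    keep_ratio : fraction of tokens to retain (latency/accuracy knob)
    """
    k = max(1, int(len(tokens) * keep_ratio))
    # rank token indices by importance, highest first
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])  # restore original token order
    return [tokens[i] for i in kept]


# e.g., tokens from LiDAR, image, and GPS modalities with mock scores
pruned = prune_tokens(["t0", "t1", "t2", "t3"], [0.1, 0.9, 0.3, 0.8],
                      keep_ratio=0.5)
```

Since transformer self-attention cost grows quadratically with sequence length, even a moderate `keep_ratio` can cut latency substantially, which is consistent with the large latency reductions reported above.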