Correction: LSML-SF: a lightweight stacked ML approach for spreading factor allocation in mobile IoT LoRaWAN networks

2026 · 0 citations · Journal Article · Gold Open Access

Authors

Arshad Farhad · Bahria University

Muhammad Ali Lodhi · Yangzhou University

Farhan Nisar · Numerical Method (China)

Hassan Jalil Hadi · Prince Sultan University

Naveed Ahmad · Prince Sultan University

Mohamad Ladan · Prince Sultan University

Abstract

The vision of a seamlessly connected world is rapidly materializing through the Internet of Things (IoT), which has been integrated into countless consumer domains. From smart agriculture and logistics to personalized healthcare and smart city infrastructure, IoT devices are revolutionizing how consumers interact with their environment (Almuhaya et al., 2022). A critical enabler of this revolution is low-power wide-area network (LPWAN) technology, which provides the essential connectivity for billions of devices. Among various LPWAN solutions, Long-Range Wide Area Network (LoRaWAN) has gained significant attention because of its compelling trade-off between communication range, power consumption, and device cost (Farhad et al., 2020; Butt et al., 2025). Consequently, LoRaWAN is exceptionally well suited to consumer electronics applications.

Figure 1 shows the basic architecture of a LoRaWAN network. End devices (EDs) transmit uplink data to a gateway (GW) using LoRa modulation, with spreading factors (SF) typically ranging from SF7 to SF12. The choice of SF directly affects communication behavior: lower SF values, such as SF7, provide higher data rates over shorter distances, whereas higher SF values, such as SF12, increase communication range at the cost of longer airtime and reduced throughput. LoRaWAN operates in a star topology, where EDs access the medium using an ALOHA-based uplink transmission scheme. After each uplink transmission, two receive windows are opened to enable downlink communication from the network server (NS) via the GW. The first window (RX1) opens 1 s after the uplink using the same channel and SF, while the second window (RX2) opens after 2 s on a predefined channel with SF12. These receive windows allow the NS to deliver acknowledgments or control messages when required. The GW connects to the NS through LTE or Ethernet backhaul links, and the NS subsequently relays processed data to the application server for advanced processing.
This structure supports robust, low-power connectivity for extensive IoT deployments, as detailed in recent surveys on LoRaWAN scalability (Jouhari et al., 2023), machine learning enhancements (Farhad and Pyun, 2023b), security vulnerabilities (Hessel et al., 2022), and adaptive data rate (ADR) optimizations (Lehong et al., 2024).

Within the LoRaWAN architecture (as illustrated in Figure 1), the NS implements the ADR mechanism, which dynamically adjusts the SF and transmission power (TP) based on the signal-to-noise ratio (SNR) history of the last 20 packets received from each ED. This adjustment aims to optimize network performance by selecting an SF and TP configuration that balances range, data rate, and energy consumption. However, the dynamic nature of real-world consumer environments, characterized by device mobility, signal fading, and interference, poses a significant challenge to traditional ADR. Its slow reaction time and inability to model complex channel dynamics often lead to suboptimal SF assignments, resulting in packet loss, increased network congestion, and accelerated battery drain (Rehman et al., 2025; Ullah et al., 2025). This directly degrades the performance and user experience of consumer IoT products.

Machine learning (ML) offers a powerful paradigm to overcome these limitations by learning complex patterns from data for optimal SF allocation. Recent work has explored models such as deep neural networks (DNNs) (Farhad and Pyun, 2023a) and gradient boosting (Minhaj et al., 2023). However, a significant gap remains in developing a solution that is not only highly accurate but also demonstrably feasible for deployment on the computationally constrained microcontrollers that are ubiquitous in consumer EDs. Many high-accuracy models are too complex for practical implementation, while simpler models may lack the required predictive performance.
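
As a point of reference for the ADR behavior described above, the following is a minimal Python sketch of one network-server ADR decision in the style of the widely used Semtech-recommended algorithm. It is not taken from this article. The use of the maximum SNR over the last 20 uplinks, the 10 dB device margin, the 3 dB step size, the per-SF demodulation floors, and the 2 to 14 dBm power range are assumptions typical of EU868 deployments, and the function name `adr_step` is hypothetical.

```python
import math

# Approximate demodulation SNR floors (dB) per spreading factor at 125 kHz.
# Assumed values from commonly cited LoRa figures, not from the article.
REQUIRED_SNR_DB = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}

HISTORY_LEN = 20          # ADR looks at the last 20 uplinks (stated in the text)
DEVICE_MARGIN_DB = 10.0   # assumed installation margin
STEP_DB = 3.0             # assumed SNR headroom consumed per adjustment step
TP_STEP_DB = 3            # assumed transmit-power step
TP_MIN_DBM, TP_MAX_DBM = 2, 14  # assumed EU868-style power range


def adr_step(snr_history, sf, tp_dbm):
    """Return an updated (sf, tp_dbm) pair from recent uplink SNR samples."""
    recent = snr_history[-HISTORY_LEN:]
    if len(recent) < HISTORY_LEN:
        return sf, tp_dbm  # not enough history yet: leave settings unchanged

    margin = max(recent) - REQUIRED_SNR_DB[sf] - DEVICE_MARGIN_DB
    steps = math.floor(margin / STEP_DB)

    # Positive headroom: first lower the SF (faster data rate), then the power.
    while steps > 0 and sf > 7:
        sf -= 1
        steps -= 1
    while steps > 0 and tp_dbm > TP_MIN_DBM:
        tp_dbm = max(TP_MIN_DBM, tp_dbm - TP_STEP_DB)
        steps -= 1

    # Negative headroom: raise the power back towards the maximum.
    while steps < 0 and tp_dbm < TP_MAX_DBM:
        tp_dbm = min(TP_MAX_DBM, tp_dbm + TP_STEP_DB)
        steps += 1

    return sf, tp_dbm
```

Because this kind of rule waits for a full 20-packet history before acting, a mobile ED can pass through several link conditions between adjustments, which is one way to read the slow reaction time noted above.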
This study addresses this gap by introducing a high-performance, yet deployable, stacked ensemble ML framework for SF classification. Our approach does not rely on a single model but leverages the complementary strengths of multiple learners to achieve superior and robust accuracy. Specifically, we present the lightweight stacked-ML approach for SF (LSML-SF) allocation in mobile IoT LoRaWAN networks. The proposed LSML-SF combines a linear SGD classifier (functioning as a support vector machine), a gradient boosting (XGBoost) model, and a DNN through a logistic regression meta-learner.

The key contributions of the proposed LSML-SF are as follows:

1. We propose and implement a sophisticated stacked generalization pipeline that combines three diverse base learners: a linear SGD classifier, an XGBoost model, and a DNN. A logistic regression meta-learner is trained to optimally blend the predictions from these base models. This architecture is specifically designed to capture the complex, nonlinear relationships between device state, channel conditions, and the optimal SF, achieving an overall classification accuracy of approximately 88% across all six SF classes.

2. We develop a robust feature engineering strategy that expands a set of 5 base features (e.g., device location, distance, received power, SNR) into a rich set of 29 features. This includes rolling statistics (mean, std, min, max) to capture temporal dynamics, domain-informed interaction terms (e.g., distance × SNR), and nonlinear transformations (e.g., log(1 + Distance)). This process provides the model with a highly informative input representation that is critical for achieving high accuracy.

3. We validate the effectiveness of our LSML-SF framework by integrating the pre-trained model into the ns-3 network simulator.
Performance evaluation under mobile scenarios shows that our LSML-SF approach consistently and significantly outperforms traditional ADR mechanisms as well as other ML-based benchmarks, improving packet success ratio (PSR) and reducing overall network energy consumption.

The rest of the study is organized as follows: Section 2 reviews existing ML approaches for LoRaWAN parameter optimization and identifies key research gaps. Section 3 introduces the proposed LSML-SF framework, covering dataset generation, feature engineering, model architecture, and training strategy. Section 4 evaluates the offline predictive performance of the stacked ensemble, including confusion matrix analysis, convergence, and deployment feasibility. Section 5 reports online ns-3 simulation results, highlighting improvements in packet success ratio, energy consumption, and packet loss ratios under mobility. Section 6 outlines current study limitations and future directions, while Section 7 concludes with key findings and implications for real-world IoT deployment.

This section surveys recent ML approaches for optimizing communication parameters in LoRa and LoRaWAN networks, including SF, TP, bandwidth (BW), and coding rate (CR). The primary goal of these methods is to enhance overall network efficiency and performance. A comparative summary of these approaches is presented in Table 1.

In Azizi et al. (2022), the authors investigated dynamic SF allocation in LoRaWAN using a reinforcement learning approach referred to as MIX-MAB. Their implementation relied on the LoRa-MAB Python simulator and focused on a single-gateway deployment with 100 static end devices uniformly distributed within a 4.5 km radius. The study followed the EU-868 MHz duty-cycle restriction of 1% and assumed a traffic profile of 15 uplink packets per hour with payload sizes of 50 B. To simplify the analysis, the evaluation was conducted under idealized conditions without ACK collisions.
Within this controlled setting, the proposed approach achieved higher packet delivery ratios and improved energy efficiency when compared with existing RL-based baselines.

Building on reinforcement learning for parameter adaptation, the work in Chen et al. (2023) introduced the score table-based evaluation and parameter surfing (STEP) algorithm. The evaluation was carried out using the MULANE simulator in MATLAB, where STEP was benchmarked against standard ADR, Blind ADR (BADR), and LoRa-MAB. STEP reduced energy consumption by 24% to 27%, highlighting the potential of table-driven learning strategies for SF optimization.

A different application of ML in LoRaWAN was explored in Bertocco et al. (2023), where the authors targeted underground monitoring scenarios. Their study used a laboratory-generated dataset collected from a sand-filled environment with varying soil moisture levels. Received Signal Strength Indicator (RSSI) measurements and moisture sensor readings were employed to train and evaluate several ML-based estimation strategies. Specifically, the authors compared sensor calibration using ML, virtual sensing based solely on RSSI, and a hybrid approach combining physical sensor data with RSSI information. Among these, the hybrid method achieved the lowest estimation error, with an RMSE of 1.53%, outperforming both the sensor-only and RSSI-only approaches.

A gated recurrent unit (GRU)-based solution for efficient resource management, specifically SF allocation, was proposed in Farhad et al. (2022b) to improve the packet success ratio (PSR) of LoRaWAN networks. The dataset for this study was generated using the ns-3 simulator and included 500 static EDs. It comprised four key features: X-coordinate, Y-coordinate, signal-to-noise ratio (SNR), and received power. The GRU model architecture consisted of two layers and one fully connected layer, followed by a softmax activation function for classification.
This model achieved a classification accuracy of 96%. The weights and biases from the best-performing model were saved and later integrated into the ns-3 simulator for dynamic SF allocation during network simulations. As a result, the proposed GRU method achieved a high PSR of 98% for a network of 100 static EDs and 73% for a larger network of 600 EDs.

Expanding on this work, the authors in Farhad and Pyun (2023a) employed a DNN model tailored for both static and mobile LoRaWAN networks, utilizing the same dataset as in Farhad et al. (2022b). The dataset was partitioned into groups based on successful ACK reception. Each group was labeled with its single most efficient SF, and this processed data was used to train the DNN. Their model comprised five fully connected layers with varying numbers of neurons and a final softmax layer for SF classification, achieving an accuracy of 82%. When this pre-trained model was deployed within the ns-3 simulator for live network simulations, it demonstrated superior performance in packet delivery ratio (PDR), energy consumption, and convergence period compared to traditional methods such as ADR and other ML approaches such as SVM. Consequently, the DNN-based approach outperformed GRU, LSTM, and SVM models.

In Prakash (2025), the authors addressed SF prediction in large mobile LoRaWAN-based IoT networks through effective feature selection. Using a publicly available dataset with over 930,000 data points, they evaluated k-nearest neighbors (k-NN), decision tree classifier (DTC), random forest (RF), and multinomial logistic regression (MLR) across 31 feature combinations drawn from key parameters such as RSSI, SNR, frequency, distance, and antenna height. The RSSI and SNR combination emerged as optimal, achieving high accuracy and F1 scores. This work highlights reduced dataset collection costs and extended battery life for LoRaWAN devices.

The authors in González-Palacio et al.
(2023) proposed ML-based models for combined path loss and shadowing in LoRaWAN to enhance energy efficiency. Incorporating environmental variables such as temperature, relative humidity, barometric pressure, particulate matter, and SNR, they fitted models using multiple linear regression (MLR), support vector regression (SVR), random forests (RF), and artificial neural networks (ANN). Achieving RMSE up to 1.566 dB and R² up to 0.94, their approach improved the ADR algorithm, reducing link margin and saving up to 43% energy compared to traditional ADR.

For hybrid techniques, Hazarika and Choudhury (2024) introduced the intelligent spreading factor allocation (iSFA) approach for mobile and static EDs in LoRa-based networks. Combining K-means clustering at EDs (based on features like unique ED ID, SF, SNR, RSSI, energy ratio, and packet success) with RL at GWs (optimizing DR, TP, and latency), iSFA reduced packet loss, convergence time, and energy consumption while improving throughput and PSR in simulations.

While the existing literature demonstrates significant progress in optimizing LoRaWAN parameters through ML, three critical research gaps remain unaddressed: (1) Interpretability: Prior works (Farhad et al., 2022b; Farhad and Pyun, 2023a; Azizi et al., 2022) focus predominantly on performance metrics without providing for the which and deployment in critical et al., et al., dynamic or classification and resource as and methods et al., achieve high accuracy but often at high costs for EDs. this work directly addresses and trade-off through a stacked learning framework and lightweight (1) model remains an challenge for ensemble and deep ADR solutions, including the proposed this study on improving and energy efficiency under mobility, while as an for future research in LoRaWAN resource section the proposed LSML-SF framework for SF classification in LoRaWAN networks. The framework is designed to predictive accuracy with it suitable for deployment in EDs. 
As illustrated in Figure the overall pipeline of dataset feature engineering, ensemble model and online deployment within the ns-3 dataset used for training and was generated using a LoRaWAN network simulator 1 et al., et al., et al., This choice with LoRaWAN and A of the dataset is in Table simulation of 500 static EDs uniformly distributed within a of 5 with a single gateway (GW) at the each ED location, six packet transmission are using spreading factors ranging from SF7 to SF12. Each to a single transmission and includes the ED packet group distance to the GW received signal power signal-to-noise ratio (SNR), and the SF used for The for learning is the optimal spreading as SF a ED with transmission SF is as the lowest SF that results in a successful This the of communication while airtime and energy The is in improve predictive performance a of 29 features are base feature set is using temporal statistics and nonlinear each base feature rolling statistics with window 5 are on a to capture temporal rolling rolling SF such that (1) rolling SF12, SF are as to using a To weights are with on higher due to their on network and energy The is is the of is the of in is the of and is a factor rolling domain-informed interaction terms and nonlinear transformations are included to channel × SNR × SNR log(1 + log(1 + these features enable the model to capture nonlinear relationships between device location, channel conditions, and the optimal spreading The proposed LSML-SF framework a stacked generalization strategy in which multiple base learners are trained in and their are combined by a This is to complementary biases while reducing the of with on a single Each base a vector over the six SF and these are later using predictions to A summary of the base their key and their within the stacked ensemble is in Table 3. 
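
The feature expansion described above (rolling statistics with window 5, interaction terms, and log transforms) can be sketched in plain Python as follows. This is an illustrative reconstruction under assumptions, not the authors' code: the column names are hypothetical, the rolling window is taken as a trailing window over one device's consecutive samples, population standard deviation is used so that single-sample windows are defined, and only distance × SNR and log(1 + distance) are named in the text, so the second interaction and second log term are guesses.

```python
import math
from statistics import mean, pstdev

WINDOW = 5  # rolling window length given in the text

# Hypothetical names for the five base features.
BASE = ("x", "y", "distance", "rx_power", "snr")


def rolling_stats(values, window=WINDOW):
    """Trailing-window mean/std/min/max for each position in `values`."""
    out = []
    for i in range(len(values)):
        w = values[max(0, i - window + 1): i + 1]
        out.append({"mean": mean(w), "std": pstdev(w), "min": min(w), "max": max(w)})
    return out


def build_features(rows):
    """Expand one device's list of base-feature dicts into engineered rows."""
    stats = {name: rolling_stats([r[name] for r in rows]) for name in BASE}
    features = []
    for i, r in enumerate(rows):
        f = {name: r[name] for name in BASE}
        for name in BASE:
            for stat, value in stats[name][i].items():
                f[f"{name}_roll_{stat}"] = value
        # Interaction terms and nonlinear transforms. Only the first and third
        # lines below are named in the text; the other two are assumptions.
        f["distance_x_snr"] = r["distance"] * r["snr"]
        f["rx_power_x_snr"] = r["rx_power"] * r["snr"]
        f["log1p_distance"] = math.log1p(r["distance"])
        f["log1p_abs_rx_power"] = math.log1p(abs(r["rx_power"]))
        features.append(f)
    return features
```

With this layout each row carries 5 base values, 20 rolling statistics, and 4 derived terms, which matches the stated total of 29 features, although the exact composition used in the study may differ.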
As illustrated in Figure the three base learners are trained in within a and their are combined by a logistic regression The first base is a linear classifier trained 1 to an multinomial logistic regression model a SVM. The training pipeline includes to and feature to feature to This model provides a and linear that in the feature and to the second base is a decision tree model using XGBoost and The model is with 600 and a tree of tree is employed to improve training while and are to enhance through and under and channel This is for nonlinear and base is a deep neural network designed with deployment in The network architecture an input layer followed by two fully connected layers with and Each layer activation and a rate of to A final softmax layer over the six SF accurate classification of higher which have a on airtime and energy consumption, the DNN is trained using a loss et al., This where the to the for 2 is the and is the in 2. optimization is using the with to convergence and is employed to predictions from each base This that the meta-learner is trained on predictions from As in Figure predictions from all base models are to the matrix R A multinomial logistic regression model is trained on to the optimal combination of base all base models are using the dataset to the final 4 the of the trained LSML-SF model within the ns-3 simulation When an ED a packet transmission, including distance to the and SNR, are The features in Section are and as input to the pre-trained on this the model the optimal spreading factor SF which is subsequently to the The resulting packets are received at the GW and to the network server (NS) for standard LoRaWAN processing. This adaptive SF during simulation and offiine model training with online network gradient with This This section evaluates the offiine performance of the proposed stacked ensemble for SF classification. 
We through analysis, DNN training dynamics across and the and to deployment on constrained IoT devices.The performance of the final model, using is by the confusion matrix in Figure the model across SF with high accuracy for on Figure Table 4 reports the across all the model per and confusion between these lower remains 1% of the This is lower support higher data rates and airtime when link conditions A is up to whereas and Specifically, of the are as and of the are as This is with the that and the most robust and are typically under link practical and is selecting an high SF for a link that at the to the robust for at the cost of increased both training and and the accuracy remain without noticeable The of relative to training and that the DNN well under the and optimization typically within with the The model in Figure the same the of training when the dataset is deployment on constrained we and The DNN base is the most of the ensemble and is in Table It parameters and per This is with that of per second under 6 and across training and offiine the feature matrix for and 29 features approximately in the to the and to the by the XGBoost while the DNN and the linear models remain 5 deployment time, only the trained of the ensemble are and the large feature used during offiine training are not in Consequently, is by the model parameters and required for feature this study reports model sizes is for each packet transmission as of SF selection. However, that an increase airtime and energy consumption by an of the of approximately per decision is relative to the energy achieved through reduced and shorter evaluate the online network of LSML-SF using ns-3 that the trained The on packet success ratio (PSR) in and energy consumption per transmission, which and energy efficiency under and network We EDs in under a deployment with a 5 km radius. 
To mobility, a model is used (Farhad et al., Each ED six uplink packets per hour over a simulation and results are over et al., et al., The set Ratio (PSR) the of end devices under for LSML-SF and ADR strategies. of simulation parameters is in Table The simulation including network packet rate, and transmission were to in LoRaWAN performance and IoT deployment scenarios et al., et al., 2025). These are with and to mobile LoRaWAN under conditions et al., et al., et al., metrics are used to network behavior: (1) packet success ratio (PSR) in and energy consumption per transmission, energy efficiency per packet under the SF is evaluated as the of EDs from to As in Figure LSML-SF higher PSR ADR, and (Farhad and Pyun, LSML-SF PSR to and to at 600 as and ADR and at high its adjustment strategy and on link (Farhad et al., The stacked ensemble combines complementary decision patterns from classification, gradient and the resulting in SF that link and the range, LSML-SF provides a PSR over and at to high network (Farhad and Pyun, shows the energy consumption per transmission as the network from to 600 devices. LSML-SF the lowest energy consumption across the range, with SF that airtime and under mobility. lower the energy of and remain efficiency when is As the and LSML-SF a lower energy profile as As was not these are as across simulation ADR and SVM higher energy consumption at to high due to packet and et al., et al., packet delivery packet loss ratios are into four packet due to that the at the GW. Figure reports PSR and the over for and 600 PSR remains to across the and remain the higher traffic all with and most and PSR This highlights the combined of and GW resource as the network under the proposed LSML-SF framework demonstrates significant improvements in classification accuracy and network this study is to several limitations that provide a for future the network used for evaluation was to a This while for does not capture the of networks. 
such environments, critical factors including GW and the performance of the SF allocation the proposed LSML-SF framework was for A for EDs to the ML model on the work the pre-trained model into the NS and the existing LoRaWAN control framework to downlink optimal SF to the the from the the primary contributions of this study in improved and energy while is as an for future research in LoRaWAN resource allocation. The evaluation presented in this study is based on ns-3 simulations, while in LoRaWAN fully capture real-world factors such as and environmental of the proposed framework using real-world and as an for future study presented a lightweight for SF allocation in The approach combines a linear classifier, gradient and a DNN through trained on a dataset with 29 features. the model achieved approximately when integrated into it consistently improved packet success ratio (PSR) and reduced energy per transmission across mobile end devices compared to ADR, and ML via packet loss ratios and how the method interference, and under and the of the for constrained The DNN base parameters and per while the pipeline in ns-3 by online input from the same feature used These LSML-SF as a practical path SF control in dynamic environments. work evaluation to with and duty-cycle the to recent ADR and reinforcement learning methods with provide on feature groups and model and and for the ensemble under or to the network server while these LSML-SF from a approach to a solution for mobile IoT networks.
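
The stacking step summarized above can be illustrated with a minimal sketch of the inference path: each base learner outputs a probability vector over the six SF classes, the three vectors are concatenated, and a multinomial logistic regression maps them to the final class. The weights below are placeholders chosen for illustration, not trained values from the study, and all function names are hypothetical.

```python
import math

SF_CLASSES = (7, 8, 9, 10, 11, 12)
N_CLASSES = len(SF_CLASSES)
N_BASE = 3  # linear classifier, gradient boosting model, DNN


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def placeholder_meta_weights():
    """Hypothetical weights: each class logit sums that class's base probabilities."""
    weights = [[0.0] * (N_BASE * N_CLASSES) for _ in range(N_CLASSES)]
    for k in range(N_CLASSES):
        for b in range(N_BASE):
            weights[k][b * N_CLASSES + k] = 1.0
    return weights, [0.0] * N_CLASSES


def stacked_predict(base_probs, weights, bias):
    """Combine per-model probability vectors; return (sf, probabilities)."""
    meta_input = [p for probs in base_probs for p in probs]  # 3 x 6 = 18 values
    logits = [sum(w * x for w, x in zip(row, meta_input)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    best = max(range(N_CLASSES), key=probs.__getitem__)
    return SF_CLASSES[best], probs
```

In a trained system the weights and bias would come from fitting the meta-learner on held-out base-model predictions; here they only demonstrate how the three probability vectors feed one final decision.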

Topics & Keywords

IoT Networks and Protocols · Advanced Wireless Communication Technologies · Molecular Communication and Nanonetworks

Publication Details

Published in: Frontiers in Artificial Intelligence

Volume 9

DOI: 10.3389/frai.2026.1819135

Field-Weighted Citation Impact: 0.00
