Overfitting Dynamics in Recurrent Neural Networks: A Statistical and Experimental Approach

2026 · 0 citations · Journal Article · Diamond Open Access

Authors

Orgeta Gjermëni · University of Vlora "Ismail Qemali"

Abstract

Overfitting remains a key challenge in applying Recurrent Neural Networks (RNNs) to sequential forecasting tasks, including carbon emissions modeling. While Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures are widely used, comparative analyses of their overfitting behavior across architectural variants remain limited. The present analysis addresses this gap by examining how structural variables such as model architecture, layer depth, architecture type, model family, and the number of hidden units, together with their two-way interactions with unit configuration, influence overfitting behavior in RNNs applied to a univariate time series of carbon dioxide (CO2) emissions from land-use change in Albania. The dataset covers the period from 1850 to 2022. Preprocessing steps included Isolation Forest-based outlier detection, LSTM-based imputation, first differencing, Yeo-Johnson transformation, and Min-Max normalization. Six RNN architectures were evaluated, including single-layer models (GRU and LSTM) and their homogeneous and hybrid two-layer variants. Each architecture was trained across a range of hidden unit values, resulting in 402 model instances under a unified configuration. Overfitting ratios were calculated as the ratio of test to training values for each of four performance metrics: root mean squared error, symmetric mean absolute percentage error, mean absolute scaled error, and normalized mean absolute error. Their distributional properties were also assessed. A non-parametric multivariate analysis of variance was conducted to examine both main effects and two-way interactions, followed by pairwise comparisons within statistically significant structural factors. The results showed that model architecture, layer depth, architecture type, and model family significantly influence overfitting behavior.
Although the number of hidden units did not have a significant main effect, consistent interaction effects suggest that its impact depends on the architectural configuration. Pairwise comparisons revealed that hybrid and homogeneous architectures differed significantly from simple models. No significant difference was found between hybrid and homogeneous architectures, indicating greater similarity between these two. These findings emphasize the importance of aligning architectural design with hyperparameter selection when developing RNN-based forecasting models. Methodologically, the study demonstrates the utility of multivariate non-parametric analysis in characterizing generalization behavior. These insights provide practical guidance for constructing more robust and generalizable RNNs for environmental forecasting and similar applications.

Received: 5 January 2026 / Revised: 20 February 2026 / Accepted: 3 March 2026 / Published: 25 March 2026
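The overfitting ratio described in the abstract (test value divided by training value, computed per metric) can be illustrated with a minimal Python sketch. This is not the study's code: the function names and toy numbers are hypothetical, and the exact metric definitions are assumptions, since the abstract does not state them. In particular, sMAPE is taken here in its percentage form with the mean of absolute values in the denominator, MASE is scaled by the in-sample one-step naive forecast error, and NMAE is normalized by the range of the actual values.

```python
import math


def _mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed variant; zero denominators count as 0)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        total += abs(a - p) / denom if denom else 0.0
    return 100 * total / len(actual)


def mase(actual, predicted, train_actual):
    """MAE scaled by the in-sample one-step naive forecast MAE (assumed scaling)."""
    naive_mae = sum(
        abs(train_actual[i] - train_actual[i - 1])
        for i in range(1, len(train_actual))
    ) / (len(train_actual) - 1)
    return _mae(actual, predicted) / naive_mae


def nmae(actual, predicted):
    """MAE normalized by the range of the actual values (assumed normalizer)."""
    return _mae(actual, predicted) / (max(actual) - min(actual))


def overfitting_ratios(train_actual, train_pred, test_actual, test_pred):
    """Return test/train ratios per metric; values above 1 mean worse test error."""
    return {
        "rmse": rmse(test_actual, test_pred) / rmse(train_actual, train_pred),
        "smape": smape(test_actual, test_pred) / smape(train_actual, train_pred),
        "mase": mase(test_actual, test_pred, train_actual)
        / mase(train_actual, train_pred, train_actual),
        "nmae": nmae(test_actual, test_pred) / nmae(train_actual, train_pred),
    }


# Toy, made-up numbers purely for illustration (not data from the study).
toy_train_actual = [1.0, 2.0, 3.0, 4.0]
toy_train_pred = [1.5, 2.5, 2.5, 3.5]
toy_test_actual = [5.0, 6.0]
toy_test_pred = [4.0, 7.0]

toy_ratios = overfitting_ratios(
    toy_train_actual, toy_train_pred, toy_test_actual, toy_test_pred
)
```

Under these assumptions, a ratio near 1 indicates similar training and test error, while larger values point to a wider generalization gap. The four ratios need not agree, because each metric normalizes the error differently, which is one reason to analyze them jointly rather than one at a time.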

Topics & Keywords

Air Quality Monitoring and Forecasting · Energy Load and Power Forecasting · Hydrological Forecasting Using AI

Publication Details

Published in: Interdisciplinary Journal of Research and Development

Volume 13, Issue 1, pp. 136-136

DOI: 10.56345/ijrdv13n116

Field-Weighted Citation Impact: 0.00
