Accurate paddy yield prediction is essential to food security, agricultural planning, and data-driven decision-making. The increasing availability of agricultural data has encouraged the adoption of machine learning approaches to overcome the limitations of conventional yield estimation methods. This study presents a comparative analysis of five regression-based machine learning algorithms (Linear Regression, K-Nearest Neighbors Regressor, Decision Tree Regressor, Random Forest Regressor, and Support Vector Regression) for paddy yield prediction. The experiments were conducted using the Paddy dataset from the UCI Machine Learning Repository, which comprises 2,789 samples and 45 variables (44 input features and 1 target variable). The dataset was preprocessed through data cleaning, feature standardization, and an 80:20 train–test split. Model performance was evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). Experimental results show that Linear Regression achieved the best overall performance, with an R² of 0.9896 and an RMSE of 942.09, indicating strong predictive accuracy and stability. Despite its simplicity, Linear Regression outperformed the more complex models, suggesting that the relationships between the input variables and paddy yield in this dataset are predominantly linear. These findings highlight the importance of systematic model evaluation and demonstrate that simpler regression models can remain effective and interpretable for practical paddy yield prediction and agricultural decision support systems.
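The four evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function name `regression_metrics` and the toy yield values are hypothetical, chosen only to show how MAE, MSE, RMSE, and R² relate to one another.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for paired true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target

    # R² compares the residual sum of squares with the variance
    # around the mean of the true values.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical toy values, not taken from the Paddy dataset.
y_true = [3000.0, 4000.0, 5000.0, 6000.0]
y_pred = [3100.0, 3900.0, 5200.0, 5800.0]

scores = regression_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because RMSE is expressed in the units of the target variable, a figure such as the reported 942.09 is only meaningful relative to the scale of the yield values, whereas R² is scale-free. Libraries such as scikit-learn provide equivalent metric functions. The hand-written version here only makes the definitions explicit.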
Published in: Bulletin of Intelligent Machines and Algorithms
Volume 1, Issue 3, pp. 93-100