Greenhouse climate control is essential for optimizing crop growth while minimizing resource consumption in controlled-environment agriculture. Traditional rule-based and fixed-action strategies often struggle to balance these objectives. This paper proposes a reinforcement learning (RL) framework for greenhouse climate control that integrates deep learning models to predict both crop growth and resource consumption, enabling an RL agent to optimize greenhouse control setpoints dynamically, maximizing crop yield while ensuring sustainable resource usage. The proposed system incorporates a Multi-Layer Perceptron (MLP) model to predict internal greenhouse climate conditions, a Long Short-Term Memory (LSTM) model for crop parameter estimation, and a separate LSTM model for forecasting daily resource consumption. Together, these models simulate a greenhouse environment in which an RL agent learns to regulate temperature, CO2 concentration, and irrigation levels by interacting with the virtual environment. A custom reward function guides the agent, weighing key crop parameters (stem elongation, stem thickness, and cumulative trusses) against resource consumption metrics, including heating, electricity, CO2, and irrigation costs. To enhance the agent's adaptability, a feature-selection mechanism identifies the most influential climate and control features, reducing observation complexity and accelerating convergence. Retraining under stochastic weather conditions strengthens robustness to dynamic environments, enabling the agent to consistently outperform fixed-action strategies. Evaluation reveals a stable Pareto frontier between yield and resource consumption, confirming that the framework accurately captures the productivity-sustainability trade-off and remains robust across varying reward-weight settings.
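The reward structure described above can be sketched as a weighted sum of crop-growth terms minus weighted resource costs. The weight values, field names, and the absence of normalization here are illustrative assumptions, not the authors' actual formulation:

```python
# Hypothetical sketch of the custom reward: weighted crop growth
# minus weighted resource consumption. All weights are placeholders.
from dataclasses import dataclass


@dataclass
class RewardWeights:
    stem_elongation: float = 1.0
    stem_thickness: float = 1.0
    trusses: float = 1.0
    heating: float = 0.5
    electricity: float = 0.5
    co2: float = 0.5
    irrigation: float = 0.5


def step_reward(growth: dict, costs: dict, w: RewardWeights) -> float:
    """Per-step reward = weighted growth gains - weighted resource costs."""
    gain = (w.stem_elongation * growth["stem_elongation"]
            + w.stem_thickness * growth["stem_thickness"]
            + w.trusses * growth["cum_trusses"])
    cost = (w.heating * costs["heating"]
            + w.electricity * costs["electricity"]
            + w.co2 * costs["co2"]
            + w.irrigation * costs["irrigation"])
    return gain - cost
```

Varying the growth-side weights relative to the cost-side weights is what traces out the Pareto frontier between yield and resource consumption reported in the evaluation.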
A comparative analysis of four RL algorithms, namely Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and Twin Delayed Deep Deterministic Policy Gradient (TD3), showed that TD3 outperforms the others, achieving the highest cumulative rewards and converging to an optimal policy fastest. Experimental evaluations demonstrate that the proposed TD3-based greenhouse control system achieves higher crop yield growth rates while optimizing resource usage, outperforming conventional greenhouse control strategies. This study presents a novel data-driven, adaptive greenhouse management approach that bridges the gap between crop growth modeling and autonomous climate control, contributing to sustainable and intelligent agricultural practices.
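TD3's advantage over DDPG in settings like this typically stems from two stabilizing mechanisms: clipped double-Q learning (taking the minimum of two target critics to curb overestimation) and target-policy smoothing (adding clipped noise to the target action). The following is a minimal illustrative sketch of those two computations, not the paper's implementation; all parameter values are assumptions:

```python
import random


def td3_target(reward: float, gamma: float,
               q1_next: float, q2_next: float) -> float:
    # Clipped double-Q: the Bellman target uses the minimum of the two
    # target critics' estimates, reducing value overestimation bias.
    return reward + gamma * min(q1_next, q2_next)


def smoothed_target_action(mu: float, sigma: float, clip_c: float,
                           low: float, high: float) -> float:
    # Target-policy smoothing: perturb the target actor's action mu
    # with clipped Gaussian noise, then clip to the valid action range
    # (e.g. a normalized temperature or CO2 setpoint).
    eps = max(-clip_c, min(clip_c, random.gauss(0.0, sigma)))
    return max(low, min(high, mu + eps))
```

In the greenhouse setting, the smoothed action would correspond to a perturbed control setpoint, and the clipped target keeps the critics from over-valuing aggressive heating or CO2 dosing during training.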