Optimizing energy-intensive manufacturing under time-varying electricity tariffs requires scheduling strategies that reduce cost without compromising operational feasibility. This study is grounded in readily available industrial sensing: we exclusively use time-series measurements of aggregated active power and energy at the main distribution board of a quicklime production plant. We propose a tariff-aware load-shifting framework in which a Proximal Policy Optimization (PPO) reinforcement learning agent is trained in a custom Gymnasium environment to apply discrete consumption-scaling actions constrained to 80-125% of a baseline profile during the operating shift (08:00-16:00), explicitly accounting for demand-charge exposure in the time-of-use (TOU) peak window (13:00-15:00). The reward design combines instantaneous electricity cost with cumulative energy-tracking penalties and terms associated with operational constraints. Multi-day validation over N=30 working days shows consistent economic benefits, with a median total cost reduction on the order of 10% (narrow interquartile range), driven by reduced peak-window energy and lower demand peaks. However, the script-based binary compliance indicators (viol_energy, viol_prod_min) reveal deviations from the energy-balance criterion and occasional minimum-production shortfalls under the tolerances used, highlighting the cost-production trade-off and the need for stricter constraint handling before industrial deployment. In addition, we benchmark PPO against dynamic programming (DP), an alternative RL policy (DQN), and a greedy heuristic (GREEDY), comparing cost, operational performance, and, where applicable, computational efficiency; this comparison positions PPO as a competitive alternative among the considered methods. Overall, this work demonstrates how learning-based decision making can be coupled with real-world industrial sensing infrastructure, providing a data-driven, tariff-aware scheduling layer for industrial energy management under practical constraints.
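To make the reward structure concrete, the following is a minimal, self-contained sketch of the per-step reward described above (instantaneous tariff cost plus a cumulative energy-tracking penalty). All names, tariff prices, and weights here are illustrative assumptions, not the paper's actual environment; only the action range (80-125% scaling), the shift (08:00-16:00), and the TOU peak window (13:00-15:00) come from the abstract.

```python
# Hypothetical parameters inferred from the abstract; numbers are illustrative.
TOU_PEAK = (13, 15)                          # demand-charge window, 13:00-15:00
SHIFT = (8, 16)                              # operating shift, 08:00-16:00
ACTIONS = [0.80, 0.90, 1.00, 1.10, 1.25]     # discrete scaling of baseline load

def tariff(hour: int) -> float:
    """Toy time-of-use price (currency/kWh): higher inside the peak window."""
    return 0.30 if TOU_PEAK[0] <= hour < TOU_PEAK[1] else 0.12

def step_reward(hour: int, baseline_kw: float, action_idx: int,
                energy_gap_kwh: float, w_cost: float = 1.0,
                w_track: float = 0.05) -> tuple[float, float]:
    """One environment step on an hourly grid.

    Reward = -(instantaneous electricity cost)
             - (penalty on cumulative deviation from the baseline energy).
    Returns (reward, updated cumulative energy gap in kWh).
    """
    load_kw = baseline_kw * ACTIONS[action_idx]
    cost = tariff(hour) * load_kw                      # 1 h step: kW -> kWh
    new_gap = energy_gap_kwh + (load_kw - baseline_kw)  # energy-balance tracker
    reward = -w_cost * cost - w_track * abs(new_gap)
    return reward, new_gap
```

The tracking term is what pushes the agent to shift, rather than simply shed, energy: scaling down inside the peak window is only rewarded if the deficit is later compensated off-peak, mirroring the energy-balance criterion checked by the compliance indicators.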