Production-Scalable Control Optimisation for Optical Switching With Deep Reinforcement Learning

2023 · 3 citations · Journal Article

Authors

Zacharaya Shabka · University College London

Michael Enrico · Polatis (United Kingdom)

Paulo Ricardo Lisboa de Almeida · Polatis (United Kingdom)

Georgios Zervas · University College London

Abstract

Proportional-integral-derivative (PID) control underlies $>95\%$ of automation across many industries, including high-radix optical circuit switches based on PID-controlled piezoelectric-actuator-based beam steering. To meet performance requirements (switching speed and actuator stability for optical switches), PID control requires three parameters to be optimally tuned (known as PID tuning). Typical PID tuning methods involve slow, exhaustive and often hands-on search processes, which waste engineering resources and slow down production. Moreover, manufacturing tolerances in production mean that actuators are non-identical and so are controlled differently by the same PID parameters. This work presents a novel PID parameter optimisation method (patent pending) based on deep reinforcement learning which avoids tuning procedures altogether whilst improving switching performance. On a market-leading optical switching product based on electromechanical control processes, compared against the manufacturer's production parameter set, average switching speed is improved by 22% whilst $5\times$ more switching events (17.5% to 87.5%) stabilise in $\leq 20\,\text{ms}$ (the ideal worst-case performance), without any practical deterioration in other performance metrics such as overshoot. The method also generates actuator-tailored PID parameters in $O(\text{milliseconds})$ without any interaction with the device, using only generic information about the actuator (known from manufacturing and characterisation processes). This renders the method highly applicable to mass-manufacturing scenarios generally. Training is achieved with just a small number of actuators and can generally complete in $O(\text{hours})$, so it can be easily repeated if needed (e.g. if new hardware is built using entirely different types of actuators).
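To make the quantities named in the abstract concrete, the following is a minimal sketch of a discrete PID loop together with the two metrics mentioned (settling time and overshoot). It is not the authors' method or their actuator model: the second-order plant, its `natural_freq` and `damping` values, the gain values and the 2% settling band are all hypothetical choices made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDGains:
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain


def simulate_step(gains, natural_freq=400.0, damping=0.2, dt=1e-5, duration=0.1):
    """Unit-step response of a toy second-order plant under PID control.

    The plant x'' + 2*zeta*wn*x' + wn^2*x = wn^2*u is a hypothetical
    stand-in for an actuator, not a model of any real device.
    """
    target = 1.0
    x, v = 0.0, 0.0            # position and velocity of the toy plant
    integral = 0.0
    prev_error = target - x    # initialised so the first derivative term is zero
    trace = []
    for _ in range(round(duration / dt)):
        error = target - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        prev_error = error
        u = gains.kp * error + gains.ki * integral + gains.kd * derivative
        # Semi-implicit Euler step of the plant dynamics.
        accel = natural_freq**2 * (u - x) - 2.0 * damping * natural_freq * v
        v += accel * dt
        x += v * dt
        trace.append(x)
    return trace


def settling_time(trace, dt, target=1.0, tolerance=0.02):
    """Time after which the response stays within +/- tolerance of the target.

    Returns None if the response has not settled by the end of the trace.
    """
    last_outside = -1
    for i, x in enumerate(trace):
        if abs(x - target) > tolerance * abs(target):
            last_outside = i
    if last_outside == len(trace) - 1:
        return None
    return (last_outside + 1) * dt


def overshoot(trace, target=1.0):
    """Peak excursion above the target, as a fraction of the target."""
    return max(0.0, max(trace) - target) / abs(target)


if __name__ == "__main__":
    dt = 1e-5
    gains = PIDGains(kp=2.0, ki=400.0, kd=0.005)  # illustrative values only
    trace = simulate_step(gains, dt=dt)
    ts = settling_time(trace, dt)
    print("settling time (s):", ts)
    print("settles within 20 ms:", ts is not None and ts <= 0.020)
    print("overshoot (fraction):", overshoot(trace))
```

In this picture, PID tuning is the search for a `PIDGains` triple that minimises settling time without excessive overshoot. Because real actuators differ from one another, the same triple yields different traces on different units, which is the motivation the abstract gives for generating actuator-tailored parameters.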

Topics & Keywords

Iterative Learning Control Systems · Semiconductor Lasers and Optical Devices · Scheduling and Optimization Algorithms

Publication Details

Published in: Journal of Lightwave Technology

Volume 42, Issue 6, pp. 2018-2025

DOI: 10.1109/jlt.2023.3328330

Field-Weighted Citation Impact: 0.55
