Microservices suffer from the execution of auxiliary operations known as datacenter tax, such as RPC and TCP processing, and data (de)serialization, encryption/decryption, and (de)compression. To minimize this tax, multiple hardware accelerators have been proposed. However, it is unclear how these accelerators should be orchestrated. Past work has focused only on orchestrating accelerators in coarse-grained environments with monolithic applications. In this paper, we characterize the needs of orchestrating an ensemble of on-package accelerators in microservice environments. We observe that orchestration frameworks need to be highly dynamic and nimble. The basic operations to be accelerated are fine-grained, potentially taking only tens of μs. Moreover, the sequence of accelerators to use is often affected by “branch conditions” whose real-time resolution determines the set of subsequent accelerators needed. To address these challenges, we present AccelFlow, the first orchestration framework for on-package accelerators of microservices. In AccelFlow, CPU cores build software structures called traces that contain sequences of accelerators to call. A core enqueues a trace in an accelerator in user mode and, from then on, the accelerators in the trace execute in sequence without CPU involvement. A trace can include branch conditions whose outcomes determine the trace control flow. Compared to state-of-the-art accelerator orchestrators, AccelFlow on average reduces P99 tail latency by 70%, reduces average latency by 38%, and increases throughput by 120%.
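The trace idea can be illustrated with a small software model. The sketch below is not AccelFlow's interface: the names `Call`, `Branch`, `run_trace`, and `is_compressed` are hypothetical, the "accelerators" are ordinary Python functions standing in for hardware units, and the interpreter loop stands in for the hardware chaining that, in AccelFlow, proceeds without CPU involvement. It only shows how a trace can hold a sequence of accelerator calls plus a branch condition whose outcome, resolved at run time, selects the accelerators that follow.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Callable, List, Union


# Software stand-ins for hardware accelerators (assumption: the real units
# are on-package hardware, not Python functions).
def decompress(data: bytes) -> bytes:
    return zlib.decompress(data)


def deserialize(data: bytes) -> dict:
    return json.loads(data.decode("utf-8"))


@dataclass
class Call:
    """One trace entry: invoke a named accelerator on the current payload."""
    name: str
    accel: Callable


@dataclass
class Branch:
    """One trace entry: a condition that picks the next sub-sequence."""
    cond: Callable[[object], bool]
    taken: List["Step"]
    not_taken: List["Step"]


Step = Union[Call, Branch]


def is_compressed(data: bytes) -> bool:
    # Simplified check for illustration: zlib streams produced with default
    # settings start with the byte 0x78.
    return data[:1] == b"\x78"


def run_trace(trace: List[Step], payload):
    """Walk the trace in order, resolving branches as they are reached."""
    visited = []
    pending = list(trace)
    while pending:
        step = pending.pop(0)
        if isinstance(step, Branch):
            chosen = step.taken if step.cond(payload) else step.not_taken
            pending = list(chosen) + pending  # splice in the chosen path
        else:
            payload = step.accel(payload)
            visited.append(step.name)
    return payload, visited


# A trace built once by the "core": decompress only if needed, then deserialize.
trace = [
    Branch(cond=is_compressed,
           taken=[Call("decompress", decompress)],
           not_taken=[]),
    Call("deserialize", deserialize),
]

raw = json.dumps({"user": 7}).encode("utf-8")
result_plain, path_plain = run_trace(trace, raw)
result_zip, path_zip = run_trace(trace, zlib.compress(raw))
```

The same trace yields two different accelerator sequences depending on the payload, which is the kind of dynamic control flow that a static, precomputed accelerator chain cannot express.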