ActorQ: Quantization for Actor-Learner Distributed Reinforcement Learning


M. Lam, et al., “ActorQ: Quantization for Actor-Learner Distributed Reinforcement Learning,” in Hardware Aware Efficient Training Workshop at ICLR 2021, Virtual, May 7, 2021.


In this paper, we introduce ActorQ, a novel reinforcement learning (RL) training paradigm for speeding up actor-learner distributed RL training. ActorQ combines full-precision optimization on the learner with distributed data collection through lower-precision, quantized actors. Quantized 8-bit (or 16-bit) inference on the actors speeds up data collection without affecting convergence. The resulting quantized distributed RL training system demonstrates end-to-end speedups of >1.5× to 2.5× and faster convergence than full-precision training across a range of tasks (DeepMind Control Suite) and RL algorithms (D4PG, DQN). Finally, we break down the runtime costs of distributed RL training (such as communication time, inference time, and model load time) and evaluate the effects of quantization on these system attributes.
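To make the learner/actor split concrete, the following is a minimal sketch of how full-precision learner weights might be compressed to 8 bits before being sent to an actor. The abstract does not specify the quantization scheme, so the uniform affine (min-max) quantization used here is an assumption, and the names `quantize_uniform`, `dequantize`, and `actor_policy_output` are hypothetical and not taken from the paper. For simplicity the toy actor dequantizes the weights before use, whereas a real quantized actor would typically run inference directly in low-precision arithmetic.

```python
import random


def quantize_uniform(weights, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] (assumed min-max scheme)."""
    lo, hi = min(weights), max(weights)
    levels = (1 << num_bits) - 1
    # Guard against a zero range when all weights are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [v * scale + lo for v in q]


def actor_policy_output(q_params, observation):
    """Hypothetical linear actor that acts using the quantized weights."""
    q, scale, lo = q_params
    weights = dequantize(q, scale, lo)
    return sum(w * x for w, x in zip(weights, observation))


# The learner keeps full-precision weights (random stand-ins here).
random.seed(0)
learner_weights = [random.uniform(-1.0, 1.0) for _ in range(16)]

# Only the quantized copy is sent to the actor for data collection.
q_params = quantize_uniform(learner_weights, num_bits=8)
recovered = dequantize(*q_params)

# Rounding to the nearest level bounds the per-weight error by scale / 2.
max_error = max(abs(a - b) for a, b in zip(learner_weights, recovered))

observation = [1.0] * 16
action = actor_policy_output(q_params, observation)
```

In this sketch the learner would keep updating `learner_weights` in full precision and periodically re-run `quantize_uniform` to refresh the actors, so the quantization error affects only the collected data, not the optimization step itself.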

Last updated on 05/27/2021