SOFT ACTOR-CRITIC REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATOR WITH HINDSIGHT EXPERIENCE REPLAY

Tao Yan, Wenan Zhang, Simon X. Yang, and Li Yu

Keywords

Reinforcement learning, maximum entropy, robotic manipulation, hindsight experience replay

Abstract

The key challenges in applying reinforcement learning (RL) to complex robotic control tasks are fragile convergence, very high sample complexity, and the need to shape a reward function. In this work, we present a soft actor-critic (SAC) style algorithm, an off-policy actor-critic RL method based on the maximum entropy RL framework, in which the actor aims to maximize the expected reward while also maximizing the entropy of the policy. This effectively improves both the stability of the algorithm's performance and its robustness to modeling and estimation errors. Moreover, we combine SAC with a transition replay scheme called hindsight experience replay (HER) so that policies can be learned more efficiently from sparse rewards. Finally, the effectiveness of the proposed method is verified on a range of manipulation tasks in a simulated environment.
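To make the entropy-regularized objective concrete, a standard form of the maximum entropy RL objective (following the usual SAC formulation, not reproduced from this paper's text) is

$$J(\pi) = \sum_t \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[\, r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\right],$$

where $\alpha$ is a temperature parameter that trades off reward maximization against the policy entropy $\mathcal{H}$.

The HER component can be illustrated by a minimal sketch of goal relabeling under the common "future" strategy. The field names, the reward_fn signature, and the parameter k are illustrative assumptions, not details taken from the paper:

```python
import random

def her_relabel(episode, reward_fn, k=4):
    """Hindsight relabeling ("future" strategy, illustrative sketch).

    For each transition, store k extra copies whose goal is replaced by a
    goal actually achieved later in the same episode, so that episodes
    which never reach the original goal still produce useful reward signal.

    `episode` is a list of dicts with keys 'obs', 'action', 'next_obs',
    'achieved_goal' (the goal achieved after the transition), 'goal', and
    'reward'; `reward_fn(achieved_goal, goal)` recomputes the sparse reward.
    """
    relabeled = []
    for t, tr in enumerate(episode):
        relabeled.append(dict(tr))  # keep the original transition
        for _ in range(k):
            # Sample a state reached at or after step t in this episode
            # and treat the goal it achieved as the substitute goal.
            future = random.randint(t, len(episode) - 1)
            new_goal = episode[future]['achieved_goal']
            new_tr = dict(tr)
            new_tr['goal'] = new_goal
            new_tr['reward'] = reward_fn(tr['achieved_goal'], new_goal)
            relabeled.append(new_tr)
    return relabeled
```

The relabeled transitions would then be pushed into the off-policy replay buffer alongside the originals, which is what makes HER a natural fit for an off-policy learner such as SAC.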
