Tao Yan, Wenan Zhang, Simon X. Yang, and Li Yu
Reinforcement learning, maximum entropy, robotic manipulation, hindsight experience replay
The key challenges in applying reinforcement learning (RL) to complex robotic control tasks are fragile convergence, very high sample complexity, and the need to shape a reward function. In this work, we present a soft actor-critic (SAC) style algorithm, an off-policy actor-critic RL method based on the maximum entropy RL framework, in which the actor aims to maximize the expected reward while also maximizing the entropy of the policy. This effectively improves both the stability of the algorithm's performance and its robustness to modeling and estimation errors. Moreover, we combine SAC with hindsight experience replay (HER), a transition replay scheme that makes policy learning from sparse rewards more efficient. Finally, the effectiveness of the proposed method is verified on a range of manipulation tasks in a simulated environment.
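For reference, the maximum entropy objective described above is commonly written as follows (this is the standard formulation from Haarnoja et al.'s SAC work; the temperature coefficient $\alpha$ and the notation are conventions of that literature, not symbols given in this abstract):

$$
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \Big[ r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\,\cdot \mid s_t)\big) \Big]
$$

Here $r(s_t, a_t)$ is the reward, $\mathcal{H}$ denotes policy entropy, and $\alpha$ trades off reward maximization against exploration; larger $\alpha$ encourages more stochastic policies.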
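The HER component can be illustrated with a minimal sketch of the common "future" relabeling strategy (Andrychowicz et al.): each transition is stored again with its goal replaced by a goal actually achieved later in the same episode, so that sparse-reward trajectories still yield successful examples. The tuple layout, the `compute_reward` interface, and the use of scalar goals below are illustrative assumptions, not the paper's actual implementation.

```python
import random

def her_relabel(episode, compute_reward, k=4):
    """Relabel an episode with the 'future' HER strategy.

    episode: list of (obs, action, next_obs, goal, achieved_goal) tuples.
    Returns transitions of the form (obs, action, next_obs, goal, reward),
    including k extra copies per step whose goal is sampled from the
    achieved goals of later steps in the same episode.
    """
    relabeled = []
    for t, (obs, action, next_obs, goal, achieved) in enumerate(episode):
        # Original transition, with the sparse reward recomputed.
        relabeled.append((obs, action, next_obs, goal,
                          compute_reward(achieved, goal)))
        for _ in range(k):
            # Pick an achieved goal from the current or a future step
            # and pretend it was the intended goal all along.
            future = random.randint(t, len(episode) - 1)
            new_goal = episode[future][4]
            relabeled.append((obs, action, next_obs, new_goal,
                              compute_reward(achieved, new_goal)))
    return relabeled

def sparse_reward(achieved, goal, tol=0.05):
    """Sparse reward typical of manipulation benchmarks:
    0 on success, -1 otherwise (scalar goals for simplicity;
    real tasks use vector-valued goals and a distance metric)."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0
```

In an actual SAC+HER pipeline, the relabeled transitions would simply be pushed into the off-policy replay buffer alongside the originals, which is what makes HER compatible with any off-policy learner such as SAC.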