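Reinforcement learning
Computer science
Residual
Machine learning
Artificial intelligence
Gaussian process
Reuse
Function (biology)
Gaussian distribution
Algorithm
Engineering
Quantum mechanics
Evolutionary biology
Biology
Physics
Waste management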
Authors
Yanqin Ma, De Xu, Fangbo Qin
Identifier
DOI:10.1109/tii.2020.3020065
Abstract
Multiple peg-in-hole insertion control is a challenging task in precision assembly because of its complex contact dynamics. In this article, an insertion policy learning method is proposed for multiple peg-in-hole precision assembly. The insertion policy learning process is separated into two phases: initial policy learning and residual policy learning. In initial policy learning, a state-to-action policy mapping model based on the Gaussian mixture model (GMM) is established, and Gaussian mixture regression (GMR) is used to generalize the policy for reuse. In residual policy learning, a reinforcement learning method, the normalized advantage function (NAF), is employed to refine the insertion policy through the agent's exploration in the insertion environment. Moreover, an adaptive action exploration (AAE) strategy is designed to improve exploration performance, and a prioritized experience replay strategy is introduced to make residual policy learning from historical experience more efficient. In addition, a hierarchical reward function is designed that considers the contact dynamics as well as the efficiency and safety of precision insertion. Finally, comprehensive experiments are conducted to validate the effectiveness of the proposed insertion policy learning method.
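The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a joint GMM over concatenated (state, action) vectors whose parameters have already been fitted, computes the GMR conditional mean as the initial action, and assumes the final command is the clipped sum of that initial action and a learned residual. The names `gmr_action`, `combined_action`, `residual_policy`, and `limit` are hypothetical, and the residual policy (NAF in the paper) is represented here only as a callable.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def gmr_action(state, weights, means, covs, state_dim):
    """Gaussian mixture regression: E[action | state] under a joint GMM.

    Each mean is a vector [state part, action part]; each covariance is the
    matching block matrix. The first `state_dim` entries are the state.
    """
    s = state_dim
    # Responsibility of each component for the observed state.
    resp = np.array([
        w * gaussian_pdf(state, m[:s], c[:s, :s])
        for w, m, c in zip(weights, means, covs)
    ])
    resp = resp / resp.sum()
    action = np.zeros(means[0].shape[0] - s)
    for h, m, c in zip(resp, means, covs):
        # Conditional mean of the action part given the state part.
        cond = m[s:] + c[s:, :s] @ np.linalg.solve(c[:s, :s], state - m[:s])
        action += h * cond
    return action

def combined_action(state, residual_policy, gmm_params, state_dim, limit):
    """Assumed composition: initial GMR action plus residual, then clipped."""
    weights, means, covs = gmm_params
    initial = gmr_action(state, weights, means, covs, state_dim)
    return np.clip(initial + residual_policy(state), -limit, limit)
```

With a single component of zero mean and covariance [[1, 0.5], [0.5, 1]], a state of 2.0 gives an initial action of 0.5 × 2.0; a residual policy that returns zeros leaves that action unchanged. The clipping bound stands in for a safety limit on the commanded motion and is an assumption of this sketch, not a detail taken from the abstract.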