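Reinforcement learning
Computer science
CMA-ES
Adaptability
Adaptation (eye)
Convergence (economics)
Robot
Process (computing)
Motion planning
Path (computing)
Artificial intelligence
Benchmark (surveying)
Evolution strategy
Covariance matrix
Dynamic balancing
Evolutionary algorithm
Machine learning
Algorithm
Engineering
Economics
Physics
Optics
Operating system
Biology
Programming language
Mechanical engineering
Geography
Economic growth
Ecology
Geodesy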
Authors
Shuhuan Wen,Zeteng Wen,Di Zhang,Hong Zhang,Tao Wang
Identifier
DOI:10.1016/j.asoc.2021.107605
Abstract
The adaptability of multi-robot systems in complex environments is an active research topic. To handle static and dynamic obstacles in complex environments, this paper presents dynamic proximal meta policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PMPO-CMA) to avoid obstacles and realize autonomous navigation. First, we propose dynamic proximal policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PPO-CMA), based on the original proximal policy optimization (PPO), to obtain a valid obstacle-avoidance policy. The simulation results show that the proposed dynamic-PPO-CMA can avoid obstacles and reach the designated target position successfully. Second, to improve the adaptability of multi-robot systems in different environments, we integrate meta-learning with dynamic-PPO-CMA to form the dynamic-PMPO-CMA algorithm. In the training process, we use the proposed dynamic-PMPO-CMA to train robots to learn a multi-task policy. Finally, in the testing process, transfer learning is introduced into the proposed dynamic-PMPO-CMA algorithm: the trained parameters of the meta policy are transferred to new environments and used as the initial parameters. The simulation results show that the proposed algorithm converges faster and reaches the destination more quickly than PPO, PMPO and dynamic-PPO-CMA.
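The abstract does not give the update rules of dynamic-PPO-CMA or dynamic-PMPO-CMA. As a rough illustration of the PPO baseline they build on, the sketch below computes the standard clipped surrogate objective of PPO in plain Python. The function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` are assumptions made for this example; the CMA-based, meta-learning, and transfer-learning components described by the authors are not shown.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the current and old policies agree, every ratio is 1 and the objective reduces to the mean advantage; clipping only takes effect once the policy has moved away from the one that collected the data.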