Dual-Arm Robot Trajectory Planning Based on Deep Reinforcement Learning under Complex Environment

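Reinforcement learning; Obstacle avoidance; Robot; Trajectory; Artificial intelligence; Computer science; Collision avoidance; Function (biology); Obstacle; Artificial neural network; Reinforcement learning; Task (project management); Robot control; Simulation; Collision; Mobile robot; Engineering; Physics; Computer security; Political science; Law; Systems engineering; Biology; Evolutionary biology; Astronomy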
Authors
Wanxing Tang,Chuang Cheng,Haiping Ai,Li Chen
Source
Journal: Micromachines [Multidisciplinary Digital Publishing Institute]
Volume (Issue): 13 (4): 564-564 Citations: 16
Identifier
DOI:10.3390/mi13040564
Abstract

This article studies trajectory planning for the two manipulators of a dual-arm robot that must approach a patient in a complex environment, using deep reinforcement learning algorithms. The shapes of the human body and the bed are complex, which may lead to collisions between the human and the robot. Because the sparse reward the robot obtains from the environment may not be enough for it to accomplish the task, a neural network is trained with the proximal policy optimization (PPO) algorithm and a continuous reward function to control the manipulators as they prepare to hold the patient up. First, a 3D simulation environment modeled on the realistic scene is built to conduct the research. Second, inspired by the idea of the artificial potential field, a new reward and punishment function is proposed to help the robot obtain enough rewards to explore the environment. The function consists of four parts: a reward guidance function, collision detection, an obstacle avoidance function, and a time function. The reward guidance function guides the robot toward the targets so that it can hold the patient; collision detection and the obstacle avoidance function complement each other and are used to avoid obstacles; and the time function is used to reduce the number of training episodes. Finally, after the robot is trained to reach the targets, the training results are analyzed. Compared with the deep deterministic policy gradient (DDPG) algorithm, the PPO algorithm needs about 4 million fewer training steps to converge. Moreover, compared with the other reward and punishment functions, the function used in this paper obtains many more rewards in the same training time. It also takes much less time to converge and yields shorter episodes, which verifies the advantage of the algorithm used in this paper.
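
The abstract names the four parts of the reward and punishment function but does not give their formulas. The following is a minimal sketch of how such a four-term reward could be composed for one manipulator, assuming a linear attractive term, an inverse-distance repulsive term in the style of an artificial potential field, a fixed collision penalty, and a constant per-step time penalty. The function name `shaped_reward`, its parameters, and all weights are hypothetical and are not taken from the paper.

```python
def shaped_reward(dist_to_target, dist_to_obstacle, collided,
                  safe_dist=0.2, w_goal=1.0, w_obs=0.5,
                  collision_penalty=10.0, time_penalty=0.01):
    """Illustrative four-term reward for one manipulator and one time step.

    All weights and functional forms are assumptions made for this sketch.
    Distances are non-negative floats in consistent units.
    """
    # Reward guidance term (attractive potential): the penalty shrinks
    # as the end effector gets closer to its target.
    r_goal = -w_goal * dist_to_target

    # Obstacle avoidance term (repulsive potential): only active when the
    # arm is closer to an obstacle than the assumed safety distance.
    if dist_to_obstacle < safe_dist:
        clamped = max(dist_to_obstacle, 1e-6)  # avoid division by zero
        r_obs = -w_obs * (1.0 / clamped - 1.0 / safe_dist)
    else:
        r_obs = 0.0

    # Collision detection term: a fixed penalty when contact is reported.
    r_col = -collision_penalty if collided else 0.0

    # Time term: a small constant cost per step, so shorter episodes
    # accumulate less penalty.
    r_time = -time_penalty

    return r_goal + r_obs + r_col + r_time
```

Under these assumptions the reward is dense: every step returns a signal that grows as the arm nears its target and drops near obstacles, which is the property the abstract contrasts with a sparse reward. For a dual-arm setup, one option would be to sum the per-arm values at each step.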