Curiosity
Reinforcement learning
Computer science
Novelty
Artificial intelligence
Theory (learning stability)
Machine learning
Psychology
Theology
Social psychology
Philosophy
Authors
Jing Li, Xinxin Shi, Jiehao Li, Xin Zhang, Junzheng Wang
Identifier
DOI:10.1016/j.neucom.2020.08.024
Abstract
Reinforcement learning (RL) depends on carefully engineered environment rewards. However, for many RL tasks the rewards provided by the environment are extremely sparse, making it challenging for the agent to learn skills and interact with the environment. One solution to this problem is to create intrinsic rewards for agents, making the reward signal dense and more suitable for learning. Recent algorithms, such as curiosity-driven exploration, usually estimate the novelty of the next state through the prediction error of a dynamics model. However, these methods are typically limited by the capacity of their dynamics models. In this paper, a random curiosity-driven model using deep reinforcement learning is proposed, which uses a target network with fixed weights to maintain the stability of the dynamics model and to create more suitable intrinsic rewards. We integrate a parametric exploration method to further promote sufficient exploration. In addition, a deeper and more densely connected network is used to encode the pixel images for the policy gradient. Comparing our method against previous approaches in several environments, the experiments show that it achieves state-of-the-art performance on most, but not all, of the Atari games.
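The fixed-weight target network described in the abstract can be illustrated with a minimal NumPy sketch in the spirit of random network distillation: a frozen random network defines a stable embedding of states, a predictor network is trained to imitate it on visited states, and the prediction error serves as the intrinsic reward. All network sizes, learning rates, and function names below are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(in_dim, hidden, out_dim):
    """Small two-layer MLP with random Gaussian weights."""
    return [rng.normal(0.0, 0.1, (in_dim, hidden)),
            rng.normal(0.0, 0.1, (hidden, out_dim))]

def forward(params, x):
    w1, w2 = params
    h = np.maximum(x @ w1, 0.0)          # ReLU hidden layer
    return h @ w2

# Target network: its weights are never updated, so it defines a
# stable random embedding of states (the "fixed weights" in the paper).
target = init_mlp(8, 32, 16)
# Predictor network: trained to imitate the target on visited states.
predictor = init_mlp(8, 32, 16)

def intrinsic_reward(state):
    """Prediction error of the predictor against the fixed target."""
    err = forward(predictor, state) - forward(target, state)
    return float(np.mean(err ** 2))

def train_predictor(state, lr=0.05):
    """One gradient step pulling the predictor toward the target."""
    w1, w2 = predictor
    t = forward(target, state)
    h = np.maximum(state @ w1, 0.0)
    e = h @ w2 - t
    gy = 2.0 * e / e.size                # d(mean err^2)/d(output)
    gw2 = np.outer(h, gy)
    gw1 = np.outer(state, (gy @ w2.T) * (h > 0))
    w2 -= lr * gw2                       # in-place weight updates
    w1 -= lr * gw1

# A repeatedly visited state loses novelty: its intrinsic reward decays,
# while unvisited states keep a high prediction error.
s = rng.normal(size=8)
r_first = intrinsic_reward(s)
for _ in range(200):
    train_predictor(s)
r_later = intrinsic_reward(s)
```

Because the target's weights are frozen, the regression target for the predictor never moves, which sidesteps the instability of learning a dynamics model of a changing environment while still rewarding novel states.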