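Computer science
Scratch
Reinforcement learning
Artificial intelligence
Robustness (evolution)
Normalization (sociology)
Machine learning
Control (management)
Programming language
Anthropology
Biochemistry
Gene
Sociology
Chemistry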
Authors
Danijar Hafner,Jurgis Pašukonis,Jimmy Ba,Timothy Lillicrap
Source
Journal: Nature
[Springer Nature]
Date: 2025-04-02
Volume (issue): 640 (8059): 647-653
Citations: 30
Identifier
DOI: 10.1038/s41586-025-08744-2
Abstract
Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement-learning algorithms can be readily applied to tasks similar to what they have been developed for, configuring them for new application domains requires substantial human expertise and experimentation [1,2]. Here we present the third generation of Dreamer, a general algorithm that outperforms specialized methods across over 150 diverse tasks, with a single configuration. Dreamer learns a model of the environment and improves its behaviour by imagining future scenarios. Robustness techniques based on normalization, balancing and transformations enable stable learning across domains. Applied out of the box, Dreamer is, to our knowledge, the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula. This achievement has been posed as a substantial challenge in artificial intelligence that requires exploring farsighted strategies from pixels and sparse rewards in an open world [3]. Our work allows solving challenging control problems without extensive experimentation, making reinforcement learning broadly applicable.