Reinforcement learning
Computer science
Dynamic programming
Reinforcement
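Error-driven learning
Temporal difference learning
Adaptive control
Adaptive system
Adaptive behavior
Optimal control
Stimulus (psychology)
Artificial intelligence
Control engineering
Control (management)
Engineering
Mathematical optimization
Mathematics
Structural engineering
Psychiatry
Psychology
Psychotherapist
Algorithm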
Authors
Frank L. Lewis, Draguna Vrabie
Source
Journal: IEEE Circuits and Systems Magazine
[Institute of Electrical and Electronics Engineers]
Date: 2009-01-01
Volume (issue): pages: 9 (3): 32-50
Citations: 1395
Identifier
DOI:10.1109/mcas.2009.933854
Abstract
Living organisms learn by acting on their environment, observing the resulting reward stimulus, and adjusting their actions accordingly to improve the reward. This action-based or reinforcement learning can capture notions of optimal behavior occurring in natural systems. We describe mathematical formulations for reinforcement learning and a practical implementation method known as adaptive dynamic programming. These give us insight into the design of controllers for man-made engineered systems that both learn and exhibit optimal behavior.