Reinforcement learning
Computer science
Artificial intelligence
Value (mathematics)
Field (mathematics)
Reinforcement
Machine learning
Mathematics
Psychology
Social psychology
Pure mathematics
Identifier
DOI:10.1007/978-981-19-0638-1_3
Abstract
The previous chapter introduced the field of classic reinforcement learning. We learned about agents and environments, and about states, actions, values, and policy functions. We also saw our first planning and learning algorithms: value iteration, SARSA, and Q-learning. The methods in the previous chapter were exact, tabular methods, which work for problems of moderate size that fit in memory.