Reinforcement learning
Benchmark (surveying)
Computer science
Artificial intelligence
Machine learning
Differential evolution
Geodesy
Geography
Authors
Iztok Fister, Dušan Fister
Source
Journal: Studies in Computational Intelligence
Date: 2022-01-01
Pages: 43-75
Identifiers
DOI: 10.1007/978-981-16-8082-3_3
Abstract
Reinforcement learning is a computational approach that mimics learning from interaction and supplements the existing supervised and unsupervised learning methods within the machine learning field. It is based on mapping a given situation to an action, where each action is evaluated by a reward. Crucially, this mapping is performed using suitable policies that correspond to a set of so-called psychological stimulus-response rules (associations). However, in reinforcement learning we are not interested in immediate rewards, but in a value function that specifies how good the rewards are in the long run. In this study, a reinforcement learning differential evolution is proposed. On the one hand, a Q-learning algorithm, capable of ensuring good behavior of the evolutionary search process through explicit strategy exploration, is engaged to identify the more prominent mutation strategies within an ensemble of strategies. On the other hand, the reinforcement learning mechanism selects among strategies incorporated from the original L-SHADE algorithm, which uses the 'DE/current-to-pbest/1/bin' mutation strategy, and its successors iL-SHADE and jSO, which use the 'DE/current-to-pbest-w/1/bin' mutation strategy. The proposed RL-SHADE algorithm was tested on the well-established function benchmark suites from the popular CEC special sessions/competitions on real-parameter single-objective optimization of the last decade, during which three different benchmark suites were issued. We expected the proposed RL-SHADE algorithm to outperform the three original algorithms in solving all the observed benchmarks.
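The core mechanism described in the abstract, a Q-learning agent selecting among an ensemble of DE mutation strategies, can be illustrated with a minimal sketch. The code below is an assumption-laden simplification, not the authors' RL-SHADE implementation: it uses a stateless (bandit-style) Q-learning update, a two-strategy ensemble ('DE/rand/1' and 'DE/current-to-pbest/1'), binomial crossover, and a toy sphere objective, and it omits the success-history parameter adaptation, external archive, and linear population-size reduction of the actual L-SHADE/iL-SHADE/jSO algorithms. All names and parameter values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective; the paper uses CEC benchmark suites instead."""
    return float(np.sum(x * x))

def rand_1(pop, fit, i, F):
    # DE/rand/1: v = x_r1 + F * (x_r2 - x_r3)
    r1, r2, r3 = rng.choice([j for j in range(len(pop)) if j != i], 3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def current_to_pbest_1(pop, fit, i, F, p=0.2):
    # DE/current-to-pbest/1: v = x_i + F*(x_pbest - x_i) + F*(x_r1 - x_r2)
    k = max(1, int(p * len(pop)))
    pbest = pop[rng.choice(np.argsort(fit)[:k])]
    r1, r2 = rng.choice([j for j in range(len(pop)) if j != i], 2, replace=False)
    return pop[i] + F * (pbest - pop[i]) + F * (pop[r1] - pop[r2])

STRATEGIES = [rand_1, current_to_pbest_1]  # the strategy ensemble (illustrative)
Q = np.zeros(len(STRATEGIES))              # one Q-value per strategy (stateless)
alpha, epsilon = 0.1, 0.2                  # learning rate, exploration rate

def select_strategy():
    # epsilon-greedy: sometimes explore a random strategy, else exploit the best
    if rng.random() < epsilon:
        return int(rng.integers(len(STRATEGIES)))
    return int(np.argmax(Q))

def crossover_bin(x, v, CR):
    # binomial crossover ('bin'): take mutant genes with probability CR,
    # forcing at least one gene from the mutant vector
    mask = rng.random(x.size) < CR
    mask[rng.integers(x.size)] = True
    return np.where(mask, v, x)

dim, NP, F, CR = 10, 30, 0.5, 0.9
pop = rng.uniform(-5.0, 5.0, (NP, dim))
fit = np.array([sphere(x) for x in pop])

for gen in range(300):
    for i in range(NP):
        a = select_strategy()
        trial = crossover_bin(pop[i], STRATEGIES[a](pop, fit, i, F), CR)
        f_trial = sphere(trial)
        reward = 1.0 if f_trial < fit[i] else 0.0  # reward: did the action improve?
        Q[a] += alpha * (reward - Q[a])            # stateless Q-learning update
        if f_trial < fit[i]:
            pop[i], fit[i] = trial, f_trial

print("best fitness:", fit.min(), "Q-values per strategy:", Q)
```

The binary improvement reward is one simple choice; a reward proportional to the fitness gain would also fit the abstract's description of evaluating each action by a reward while the Q-values accumulate long-run usefulness of each strategy.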