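Operant conditioning
Punishment (psychology)
Reinforcement
Conditioning
Reinforcement learning
Biological neural network
Artificial neural network
Psychology
Process (computing)
Computer science
Neuroscience
Artificial intelligence
Developmental psychology
Social psychology
Mathematics
Operating system
Statistics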
Authors
Bei Chen,Fazhan Liu,Herbert Ho‐Ching Iu,Han Bao,Quan Xu
Source
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs
[Institute of Electrical and Electronics Engineers]
Date: 2023-10-05
Volume (issue): 71 (3), pages 1002-1006
Citations: 3
Identifier
DOI:10.1109/tcsii.2023.3322168
Abstract
Operant conditioning is an essential learning mechanism for organisms and a fundamental theory for reinforcement learning in artificial intelligence. This paper proposes a neural network circuit based on non-volatile memristors that mimics the process of operant conditioning, such as the effects of reinforcement (positive reward or negative punishment) on the acquisition and maintenance of certain behaviors. This circuit is composed of two components: a reward operant conditioning circuit and a punishment operant conditioning circuit. These reward and punishment operant conditioning circuits not only simulate the process of exploration, acquisition, and satiety, but also reveal the effect of reward delay and punishment intensity on the acquisition of operant conditioning. This research holds the potential for practical application in training robots to make decisions. By adjusting reward delay and punishment intensity, the learning speed and effectiveness of robots can be enhanced.
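The qualitative claim in the abstract, that reward delay and punishment intensity change how quickly a behavior is acquired or suppressed, can be illustrated with a toy behavioral model. This is only a sketch and is not the memristor circuit from the paper: the function names, the `learning_rate` and `delay_decay` parameters, and the exponential weakening of a delayed reward are all assumptions made for illustration, and satiety is not modeled.

```python
import math


def reward_conditioning(trials, reward_delay, learning_rate=0.3, delay_decay=0.5):
    """Toy model (hypothetical): behavior strength grows toward 1 with each reward.

    Assumption: a reward delivered after `reward_delay` time units is weakened
    by exp(-delay_decay * reward_delay) before it updates the behavior.
    """
    strength = 0.0  # the behavior has not been acquired yet
    history = []
    for _ in range(trials):
        effective_reward = math.exp(-delay_decay * reward_delay)
        # Move a fraction of the remaining distance toward full acquisition.
        strength += learning_rate * effective_reward * (1.0 - strength)
        history.append(strength)
    return history


def punishment_conditioning(trials, punishment_intensity, learning_rate=0.3):
    """Toy model (hypothetical): behavior strength shrinks toward 0 when punished.

    Assumption: the per-trial suppression is proportional to the punishment
    intensity and capped at 1 so the strength never becomes negative.
    """
    strength = 1.0  # the behavior starts fully expressed
    history = []
    for _ in range(trials):
        step = min(1.0, learning_rate * punishment_intensity)
        strength -= step * strength
        history.append(strength)
    return history


if __name__ == "__main__":
    for delay in (0.0, 2.0):
        final = reward_conditioning(10, delay)[-1]
        print(f"reward delay {delay}: strength after 10 trials = {final:.3f}")
    for intensity in (0.5, 2.0):
        final = punishment_conditioning(10, intensity)[-1]
        print(f"punishment intensity {intensity}: strength after 10 trials = {final:.3f}")
```

In this sketch a longer reward delay slows acquisition and a stronger punishment speeds suppression, which is the direction of effect the abstract describes; the specific numbers carry no meaning beyond that.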