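Reinforcement learning
Computer science
String (physics)
Theory (learning stability)
Artificial intelligence
Machine learning
Mathematics
Mathematical physics
Authors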
Hampus Gummesson Svensson, Christian Tyrchan, Ola Engkvist, Morteza Haghir Chehreghani
Source
Journal: Cornell University - arXiv
Date: 2023-03-30
Citations: 1
Identifier
DOI: 10.48550/arxiv.2303.17615
Abstract
Deep learning-based approaches for generating novel drug molecules with specific properties have gained a lot of interest in the last few years. Recent studies have demonstrated promising performance for string-based generation of novel molecules utilizing reinforcement learning. In this paper, we develop a unified framework for using reinforcement learning for de novo drug design, wherein we systematically study various on- and off-policy reinforcement learning algorithms and replay buffers to learn an RNN-based policy to generate novel molecules predicted to be active against the dopamine receptor DRD2. Our findings suggest that it is advantageous to use at least both top-scoring and low-scoring molecules for updating the policy when structural diversity is essential. Using all generated molecules at an iteration seems to enhance performance stability for on-policy algorithms. In addition, when replaying high, intermediate, and low-scoring molecules, off-policy algorithms display the potential of improving the structural diversity and number of active molecules generated, but possibly at the cost of a longer exploration phase. Our work provides an open-source framework enabling researchers to investigate various reinforcement learning methods for de novo drug design.
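The abstract contrasts updating the policy with only top-scoring molecules, with both top- and low-scoring molecules, with high, intermediate, and low scorers, or with every molecule generated in an iteration. A minimal sketch of that selection step might look as follows. It is not taken from the authors' framework: the function name `select_for_update`, the mode names, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (SMILES string, predicted activity score); scores here are made up.
Scored = Tuple[str, float]


def select_for_update(batch: List[Scored], k: int, mode: str = "top_low") -> List[Scored]:
    """Choose which scored molecules are passed to the policy update.

    Hypothetical helper: `mode` selects one of the strategies named in the
    abstract, and `k` is the number of molecules taken from each score band.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Rank from highest to lowest score.
    ranked = sorted(batch, key=lambda item: item[1], reverse=True)
    n = len(ranked)

    if mode == "all":
        return ranked
    if mode == "top":
        return ranked[:k]
    if mode == "top_low":
        # Fall back to the full batch if the bands would overlap.
        if 2 * k >= n:
            return ranked
        return ranked[:k] + ranked[-k:]
    if mode == "top_mid_low":
        if 3 * k >= n:
            return ranked
        mid = (n - k) // 2  # start of the middle band
        return ranked[:k] + ranked[mid:mid + k] + ranked[-k:]
    raise ValueError(f"unknown mode: {mode}")


# Toy batch with invented scores.
batch = [
    ("CCO", 0.9),
    ("c1ccccc1", 0.1),
    ("CC(=O)O", 0.5),
    ("CCN", 0.7),
    ("C", 0.3),
    ("CCC", 0.2),
]

top_low = select_for_update(batch, k=1, mode="top_low")
top_mid_low = select_for_update(batch, k=1, mode="top_mid_low")
```

In a full training loop, the selected molecules would be stored in a replay buffer or used directly in the loss of the chosen on- or off-policy algorithm; that part is omitted here because the abstract does not specify it.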