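Benchmark (surveying)
Benchmarking
Generative grammar
Context (archaeology)
Computer science
Set (abstract data type)
Similarity (geometry)
Construct (python library)
Artificial intelligence
Machine learning
Generative model
Computational biology
Biology
Programming language
Paleontology
Geodesy
Marketing
Business
Image (mathematics)
Geography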
Authors
Gaoqi Weng, Huifeng Zhao, Dou Nie, Haotian Zhang, Liwei Liu, Tingjun Hou, Yu Kang
Identifier
DOI:10.1021/acs.jmedchem.3c02051
Abstract
Deep learning-based molecular generative models have attracted growing attention for their capability to generate molecules with novel structures and desired physicochemical properties. However, the evaluation of these models, particularly in a biological context, remains insufficient. To address the limitations of existing metrics and emulate practical application scenarios, we construct the RediscMol benchmark, which comprises active molecules extracted from 5 kinase and 3 GPCR data sets. A set of rediscovery- and similarity-related metrics is introduced to assess the performance of 8 representative generative models (CharRNN, VAE, Reinvent, AAE, ORGAN, RNNAttn, TransVAE, and GraphAF). Our findings based on the RediscMol benchmark differ from those of previous evaluations. CharRNN, VAE, and Reinvent exhibit a greater ability to reproduce known active molecules, while RNNAttn, TransVAE, and GraphAF struggle in this respect despite their notable performance on commonly used distribution-learning metrics. Our evaluation framework may provide valuable guidance for advancing generative models in real-world drug design scenarios.
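The abstract does not define its rediscovery- and similarity-related metrics, so the following is only a minimal sketch of what such metrics could look like, not the RediscMol implementation. It assumes molecules are compared as already-canonicalized strings (for example canonical SMILES) and that fingerprints are available as plain sets of feature indices; the function names `rediscovery_rate`, `tanimoto`, and `mean_nearest_similarity` are hypothetical and chosen for illustration.

```python
from typing import Iterable, List, Set


def rediscovery_rate(generated: Iterable[str], held_out_actives: Iterable[str]) -> float:
    """Fraction of held-out active molecules that appear among the generated ones.

    Assumption: both inputs are canonicalized identifiers, so exact string
    equality means "same molecule".
    """
    generated_set = set(generated)
    actives = set(held_out_actives)
    if not actives:
        return 0.0
    return len(actives & generated_set) / len(actives)


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mean_nearest_similarity(generated_fps: List[Set[int]], active_fps: List[Set[int]]) -> float:
    """Average, over known actives, of the similarity to the closest generated molecule."""
    if not generated_fps or not active_fps:
        return 0.0
    nearest = [max(tanimoto(active, gen) for gen in generated_fps) for active in active_fps]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    # Toy data only; these are not molecules or results from the benchmark.
    generated = ["CCO", "c1ccccc1", "CCN"]
    actives = ["CCO", "CCC", "CCN", "CC(=O)O"]
    print(rediscovery_rate(generated, actives))

    generated_fps = [{1, 2, 3}, {4, 5}]
    active_fps = [{1, 2, 3}, {4, 5, 6, 7}]
    print(mean_nearest_similarity(generated_fps, active_fps))
```

In this sketch the rediscovery score rewards only exact reproduction of a held-out active, while the similarity score gives partial credit for near misses; in practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.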