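Computer science
Retrospective analysis
Metric (unit)
Pairwise comparison
Premise
Synthetic data
Focus (optics)
Prioritization
Constraint (computer-aided design)
Artificial intelligence
Theoretical computer science
Machine learning
Chemistry
Mathematics
Management science
Engineering
Linguistics
Total synthesis
Operations management
Philosophy
Physics
Organic chemistry
Optics
Geometry
Authors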
Connor W. Coley,Luke Rogers,William H. Green,Klavs F. Jensen
Identifier
DOI:10.1021/acs.jcim.7b00622
Abstract
Several definitions of molecular complexity exist to facilitate prioritization of lead compounds, to identify diversity-inducing and complexifying reactions, and to guide retrosynthetic searches. In this work, we focus on synthetic complexity and reformalize its definition to correlate with the expected number of reaction steps required to produce a target molecule, with implicit knowledge about what compounds are reasonable starting materials. We train a neural network model on 12 million reactions from the Reaxys database to impose a pairwise inequality constraint enforcing the premise of this definition: that on average, the products of published chemical reactions should be more synthetically complex than their corresponding reactants. The learned metric (SCScore) exhibits highly desirable nonlinear behavior, particularly in recognizing increases in synthetic complexity throughout a number of linear synthetic routes.
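The pairwise inequality constraint described in the abstract can be illustrated with a small sketch. The abstract does not state the loss function, so the hinge form, the `margin` value, the function names, and the example scores below are all assumptions made for illustration, not the authors' implementation. The idea shown is only that a (reactant, product) pair incurs a penalty when the product's score fails to exceed the reactant's score by some margin.

```python
# Minimal sketch of a pairwise inequality penalty (assumed hinge form).
# All names, the margin, and the scores are hypothetical.

def pairwise_hinge_loss(reactant_score, product_score, margin=0.25):
    """Return 0 when the product outscores the reactant by at least
    `margin`; otherwise return the size of the shortfall."""
    return max(0.0, margin - (product_score - reactant_score))


def mean_pairwise_loss(pairs, margin=0.25):
    """Average the penalty over (reactant_score, product_score) pairs."""
    losses = [pairwise_hinge_loss(r, p, margin) for r, p in pairs]
    return sum(losses) / len(losses)


# Hypothetical model outputs for three reactions.
pairs = [
    (1.5, 2.5),  # product clearly more complex: no penalty
    (2.0, 2.1),  # product only slightly higher: small penalty
    (3.0, 2.8),  # product scored lower than reactant: larger penalty
]
loss = mean_pairwise_loss(pairs)
print(f"mean pairwise loss: {loss:.3f}")
```

Because the penalty is averaged over many reactions, individual pairs may violate the inequality, which matches the abstract's "on average" wording.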