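Perspective (graphical)
Chemical space
Representation (politics)
Feature (linguistics)
Space (punctuation)
Computer science
Artificial intelligence
Bridge (graph theory)
Data science
Cheminformatics
Machine learning
Biochemical engineering
Chemistry
Drug discovery
Engineering
Computational chemistry
Philosophy
Medicine
Internal medicine
Operating system
Law
Political science
Politics
Linguistics
Biochemistry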
Authors
Yuheng Ding, Bo Qiang, Qixuan Chen, Yiqiao Liu, Liangren Zhang, Zhenming Liu
Identifier
DOI: 10.1021/acs.jcim.4c00004
Abstract
Chemical reactions serve as foundational building blocks for organic chemistry and drug design. In the era of large AI models, data-driven approaches have emerged to innovate the design of novel reactions, optimize existing ones for higher yields, and discover new pathways for synthesizing chemical structures comprehensively. To effectively address these challenges with machine learning models, it is imperative to derive robust and informative representations or engage in feature engineering using extensive data sets of reactions. This work aims to provide a comprehensive review of established reaction featurization approaches, offering insights into the selection of representations and the design of features for a wide array of tasks. The advantages and limitations of employing SMILES, molecular fingerprints, molecular graphs, and physics-based properties are meticulously elaborated. Solutions to bridge the gap between different representations will also be critically evaluated. Additionally, we introduce a new frontier in chemical reaction pretraining, holding promise as an innovative yet unexplored avenue.