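A priori and a posteriori
Computer science
Artificial intelligence
Chemical space
Machine learning
Applicability domain
Domain (mathematical analysis)
Class (philosophy)
Focus (optics)
Baseline (sea)
Space (punctuation)
Data mining
Synthetic data
Chemical process
Reliability (semiconductor)
Domain knowledge
Multivariable calculus
Experimental data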
Authors
Zhenzhi Tan, Qi Yang, Li Zhang, Sanzhong Luo
Identifier
DOI:10.1002/anie.202523874
Abstract
Organic synthetic chemistry has undergone a paradigm shift driven by breakthroughs in artificial intelligence (AI). Data‐driven methods help accelerate hypothesis evaluation and reduce experimental trial‐and‐error efforts. However, their practical utility is constrained by the out‐of‐distribution (OOD) issue, where predictions usually fail when extrapolating to unseen reactions with new catalysts, substrates, or conditions. Here, we introduce SynAD (synthetic applicability domain), a machine learning framework for assessing the predictive capability of AI models trained with existing data. SynAD combines descriptors with model‐adaptive distance metrics to automatically demarcate reliable and unreliable reactions. Validated on the Ullmann Ligand Dataset (ULD, >5000 reactions), SynAD a priori distinguishes predictable chemical space, resulting in a prediction accuracy of R² = 0.90 (at 12.3% coverage) from a baseline of R² = −0.21. This capacity to target reliable chemical space is consistently observed across 6 additional datasets. We also enable a SynAD score to quantify reaction class predictability, guiding experimental focus on OOD spaces. By defining model limits, SynAD provides a critical guardrail for chemists to trust AI, allocate resources strategically, and accelerate de novo discovery.
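The abstract does not spell out how SynAD computes its distances or thresholds, so the following is only a generic illustration of a distance-based applicability domain, not the authors' method. It assumes reactions are already encoded as numeric descriptor vectors, uses a plain Euclidean k-nearest-neighbor distance in place of the paper's model-adaptive metric, and sets the cutoff at a percentile of the training set's own neighbor distances. All function names and parameters (`knn_distance`, `in_domain_mask`, `k`, `percentile`) are hypothetical.

```python
import numpy as np

def knn_distance(train, query, k=3, exclude_self=False):
    """Mean Euclidean distance from each query row to its k nearest training rows."""
    # Pairwise distance matrix of shape (n_query, n_train)
    d = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    d.sort(axis=1)
    # When query is the training set itself, skip the zero self-distance
    start = 1 if exclude_self else 0
    return d[:, start:start + k].mean(axis=1)

def in_domain_mask(train, query, k=3, percentile=95.0):
    """Flag query rows no farther from the training data than training rows are from each other."""
    # Assumption: the cutoff is a percentile of leave-one-out training distances
    train_d = knn_distance(train, train, k, exclude_self=True)
    threshold = np.percentile(train_d, percentile)
    return knn_distance(train, query, k) <= threshold

def r2_score(y_true, y_pred):
    """Coefficient of determination; negative when worse than predicting the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def coverage_and_r2(mask, y_true, y_pred):
    """Fraction of test reactions kept as in-domain, and R² on that subset only."""
    return mask.mean(), r2_score(y_true[mask], y_pred[mask])
```

Reporting R² together with coverage, as the abstract does, makes the trade-off explicit: a tighter cutoff keeps fewer reactions but should leave those on which the model is more trustworthy.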