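Sharpless asymmetric dihydroxylation
Dihydroxylation
Computer science
Data mining
Chemistry
Enantioselective synthesis
Computational biology
Artificial intelligence
Organic chemistry
Biology
Catalysis
Authors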
Blake E. Ocampo,Bilal Altundas,M.J. Bock,Sara Feiz,Scott E. Denmark
Identifier
DOI:10.26434/chemrxiv-2025-zp7rn
Abstract
The Sharpless asymmetric dihydroxylation remains a key transformation in chemical synthesis, yet its success hides unexpected cases of lower selectivity. A chemoinformatic workflow was developed to allow data-driven analysis of the reaction. A database of 1007 reactions employing AD-mix α and β was curated from the literature, and an alignment-dependent, fragment-based featurization of alkenes was implemented for modeling. This platform converged on machine learning models capable of predicting the magnitude of enantioselectivity for multiple alkene classes, achieving Q²F3 values ≥ 0.8, test r² values ≥ 0.7 and mean absolute errors (MAE) ≤ 0.3 kcal/mol. The features of alkenes contributing to model performance were assessed with SHapley Additive exPlanations (SHAP) analysis to gather insight into factors underlying predictions. Experimental validation demonstrated that the models could achieve meaningful predictions on numerous out-of-sample alkenes.
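The abstract reports errors in kcal/mol, which suggests that enantioselectivity was modeled as a free-energy difference rather than as raw ee. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. It assumes the standard relation ΔΔG‡ = RT ln(er), with er = (1 + ee)/(1 − ee), a reaction temperature of 0 °C (an assumption, not stated in the abstract), and the usual external-validation definition of Q²F3, in which the test-set mean squared error is scaled by the training-set variance. All function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ee_to_ddg(ee, temperature_k=273.15):
    """Convert a fractional ee (0 <= ee < 1) to ddG in kcal/mol.

    The default temperature of 273.15 K is an assumption for illustration.
    """
    if not 0.0 <= ee < 1.0:
        raise ValueError("ee must be a fraction in [0, 1)")
    er = (1.0 + ee) / (1.0 - ee)  # enantiomeric ratio, major/minor
    return R_KCAL * temperature_k * math.log(er)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def q2_f3(y_train, y_test, y_pred):
    """Q2_F3: 1 - (test-set MSE) / (training-set variance about its mean)."""
    train_mean = sum(y_train) / len(y_train)
    press = sum((t - p) ** 2 for t, p in zip(y_test, y_pred)) / len(y_test)
    tss = sum((t - train_mean) ** 2 for t in y_train) / len(y_train)
    return 1.0 - press / tss


if __name__ == "__main__":
    # Made-up ee values for illustration only; not data from the paper.
    train = [ee_to_ddg(e) for e in (0.50, 0.80, 0.95, 0.99)]
    test = [ee_to_ddg(e) for e in (0.70, 0.90)]
    pred = [ee_to_ddg(e) for e in (0.75, 0.88)]
    print("MAE (kcal/mol):", round(mean_absolute_error(test, pred), 3))
    print("Q2_F3:", round(q2_f3(train, test, pred), 3))
```

Because ln(er) grows steeply as ee approaches 1, a fixed error in kcal/mol corresponds to a much smaller spread in ee at high selectivity than at low selectivity, which is one reason free-energy units are a convenient modeling target.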