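Machine learning
Computer science
Kernel (algebra)
Artificial intelligence
Chemical space
Curse of dimensionality
Representation (politics)
Property (philosophy)
Dimensionality reduction
Chemistry
Mathematics
Combinatorics
Biochemistry
Philosophy
Epistemology
Politics
Political science
Law
Drug discovery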
Authors
Clemens Rauer, Tristan Bereau
Abstract
We consider the prediction of a basic thermodynamic property, hydration free energies, across a large subset of the chemical space of small organic molecules. Our in silico study is based on computer simulations at the atomistic level with implicit solvent. We report on a kernel-based machine learning approach that is inspired by recent work in learning electronic properties but differs in key aspects: the representation is averaged over several conformers to account for the statistical ensemble. We also include an atomic-decomposition ansatz, which offers significant added transferability compared to molecular learning. Finally, we explore the existence of severe biases from databases of experimental compounds. By performing a combination of dimensionality reduction and cross-learning models, we show that the rate of learning depends significantly on the breadth and variety of the training dataset. Our study highlights the dangers of fitting machine-learning models to databases of a narrow chemical range.
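The idea of a conformer-averaged representation fed into a kernel model can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor vectors, the Gaussian kernel, the kernel ridge regression solver, the function names, and the hyperparameters `sigma` and `lam` are all assumptions chosen for illustration, and the data are synthetic placeholders rather than hydration free energies.

```python
import numpy as np

def conformer_averaged_representation(conformer_descriptors):
    """Average per-conformer descriptor vectors (n_conformers x n_features).

    Hypothetical helper: the real descriptor used in the paper is not
    specified here; any fixed-length vector per conformer would do.
    """
    return np.mean(np.asarray(conformer_descriptors, dtype=float), axis=0)

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_krr(X, y, sigma=1.0, lam=1e-6):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_krr(X_train, alpha, X_new, sigma=1.0):
    """Predict targets for X_new from the fitted weights alpha."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Synthetic stand-in data: 20 "molecules", 5 conformers each, 4 features.
rng = np.random.default_rng(0)
molecules = [rng.normal(size=(5, 4)) for _ in range(20)]
X = np.stack([conformer_averaged_representation(m) for m in molecules])
y = X.sum(axis=1)  # placeholder target, NOT real hydration free energies

alpha = fit_krr(X, y)
y_fit = predict_krr(X, alpha, X)
```

Averaging the descriptors before the kernel is evaluated is only one way to account for an ensemble; averaging kernel values over conformer pairs would be another, and the abstract does not say which variant the authors prefer beyond stating that the representation is averaged.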