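Computer science
Graph
Ensemble learning
Set (abstract data type)
Aggregate (composite)
Machine learning
Artificial intelligence
Ensemble forecasting
Data mining
Training set
Data set
Artificial neural network
Regression
Scheme (mathematics)
Pattern recognition (psychology)
Molecular descriptor
Metamodeling
Molecular graph
Multiple models
Experimental data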
Authors
M. L. Parker, Samar Mahmoud, Bailey C. Montefiore, Mario Öeren, Himani Tandon, Charlotte E. Wharrick, Matthew Segall
Identifier
DOI: 10.1021/acs.jcim.5c01844
Abstract
We explore a "best-of-both" approach to modeling molecular properties by combining learned molecular descriptors from a graph neural network (GNN) with general-purpose descriptors and a mixed ensemble of machine learning (ML) models. We introduce a MetaModel framework to aggregate predictions from a diverse set of leading ML models. We present a featurization scheme for combining task-specific GNN-derived features with conventional molecular descriptors. We demonstrate that our framework outperforms the cutting-edge ChemProp model on all regression data sets tested and 6 of 9 classification data sets. We further show that including the GNN features derived from ChemProp boosts the ensemble model's performance on several data sets where it otherwise would have underperformed. We conclude that to achieve optimal performance across a wide set of problems, it is vital to combine general-purpose descriptors with task-specific learned features and to use a diverse set of ML models to make the predictions.
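
The two ideas in the abstract, concatenating learned features with general-purpose descriptors and aggregating predictions from different model types, can be illustrated with a minimal sketch. This is not the authors' MetaModel or featurization scheme: the random arrays stand in for GNN-derived features and conventional descriptors, the two toy regressors (ridge and k-nearest neighbors) stand in for the "diverse set of ML models", and the unweighted mean is an assumed aggregation rule. All function names are hypothetical.

```python
import numpy as np

def combine_features(learned, general):
    """Concatenate per-molecule feature blocks column-wise."""
    return np.hstack([learned, general])

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression (bias is also penalized, for brevity)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ y)
    return lambda Xn: np.hstack([Xn, np.ones((Xn.shape[0], 1))]) @ w

def fit_knn(X, y, k=3):
    """k-nearest-neighbor regressor: mean target of the k closest training rows."""
    def predict(Xn):
        dist = np.linalg.norm(Xn[:, None, :] - X[None, :, :], axis=2)
        nearest = np.argsort(dist, axis=1)[:, :k]
        return y[nearest].mean(axis=1)
    return predict

def aggregate(models, X):
    """Unweighted mean of member predictions (an assumed aggregation rule)."""
    return np.mean([m(X) for m in models], axis=0)

rng = np.random.default_rng(0)
learned = rng.normal(size=(40, 4))   # stand-in for task-specific GNN features
general = rng.normal(size=(40, 6))   # stand-in for conventional descriptors
X = combine_features(learned, general)
# Synthetic target that depends on one column from each feature block.
y = X[:, 0] - 0.5 * X[:, 5] + rng.normal(scale=0.1, size=40)

# Train two different model types on the first 30 rows, predict the last 10.
models = [fit_ridge(X[:30], y[:30]), fit_knn(X[:30], y[:30])]
pred = aggregate(models, X[30:])
```

In practice the aggregation step could weight members by validation performance or feed their outputs to a second-level model; the plain mean is used here only to keep the sketch short.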