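Cheminformatics
Computer science
Chemical space
Leverage (statistics)
Molecular graph
Artificial intelligence
Machine learning
Graph
Substructure
Property (philosophy)
Theoretical computer science
Natural language processing
Chemistry
Drug discovery
Computational chemistry
Engineering
Philosophy
Epistemology
Structural engineering
Biochemistry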
Authors
Evgeniia E. Ondar,Ilya Makarov
Identifier
DOI:10.1007/978-3-031-43078-7_27
Abstract
In molecular property prediction tasks, graph neural networks have become a widely used tool. Recently, self-supervised learning frameworks, especially contrastive learning, have attracted growing attention for their potential to learn molecular representations that generalize across the meaningful chemical space. Unlike supervised learning, self-supervised learning can directly leverage extensive unlabeled data, which significantly reduces the effort needed to acquire molecular property labels through costly and time-consuming simulations or experiments. However, most of these frameworks do not take into account cheminformatics-specific descriptors (e.g., molecular fingerprints) or multi-level molecular graph structures (e.g., functional groups). In toxicity prediction tasks, molecular substructure can be crucial. Structural alerts (e.g., toxicophores) are well studied and have been shown to be responsible for different types of toxicity. In this work, we propose chemistry-wise augmentations for a contrastive learning framework. Two augmentations were implemented: (1) toxicophore subgraph removal, and (2) toxicophore subgraph saving. This approach does not violate chemical principles while pushing the model to learn the toxicity-dependent parts of a molecule. Experiments showed that the novel augmentations are more efficient than the random subgraph masking approach usually used in molecular contrastive learning. A performance comparison with other GNN-based frameworks is carried out as well.
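The two augmentations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the sketch assumes that "removal" deletes the toxicophore atoms and their bonds, and that "saving" applies random atom masking to the rest of the molecule while leaving the toxicophore intact. The function names, the `mask_ratio` parameter, the adjacency-dictionary representation, and the toy nitrobenzene graph (with the nitro group treated as the toxicophore) are all hypothetical choices made for illustration.

```python
import random


def remove_toxicophore(adjacency, toxicophore_atoms):
    """Assumed reading of augmentation (1): drop toxicophore atoms and their bonds."""
    drop = set(toxicophore_atoms)
    return {
        atom: sorted(n for n in neighbours if n not in drop)
        for atom, neighbours in adjacency.items()
        if atom not in drop
    }


def save_toxicophore(adjacency, toxicophore_atoms, mask_ratio=0.25, seed=None):
    """Assumed reading of augmentation (2): mask random atoms, never toxicophore atoms."""
    rng = random.Random(seed)
    protected = set(toxicophore_atoms)
    # Only atoms outside the toxicophore are eligible for masking.
    candidates = sorted(a for a in adjacency if a not in protected)
    n_mask = min(len(candidates), round(mask_ratio * len(adjacency)))
    masked = set(rng.sample(candidates, n_mask))
    return {
        atom: sorted(n for n in neighbours if n not in masked)
        for atom, neighbours in adjacency.items()
        if atom not in masked
    }


# Toy heavy-atom graph of nitrobenzene (hypothetical indexing):
# atoms 0-5 are ring carbons, 6 is N, 7 and 8 are O.
nitrobenzene = {
    0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [0, 4],
    6: [0, 7, 8], 7: [6], 8: [6],
}
nitro_group = {6, 7, 8}  # treated here as the toxicophore subgraph

view_removed = remove_toxicophore(nitrobenzene, nitro_group)
view_saved = save_toxicophore(nitrobenzene, nitro_group, mask_ratio=0.25, seed=0)
```

In a contrastive setup, views such as `view_removed` and `view_saved` would be encoded by the GNN and compared in the loss. In practice the toxicophore atom indices would come from substructure matching against a list of structural alerts rather than being hard-coded.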