Keywords: Artificial intelligence; Generalization; Machine learning; Deep learning; Computer science; Labeled data; Transformer; Deep neural network; Training set; Artificial neural network; Proportion (ratio); Task (project management); Property (philosophy); Epistemology; Physics; Mathematical analysis; Philosophy; Quantum mechanics; Economics; Voltage; Management; Mathematics
Authors
Sheng Wang, Yuzhi Guo, Yuhong Wang, Hongmao Sun, Junzhou Huang
Source
Venue: International Conference on Bioinformatics
Date: 2019-09-04
Pages: 429-436
Citations: 250
Identifier
DOI: 10.1145/3307339.3342186
Abstract
With the rapid progress of AI in both academia and industry, deep learning has been widely introduced into various areas of drug discovery to accelerate its pace and cut R&D costs. Among the problems in drug discovery, molecular property prediction is one of the most important. Unlike general deep learning applications, molecular property prediction has only limited labeled data. To better solve this problem, deep learning methods have started focusing on how to exploit the vast amount of unlabeled data to improve prediction performance on small-scale labeled data. In this paper, we propose a semi-supervised model named SMILES-BERT, which consists of attention-mechanism-based Transformer layers. The model is pre-trained on a large-scale unlabeled dataset through a Masked SMILES Recovery task; the pre-trained model can then be generalized to different molecular property prediction tasks via fine-tuning. In the experiments, the proposed SMILES-BERT outperforms state-of-the-art methods on all three datasets, demonstrating the effectiveness of the unsupervised pre-training and the strong generalization capability of the pre-trained model.
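The core of the method is BERT-style pre-training on SMILES strings: a fraction of the tokens is masked and a Transformer encoder learns to recover them. Below is a minimal sketch of one such Masked SMILES Recovery step in PyTorch; the character-level vocabulary, layer sizes, masking rate, and names such as SmilesEncoder and encode are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of Masked SMILES Recovery pre-training (illustrative;
# vocabulary, model sizes, and masking rate are assumptions, not the
# configuration used in the SMILES-BERT paper).
import torch
import torch.nn as nn

# Tiny character-level vocabulary over a few common SMILES symbols.
VOCAB = ["<pad>", "<mask>", "C", "c", "N", "O", "(", ")", "=", "1", "2"]
STOI = {s: i for i, s in enumerate(VOCAB)}
PAD, MASK = STOI["<pad>"], STOI["<mask>"]

def encode(smiles: str, max_len: int = 32) -> torch.Tensor:
    """Map a SMILES string to a fixed-length tensor of token ids."""
    ids = [STOI[ch] for ch in smiles][:max_len]
    ids += [PAD] * (max_len - len(ids))
    return torch.tensor(ids)

class SmilesEncoder(nn.Module):
    """Transformer encoder with a token-recovery head (hypothetical sketch)."""
    def __init__(self, vocab_size: int, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=PAD)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)  # recovers masked tokens

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        h = self.encoder(self.embed(ids), src_key_padding_mask=ids.eq(PAD))
        return self.lm_head(h)  # (batch, seq, vocab) logits

# One pre-training step: mask ~15% of non-padding tokens, then recover them.
model = SmilesEncoder(len(VOCAB))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.stack([encode("CC(=O)Oc1ccccc1"), encode("c1ccccc1O")])
mask = (torch.rand(batch.shape) < 0.15) & batch.ne(PAD)
inputs = batch.masked_fill(mask, MASK)
logits = model(inputs)
# Cross-entropy is computed only on the masked positions.
loss = nn.functional.cross_entropy(logits[mask], batch[mask])
opt.zero_grad()
loss.backward()
opt.step()
```

For the fine-tuning stage described in the abstract, the recovery head would be replaced by a task-specific prediction head and the whole network trained further on the small labeled property dataset.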