Computer science
Softmax function
Sentence
Graph
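Artificial intelligence
Natural language processing
Relation extraction
Information extraction
Deep learning
Theoretical computer science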
Authors
Lei Qin,Gaocai Dong,Jing Peng
Identifier
DOI:10.1109/bibm49941.2020.9313234
Abstract
Automated recognition of functional interactions between compounds and proteins/genes in the biomedical literature is essential for drug discovery, knowledge understanding, and basic clinical research. Although several computational methods have achieved competitive performance in extracting these relations, there is significant room for improvement in fully capturing the complex semantic and syntactic information within sentences. We present a novel parallel model to improve chemical-protein interaction (CPI) extraction. Specifically, the model consists of two parallel components: ChemicalBERT and Attention Guided Graph Convolutional Networks (AGGCN). We pre-train BERT on large-scale chemical interaction corpora, naming the result ChemicalBERT, to generate high-quality contextual representations, and employ AGGCN to capture the syntactic graph information of the sentence. Finally, the contextual representation and the syntactic graph representation are merged in a fusion layer and then fed into a fully-connected softmax layer to extract CPIs. We evaluate the proposed model on the ChemProt corpus, the benchmark corpus of this domain, and achieve state-of-the-art results for CPI extraction with a micro-averaged F1-score of 80.21%. To further demonstrate the efficacy of the proposed model, we also conducted experiments on the DDIExtraction 2013 corpus and obtained a micro-averaged F1-score of 82.88%, which is also the highest score among existing models. Experimental results show that the proposed model can adequately capture semantic and syntactic information by extracting sentence features from different views in parallel. The code is available at https://github.com/ql-bio/CPR extraction.
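The final stage described in the abstract (two parallel representations merged in a fusion layer, then passed to a fully-connected softmax layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the fusion is performed, so concatenation is assumed here, and the function names (`fuse`, `linear`, `classify`), the vector sizes, and the random weights are all hypothetical. The two input vectors stand in for the outputs of ChemicalBERT and AGGCN.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(contextual, syntactic):
    """Fusion by concatenation (an assumption; the abstract does not name the operator)."""
    return list(contextual) + list(syntactic)


def linear(x, weights, bias):
    """Fully connected layer: one weight row and one bias term per output class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def classify(contextual, syntactic, weights, bias):
    """Fuse the two views, apply the linear layer, and return class probabilities."""
    return softmax(linear(fuse(contextual, syntactic), weights, bias))


if __name__ == "__main__":
    rng = random.Random(0)
    ctx_dim, syn_dim, n_classes = 4, 3, 6  # toy sizes, chosen only for illustration

    # Stand-ins for the contextual (BERT-style) and syntactic (graph) sentence vectors.
    contextual = [rng.uniform(-1, 1) for _ in range(ctx_dim)]
    syntactic = [rng.uniform(-1, 1) for _ in range(syn_dim)]

    # Untrained random parameters; a real model would learn these.
    weights = [[rng.uniform(-1, 1) for _ in range(ctx_dim + syn_dim)]
               for _ in range(n_classes)]
    bias = [0.0] * n_classes

    probs = classify(contextual, syntactic, weights, bias)
    predicted = max(range(n_classes), key=probs.__getitem__)
    print("class probabilities:", probs)
    print("predicted class index:", predicted)
```

Concatenation keeps both views intact and lets the classifier weight them independently; other fusion choices (summation, gating) would change only the `fuse` function in this sketch.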