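Bioinformatics
Computer science
Convolutional neural network
Transferability
Deep learning
Computational biology
Machine learning
Artificial intelligence
Source code
Protein sequencing
Domain (mathematical analysis)
Peptide sequence
Biology
Gene
Biochemistry
Programming language
Mathematical analysis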
Logit
Mathematics
Authors
Xiaoyong Pan, Jasper Zuallaert, Xi Wang, Hong‐Bin Shen, Elda Posada Campos, Denys Marushchak, Wesley De Neve
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2020-07-15
Volume (issue): 36 (21): 5159-5168
Citations: 50
Identifier
DOI:10.1093/bioinformatics/btaa656
Abstract
Motivation: Genetically engineering food crops involves introducing proteins from other species into crop plant species or modifying already existing proteins with gene editing techniques. In addition, newly synthesized proteins can be used as therapeutic protein drugs against diseases. For both research and safety regulation purposes, being able to assess the potential toxicity of newly introduced/synthesized proteins is of high importance.
Results: In this study, we present ToxDL, a deep learning-based approach for in silico prediction of protein toxicity from sequence alone. ToxDL consists of (i) a module encompassing a convolutional neural network that has been designed to handle variable-length input sequences, (ii) a domain2vec module for generating protein domain embeddings and (iii) an output module that classifies proteins as toxic or non-toxic, using the outputs of the two aforementioned modules. Independent test results obtained for animal proteins and cross-species transferability results obtained for bacteria proteins indicate that ToxDL outperforms traditional homology-based approaches and state-of-the-art machine-learning techniques. Furthermore, through visualizations based on saliency maps, we are able to verify that the proposed network learns known toxic motifs. Moreover, the saliency maps allow for directed in silico modification of a sequence, thus making it possible to alter its predicted protein toxicity.
Availability and implementation: ToxDL is freely available at http://www.csbio.sjtu.edu.cn/bioinf/ToxDL/. The source code can be found at https://github.com/xypan1232/ToxDL.
Supplementary information: Supplementary data are available at Bioinformatics online.
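The three-module layout described under Results can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the repository linked above). It assumes one convolutional layer with ReLU and global max pooling as the way to handle variable-length input, a fixed-length placeholder vector standing in for a domain2vec embedding, and a logistic output unit. All function names, sizes and weights are hypothetical, and the weights are random rather than trained, so the scores carry no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot rows."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        if residue in index:  # unknown residues stay all-zero
            row[index[residue]] = 1.0
        rows.append(row)
    return rows


def conv1d_global_max(encoded, filters, width):
    """Slide each filter along the sequence, apply ReLU, then global max pooling.

    The result has one value per filter whatever the sequence length
    (assumption: this stands in for the paper's variable-length handling).
    """
    # Zero-pad sequences shorter than the filter so one window always exists.
    padded = encoded + [[0.0] * len(AMINO_ACIDS)] * max(0, width - len(encoded))
    pooled = []
    for filt in filters:  # each filter holds width x 20 weights
        best = 0.0  # starting at 0 is equivalent to ReLU followed by max
        for start in range(len(padded) - width + 1):
            activation = sum(
                filt[k][c] * padded[start + k][c]
                for k in range(width)
                for c in range(len(AMINO_ACIDS))
            )
            best = max(best, activation)
        pooled.append(best)
    return pooled


def predict_toxicity(sequence, domain_vector, params):
    """Output module: concatenate both feature sets and apply a logistic unit."""
    seq_features = conv1d_global_max(
        one_hot(sequence), params["filters"], params["width"]
    )
    features = seq_features + list(domain_vector)
    logit = params["bias"] + sum(
        w * x for w, x in zip(params["weights"], features)
    )
    return 1.0 / (1.0 + math.exp(-logit))


def random_params(n_filters=4, width=3, domain_dim=5, seed=0):
    """Hypothetical untrained parameters, for illustration only."""
    rng = random.Random(seed)
    filters = [
        [[rng.uniform(-1, 1) for _ in AMINO_ACIDS] for _ in range(width)]
        for _ in range(n_filters)
    ]
    weights = [rng.uniform(-1, 1) for _ in range(n_filters + domain_dim)]
    return {"filters": filters, "width": width, "weights": weights, "bias": 0.0}


if __name__ == "__main__":
    params = random_params()
    domain_vec = [0.1, -0.2, 0.0, 0.3, 0.5]  # placeholder for a domain embedding
    for seq in ("MKT", "MKTLLVAGCCSLW"):
        print(seq, round(predict_toxicity(seq, domain_vec, params), 3))
```

Because pooling reduces every sequence to one value per filter, sequences of 3 and 13 residues pass through the same output layer without truncation or fixed-length padding.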