Phosphorylation
Artificial intelligence
Computer science
Sequence (biology)
Sequence labeling
Kinase
Authors
Lei Jiang,Duolin Wang,Dong Xu
Source
Journal: Methods in Molecular Biology
Date: 2022-01-01
Volume/Issue (pages): 105-124
Identifier
DOI: 10.1007/978-1-0716-2317-6_4
Abstract
Phosphorylation plays a vital role in signal transduction and the cell cycle. Identifying and understanding phosphorylation through machine-learning methods has a long history. However, existing methods learn representations of a protein sequence segment only from the labeled dataset itself, which can result in biased or incomplete features, especially for kinase-specific phosphorylation site prediction, where training data are typically sparse. To learn a comprehensive contextual representation of a protein sequence segment for kinase-specific phosphorylation site prediction, we pretrained our model on over 24 million unlabeled sequence fragments using ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately). The pretrained model was applied to kinase-specific site prediction for the kinases CDK, PKA, CK2, MAPK, and PKC. The pretrained ELECTRA model achieves a 9.02% improvement over BERT and an 11.10% improvement over MusiteDeep in the area under the precision-recall curve on the benchmark data.
Key words: Deep learning, Pretraining, Transformer, ELECTRA, Phosphorylation, Kinase-specific phosphorylation site prediction
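The abstract reports its results as the area under the precision-recall curve. As a rough illustration of that metric only, the sketch below computes the step-wise form of the area (often called average precision) in plain Python. The function name `average_precision`, the toy labels, and the toy scores are assumptions made for this example; they are not taken from the chapter, and the authors may have used a different estimator (for instance trapezoidal interpolation).

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for a true phosphorylation site, 0 otherwise.
    scores: predicted scores, higher meaning more likely positive.
    Assumption: tied scores are ranked in input order, not grouped.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Each positive raises recall by 1/total_pos; weight that step
            # by the precision observed at this rank.
            area += (true_pos / rank) / total_pos
    return area


# Hypothetical toy data, not results from the chapter.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_auprc = average_precision(toy_labels, toy_scores)
```

Because positive sites are rare in kinase-specific datasets, a precision-recall summary of this kind is less flattered by the large number of negatives than an accuracy figure would be.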