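Computer science
Bottleneck
Sequence (biology)
Heuristic
Protein sequencing
Computational biology
Artificial intelligence
Machine learning
Bioinformatics
Biology
Peptide sequence
Biochemistry
Genetics
Gene
Embedded system
Operating system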
Authors
Yu‐Yen Ou,Quang‐Thai Ho,Heng‐Ta Chang
Source
Journal: Proteomics
[Wiley]
Date: 2023-10-20
Volume (Issue): 23 (23-24)
Citations: 2
Identifier
DOI:10.1002/pmic.202200494
Abstract
Membrane proteins play a crucial role in various cellular processes and are essential components of cell membranes. Computational methods have emerged as a powerful tool for studying membrane proteins, whose complex structures and properties make them difficult to analyze experimentally. Traditional features for protein sequence analysis based on amino acid types, composition, and pair composition have limitations in capturing higher‐order sequence patterns. Recently, multiple sequence alignment (MSA) and pre‐trained language models (PLMs) have been used to generate features from protein sequences. However, the significant computational resources required for MSA‐based feature generation can be a major bottleneck for many applications. Several methods and tools have been developed to accelerate the generation of MSAs and reduce their computational cost, including heuristics and approximate algorithms. Additionally, PLMs such as BERT have shown great potential in generating informative embeddings for protein sequence analysis. In this review, we provide an overview of traditional and more recent methods for generating features from protein sequences, with a particular focus on MSAs and PLMs. We highlight the advantages and limitations of these approaches and discuss the methods and tools developed to address the computational challenges associated with feature generation. Overall, the advancements in computational methods and tools provide a promising avenue for gaining deeper insights into the function and properties of membrane proteins, which can have significant implications for drug discovery and personalized medicine.
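As an illustration of the traditional features the abstract mentions, the following is a minimal Python sketch of amino acid composition and pair (dipeptide) composition. It is not taken from the review: the function names, the 20-letter alphabet ordering, and the choice to skip non-standard residues are assumptions made here for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-dim vector of residue frequencies (hypothetical helper)."""
    counts = Counter(seq.upper())
    # Only standard residues contribute to the total; others are ignored.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_composition(seq):
    """Return a 400-dim vector of adjacent-pair frequencies (hypothetical helper)."""
    seq = seq.upper()
    valid = set(AMINO_ACIDS)
    # Overlapping adjacent pairs; pairs with a non-standard residue are skipped.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in valid and seq[i + 1] in valid
    ]
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    if not pairs:
        return [0.0] * len(keys)
    counts = Counter(pairs)
    return [counts[k] / len(pairs) for k in keys]
```

Both vectors have a fixed length regardless of sequence length and record only residue frequencies and adjacent-pair frequencies, which is consistent with the limitation the abstract notes: patterns spanning more than two positions are not represented.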