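Alignment-free sequence analysis
Similarity (geometry)
Sequence (biology)
Pairwise comparison
Sequence alignment
Shuffling
Smith-Waterman algorithm
Sequence database
Multiple sequence alignment
Computer science
Computational biology
Data mining
DNA sequencing
Protein sequencing
Bioinformatics
Genetics
DNA
Artificial intelligence
Biology
Peptide sequence
Gene
Image (mathematics)
Programming language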
Authors
William R. Pearson, David J. Lipman
Identifier
DOI:10.1073/pnas.85.8.2444
Abstract
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
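The shuffling-based significance test mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the RDF2 implementation: the scoring function is a plain Smith-Waterman-style local score with a linear gap penalty, and the names `local_score`, `window_shuffle`, `shuffled_z_score`, the window size, the trial count, and the match/mismatch/gap values are all assumptions chosen for illustration. The idea shown is that residues are shuffled only within short consecutive windows, so local composition is preserved, and the observed score is then compared with the distribution of scores against the shuffled sequences.

```python
import random
import statistics


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman style, linear gap penalty).

    The scoring values are illustrative assumptions, not the paper's parameters.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def window_shuffle(seq, window, rng):
    """Shuffle residues within consecutive windows only.

    Each window keeps exactly the residues it started with, so local
    composition is preserved while residue order is randomized.
    """
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def shuffled_z_score(query, target, window=10, trials=200, seed=0):
    """Compare the observed score with scores against shuffled targets.

    Returns (observed score, mean shuffled score, z value). If every
    shuffled score is identical, z is reported as 0.0 (an assumption made
    here to avoid dividing by zero).
    """
    rng = random.Random(seed)
    observed = local_score(query, target)
    null = [
        local_score(query, window_shuffle(target, window, rng))
        for _ in range(trials)
    ]
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    z = (observed - mean) / sd if sd > 0 else 0.0
    return observed, mean, z
```

Calling `shuffled_z_score(query, target)` on two related sequences should give a z value well above zero, while unrelated sequences of similar composition should give a value near zero. A smaller window preserves more of the local composition but randomizes less of the order, so the choice of window is a trade-off left to the user.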