TCS: a computer program to estimate gene genealogies

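Keywords: Biology, Gene, Computer program, Computational biology, Evolutionary biology, Genetics, Programming language, Computer science
Authors
Mark J. Clement, David Posada, Keith A. Crandall
Source
Journal: Molecular Ecology [Wiley]
Volume (issue): 9 (10): 1657-1659 Citations: 8816
Identifier
DOI: 10.1046/j.1365-294x.2000.01020.x
Abstract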

Phylogenies are extremely useful tools, not only for establishing genealogical relationships among a group of organisms or their parts (e.g. genes), but also for a variety of research once the phylogenies are estimated. In a recent review, Pagel (1999) eloquently outlined a number of uses for phylogenetic information, from discovery of drug resistance to reconstructing the common ancestor of all of life. Phylogenies have been used to predict future trends in infectious disease (Bush et al. 1999) and have even been offered as evidence in a court of law (Vogel 1997). Yet phylogenies are only as useful as they are accurate.

Estimating genealogical relationships among genes at the population level presents a number of difficulties for traditional methods of phylogeny reconstruction. These traditional methods, such as parsimony, neighbour-joining, and maximum-likelihood, make assumptions that are invalid at the population level. For example, these methods assume ancestral haplotypes are no longer in the population, yet coalescent theory predicts that ancestral haplotypes will be the most frequent sequences sampled in a population level study (Watterson & Guess 1977; Donnelly & Tavaré 1986; Crandall & Templeton 1993). Traditional methods require reasonably large numbers of variable characters to accurately reconstruct relationships (Huelsenbeck & Hillis 1993), and population level studies typically lack such variation. Also, recombination is a real possibility among sequences at the population level, and traditional methods assume recombination does not occur. The failure to incorporate the possibility of recombination in phylogeny reconstruction can lead to grave errors in the resulting estimated phylogeny. The combination of these effects can lead parsimony methods to infer a cumbersome number of most parsimonious trees at the population level with no resolution among the set (e.g. over one billion trees for a set of human mitochondrial DNA (mtDNA), Excoffier & Smouse 1994). These effects can also lead neighbour-joining and traditional maximum-likelihood methods to be overconfident in the resulting relationships (Bandelt et al. 1995).

Therefore, an alternative approach is needed to provide accurate estimates of gene genealogies at the population level that take into account these population level phenomena not addressed by traditional methods. Multiple groups have looked to network representations for population level genealogical information (Bandelt & Dress 1992; Templeton et al. 1992; Excoffier & Smouse 1994; Fitch 1997). Networks allow one to naturally incorporate the often nonbifurcating genealogical information associated with population level divergences.

The method of Templeton et al. (1992) (TCS) has been used extensively with restriction site and nucleotide sequence data to infer population level genealogies when divergences are low (Georgiadis et al. 1994; Routman et al. 1994; Gerber & Templeton 1996; Hedin 1997; Schaal et al. 1998; Vilá et al. 1999; Gómez-Zurita et al. 2000). TCS has been used with traditional methods to estimate relationships among organisms that span a wide range of divergence (Crandall & Fitzpatrick 1996; Benabib et al. 1997). The approach has also been used extensively with a nested analysis procedure to partition population structure from population history (Templeton et al. 1995; Templeton 1998) and to explore the phylogeographic history of a diversity of organisms (e.g. Johnson & Jordon 2000; Turner et al. 2000).

In this note, we announce the availability of a new software package, TCS, to estimate genealogical relationships among sequences using the method of Templeton et al. (1992). The TCS software opens nucleotide sequence files in either nexus (Maddison et al. 1997) or phylip (Felsenstein 1991) sequential format.

Input sequences should not be collapsed into haplotypes beforehand, because frequency data can be incorporated into the output. The program collapses sequences into haplotypes and calculates the frequencies of the haplotypes in the sample. These frequencies are used to estimate haplotype outgroup probabilities, which correlate with haplotype age (Donnelly & Tavaré 1986; Castelloe & Templeton 1994). An absolute distance matrix is then calculated for all pairwise comparisons of haplotypes. The probability of parsimony [as defined in Templeton et al. (1992), equations 6, 7, and 8] is calculated for pairwise differences until the probability exceeds 0.95. The number of mutational differences associated with the probability just before this 95% cut-off is then the maximum number of mutational connections between pairs of sequences justified by the 'parsimony' criterion. These justified connections are then made, resulting in a 95% set of plausible solutions.

The program outputs the sequences, the pairwise absolute distance matrix, probabilities of parsimony for mutational steps just beyond the 95% cut-off, a text listing of connections made and missing intermediates generated, and a graph output file containing the resulting network (Fig. 1). This graph output file can be opened in the freeware VGJ 1.0.3 (http://www.eng.auburn.edu/department/cse/research/graphdrawing/graphdrawing.html; distributed under the terms of the GNU General Public License, Version 2), which is packaged with the TCS algorithm. The program can handle a reasonable number of sequences. For example, an HTLV data set with 69 haplotypes of length 725 bp took over one hour to run on a Macintosh G3. Memory requirements are low, and the program will run with less than 1 MB RAM.
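
The first steps described above (collapsing sequences into haplotypes, computing haplotype frequencies, and building the pairwise absolute distance matrix) can be illustrated with a minimal Python sketch. This is not the TCS source code, which is written in Java. All function names and the toy sequences are hypothetical, and the connection limit `max_steps` is simply passed in by the caller, because the probability-of-parsimony equations of Templeton et al. (1992) are not reproduced here. The last helper only lists haplotype pairs within the limit; it does not build the network or generate missing intermediates.

```python
from collections import Counter
from itertools import combinations


def collapse_haplotypes(sequences):
    """Collapse identical aligned sequences; return {haplotype: count}."""
    return Counter(sequences.values())


def haplotype_frequencies(counts):
    """Convert haplotype counts into relative frequencies in the sample."""
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def absolute_distance(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(haplotypes):
    """Symmetric matrix of pairwise absolute distances between haplotypes."""
    n = len(haplotypes)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = absolute_distance(haplotypes[i], haplotypes[j])
        matrix[i][j] = matrix[j][i] = d
    return matrix


def pairs_within_limit(matrix, max_steps):
    """Hypothetical helper: index pairs whose distance is <= max_steps.

    max_steps stands in for the connection limit that TCS derives from
    the 95% parsimony criterion; here it is an input, not computed.
    """
    n = len(matrix)
    return [(i, j) for i, j in combinations(range(n), 2)
            if matrix[i][j] <= max_steps]


if __name__ == "__main__":
    # Toy aligned sequences (invented for illustration only).
    toy = {
        "ind1": "ACGTACGT",
        "ind2": "ACGTACGT",
        "ind3": "ACGTACGA",
        "ind4": "ACCTACGA",
    }
    counts = collapse_haplotypes(toy)
    haps = list(counts)
    print(haplotype_frequencies(counts))
    print(distance_matrix(haps))
    print(pairs_within_limit(distance_matrix(haps), max_steps=1))
```

Keeping the uncollapsed sequences as input, as the text recommends, is what makes the frequency step possible: the counts are lost if only unique haplotypes are supplied.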

The TCS software package, including executables for Mac and PC, documentation, and Java source code, is distributed freely and is available at our website, along with a host of other programs for population genetic and phylogenetic analyses: http://bioag.byu.edu/zoology/crandalllab/programs.htm.

Fig. 1 TCS Java interface. The maximum number of steps parsimoniously connecting two haplotypes is indicated. Gaps can be treated as a 5th state or as missing data. The graph can be edited and arranged using different algorithms. Double-clicking a haplotype displays information such as the sequences included in the haplotype and the outgroup weights. The haplotype with the highest outgroup probability is displayed as a square, while other haplotypes are displayed as ovals. The size of the square or oval corresponds to the haplotype frequency.

This work was supported by the Alfred P. Sloan Foundation, a Shannon Award from the National Institutes of Health, and NIH R01-HD34350.