Resolution of Phylogenetic Conflict in Large Data Sets by Increased Taxon Sampling

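Biology; Taxon; Phylogenetic tree; Sampling (signal processing); Evolutionary biology; Statistics; Paleontology; Mathematics; Genetics; Computer science; Filter (signal processing); Computer vision; Gene
Authors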
Shannon M. Hedtke,Ted M. Townsend,David M. Hillis
Source
Journal: Systematic Biology [Oxford University Press]
Volume (issue): 55 (3): 522-529 Citations: 302
Identifier
DOI:10.1080/10635150600697358
Abstract

The debate about whether phylogenetic accuracy is most efficiently increased by sampling more characters or more taxa is certainly not new (e.g., Kim, 1996; Graybeal, 1998; Poe, 1998a,b; Rannala et al., 1998; Poe and Swofford, 1999; Pollock and Bruno, 2000; Rosenberg and Kumar, 2001; Pollock et al., 2002; Zwickl and Hillis, 2002; Rosenberg and Kumar, 2003; Hillis et al., 2003). However, the recent increase of whole genomic sequences available from an assortment of distantly related taxa makes this debate highly relevant to researchers across fields of biology. Recently, Rokas et al. (2003) argued that the true species tree can be recovered despite conflicting phylogenetic signal between genes if enough genes are used in the analysis. Using the bootstrap proportion (BP) as a measure of phylogenetic accuracy, they concluded that approximately 20 genes are needed to ensure a robustly supported tree (>95% BP) for their study group of eight yeast taxa. From these empirical results, they generalized that most molecular phylogenetic studies have probably included insufficient numbers of genes to confidently resolve relationships within their respective focal groups. This approach to measuring accuracy can be sensitive to method inconsistency, or the failure to converge on the correct tree as the data set becomes infinitely large. When a method is inconsistent, measures of support such as nonparametric bootstrapping can increase as more sequence data are added, but in support of the wrong phylogeny (Phillips et al., 2004; Collins et al., 2005; Delsuc et al., 2005). Although most methods perform well over most of tree space (Huelsenbeck, 1995; Poe, 2003), regions of inconsistency have been identified in the literature for all of the most commonly used phylogenetic methods.
For example, compositional bias can affect the accuracy of minimum evolution (Phillips et al., 2004), model misspecification may affect parametric methods such as maximum likelihood (ML) (Poe, 2003; Philippe et al., 2005; Collins et al., 2005), and branch-length asymmetry can lead to inconsistency in maximum parsimony (Felsenstein, 1978; Hendy and Penny, 1989). Parsimony is particularly prone to long-branch attraction (LBA), an analytical artifact in which two taxa on long branches are incorrectly placed as sister taxa (Felsenstein, 1978; Hendy and Penny, 1989; Huelsenbeck and Hillis, 1993). Although there are many reasons for conflicting phylogenetic signal between genes, one relevant reason could be related to method inconsistency: differing rates of evolution between genes could cause a particular method to be inconsistent for some genes and not for others. We argue that by addressing this source of conflict between genes, fewer genes may be needed to return an accurate phylogeny. One source of conflict in the Rokas et al. (2003) data set may be nonstationarity: taxa that differ from the others in their base compositional bias may be erroneously drawn together as sister taxa (Collins et al., 2005). Here, we show that an additional source of conflict between the 106 genes in the Rokas et al. data set may be branch-length asymmetry. Using simulations of 106 genes from the Rokas et al. data set on a 79-taxon yeast phylogeny, we additionally show that when genes are added to a data set, support for the wrong reconstruction can increase when there is LBA. However, when taxa are added to the analysis, support for the correct reconstruction increases, and fewer genes are needed to achieve accuracy.
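
The point that bootstrap support can grow with more data regardless of whether the supported tree is correct can be illustrated with a minimal sketch. This is not the authors' analysis: it assumes a four-taxon case in which parsimony reduces to counting the informative site patterns that favor each of the three possible splits, and all function names and the toy alignment are hypothetical, invented for illustration.

```python
import random
from collections import Counter

def supported_split(column):
    """Return the four-taxon split a site favors under parsimony, or None."""
    a, b, c, d = column
    if a == b and c == d and a != c:
        return "AB|CD"
    if a == c and b == d and a != b:
        return "AC|BD"
    if a == d and b == c and a != b:
        return "AD|BC"
    return None  # site is not parsimony-informative for four taxa

def parsimony_tree(columns):
    """Pick the split favored by the most informative sites (None on a tie)."""
    counts = Counter(s for s in map(supported_split, columns) if s is not None)
    if not counts:
        return None
    best = max(counts.values())
    winners = [s for s, n in counts.items() if n == best]
    return winners[0] if len(winners) == 1 else None

def bootstrap_proportions(columns, replicates=1000, seed=0):
    """Nonparametric bootstrap: resample sites with replacement, re-infer the tree."""
    rng = random.Random(seed)
    n = len(columns)
    tally = Counter()
    for _ in range(replicates):
        resampled = [columns[rng.randrange(n)] for _ in range(n)]
        tally[parsimony_tree(resampled)] += 1
    return {split: count / replicates for split, count in tally.items()}

# Toy data (invented): each string is one site for taxa A, B, C, D.
# 12% of sites favor AB|CD, 8% favor AC|BD, the rest are uninformative.
small = ["AAGG"] * 12 + ["AGAG"] * 8 + ["AAAA"] * 80
large = small * 10  # same site-pattern frequencies, ten times more sites

bp_small = bootstrap_proportions(small)
bp_large = bootstrap_proportions(large)
print("BP for AB|CD, 100 sites: ", bp_small.get("AB|CD", 0.0))
print("BP for AB|CD, 1000 sites:", bp_large.get("AB|CD", 0.0))
```

In this sketch the bootstrap proportion for the majority split rises as sites are added, because it tracks only which pattern is more frequent in the data. If long-branch attraction were the reason that pattern is more frequent, the same rise in support would attach to the wrong tree.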
