Whole genome shotgun sequencing of Brassica oleracea and its application to gene discovery and annotation in Arabidopsis

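Biology; shotgun sequencing; Brassica oleracea; Arabidopsis; genome; genetics; gene; annotation; computational biology; shotgun; genome project; whole genome sequencing; plant; mutant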
Authors
Mulu Ayele, Brian J. Haas, Nikhil Kumar, Hank Wu, Yongli Xiao, Susan Van Aken, Teresa R. Utterback, Jennifer R. Wortman, Owen White, Christopher D. Town
Source
Journal: Genome Research [Cold Spring Harbor Laboratory Press]
Volume (issue): 15 (4): 487-495 Citations: 74
Identifier
DOI: 10.1101/gr.3176505
Abstract

Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering 283 Mb (0.44×) of the estimated 650 Mb Brassica genome were searched against the Arabidopsis genome, and conserved Arabidopsis genome sequences (CAGSs) were identified. Of these 229,735 conserved regions, 167,357 fell within or intersected existing gene models, while 60,378 were located in previously unannotated regions. After removal of sequences matching known proteins, CAGSs that were close to one another were chained together as potentially comprising portions of the same functional unit. This resulted in 27,347 chains, of which 15,686 were sufficiently distant from existing gene annotations to be considered a novel conserved unit. Of 192 conserved regions examined, 58 were found to be expressed in our cDNA populations. Rapid amplification of cDNA ends (RACE) was used to obtain potentially full-length transcripts from these 58 regions. The resulting sequences led to the creation of 21 gene models at 17 new Arabidopsis loci and the addition of splice variants or updates to another 19 gene structures. In addition, CAGSs overlapping already annotated genes in Arabidopsis can provide guidance for manual improvement of existing gene models. Published genome-wide expression data based on whole genome tiling arrays and massively parallel signature sequencing were overlaid on the Brassica–Arabidopsis conserved sequences, and 1399 regions of intersection were identified. Collectively, our results and these data sets suggest that several thousand new Arabidopsis genes remain to be identified and annotated.
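The chaining step described in the abstract (joining nearby CAGSs, then keeping chains far from existing annotations) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state the distance thresholds or the data representation, so the function names, the `max_gap` and `min_distance` parameters, and the use of simple `(start, end)` coordinates on a single chromosome are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start, end) coordinates on one chromosome.
Interval = Tuple[int, int]


def chain_intervals(regions: List[Interval], max_gap: int) -> List[Interval]:
    """Merge regions whose gap to the previous chain is at most max_gap.

    max_gap is an illustrative threshold; the abstract does not give one.
    """
    chains: List[Interval] = []
    for start, end in sorted(regions):
        if chains and start - chains[-1][1] <= max_gap:
            # Close enough to the current chain: extend it.
            prev_start, prev_end = chains[-1]
            chains[-1] = (prev_start, max(prev_end, end))
        else:
            # Too far away: start a new chain.
            chains.append((start, end))
    return chains


def distance_to_nearest(chain: Interval, genes: List[Interval]) -> Optional[int]:
    """Distance from a chain to the closest gene (0 if they overlap).

    Returns None when there are no genes to compare against.
    """
    start, end = chain
    best: Optional[int] = None
    for g_start, g_end in genes:
        if g_end < start:
            d = start - g_end      # gene lies entirely to the left
        elif g_start > end:
            d = g_start - end      # gene lies entirely to the right
        else:
            d = 0                  # gene overlaps the chain
        if best is None or d < best:
            best = d
    return best


def novel_chains(chains: List[Interval], genes: List[Interval],
                 min_distance: int) -> List[Interval]:
    """Keep chains at least min_distance away from every annotated gene.

    min_distance is an illustrative threshold, not a value from the paper.
    """
    return [c for c in chains
            if not genes or distance_to_nearest(c, genes) >= min_distance]
```

With made-up coordinates, `chain_intervals([(100, 200), (250, 300), (1000, 1100)], max_gap=100)` joins the first two regions into one chain and leaves the third on its own; `novel_chains` then discards any chain that sits within `min_distance` of an annotated gene.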
