Biology
Transcriptome
Computational biology
Gene
Long noncoding RNA
Genome
Genetics
Lineage (genetic)
Human genome
RNA
Genomics
RNA sequence
Gene expression
Authors
Matthew K. Iyer, Yashar S. Niknafs, Rohit Malik, Udit Singhal, Anirban Sahu, Yasuyuki Hosono, Terrence R. Barrette, John R. Prensner, Joseph R. Evans, Shuang G. Zhao, Anton Poliakov, Xuhong Cao, Saravana M. Dhanasekaran, Yi-Mi Wu, Dan Robinson, David G. Beer, Felix Y. Feng, Hariharan Iyer, Arul M. Chinnaiyan
Source
Journal: Nature Genetics
[Nature Portfolio]
Date: 2015-01-19
Volume (issue): pages: 47 (3): 199-208
Citations: 2367
Abstract
Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.