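Redundancy (engineering)
Set (abstract data type)
Gene
Database
Computer science
Computational biology
Biology
Data mining
Genetics
Operating system
Programming language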
Authors
Arthur Liberzon,Chet Birger,Helga Thorvaldsdóttir,Mahmoud Ghandi,Jill P. Mesirov,Pablo Tamayo
Source
Journal: Cell Systems
[Elsevier]
Date: 2015-12-01
Volume (issue), pages: 1 (6): 417-425
Citations: 6898
Identifier
DOI: 10.1016/j.cels.2015.12.004
Abstract
The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.
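The redundancy across gene sets that the abstract mentions can be made concrete with a small overlap calculation. The sketch below is illustrative only and is not the authors' procedure: it assumes gene sets are plain Python sets of gene symbols, uses the Jaccard index as the overlap measure, and flags pairs at or above an arbitrary threshold of 0.5. The set names and the helpers `jaccard` and `redundant_pairs` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def redundant_pairs(gene_sets, threshold=0.5):
    """Return (name1, name2, score) for each pair of sets with score >= threshold."""
    pairs = []
    # Sort by name so the output order is deterministic.
    for (n1, s1), (n2, s2) in combinations(sorted(gene_sets.items()), 2):
        score = jaccard(s1, s2)
        if score >= threshold:
            pairs.append((n1, n2, score))
    return pairs


# Hypothetical toy gene sets, for illustration only.
gene_sets = {
    "SET_A": {"TP53", "MDM2", "CDKN1A", "BAX"},
    "SET_B": {"TP53", "MDM2", "CDKN1A", "GADD45A"},
    "SET_C": {"MYC", "MAX"},
}

if __name__ == "__main__":
    for name1, name2, score in redundant_pairs(gene_sets):
        print(f"{name1} vs {name2}: Jaccard = {score:.2f}")
```

Here SET_A and SET_B share three of five distinct genes, so they would be flagged as overlapping, while SET_C shares nothing with either. A real analysis would need a threshold and overlap measure chosen for the data at hand.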