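Computational biology
Gene
Cancer
Biology
Classifier (UML)
Conserved sequence
Breast cancer
Microarray analysis techniques
DNA microarray
Identification (biology)
Bioinformatics
Computer science
Genetics
Artificial intelligence
Gene expression
Plant
Base sequence
Authors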
Chunyu Lin, Peiying Ruan, Ruiming Li, Jinn-Moon Yang, Simon See, Jiangning Song, Tatsuya Akutsu
Identifier
DOI:10.1142/s0219720019400055
Abstract
Cancer subtype identification is an unmet need in precision diagnosis. Recently, evolutionary conservation has been indicated to contain informative signatures for functional significance in cancers. However, the importance of evolutionary conservation in distinguishing cancer subtypes remains largely unclear. Here, we identified the evolutionarily conserved genes (i.e., core genes) and observed that they are primarily involved in cellular pathways relevant to cell growth and metabolism. Using these core genes, we developed two novel strategies, namely a feature-based strategy (FES) and an image-based strategy (IMS), by integrating their evolutionary and genomic profiles with a deep learning algorithm. Compared with the FES using a random gene set and with the strategy using the PAM50 classifier, the core gene set-based FES achieved higher accuracy in identifying breast cancer subtypes. The IMS and FES using the core gene set performed better than the other strategies in classifying both breast cancer subtypes and multiple cancer types. Moreover, the IMS is reproducible even when using different gene expression data (i.e., RNA-seq and microarray). Comprehensive analysis of eight cancer types demonstrates that our evolutionary conservation-based models represent a valid and helpful approach for identifying cancer subtypes, and that the core gene set offers distinguishing clues to cancer subtypes.