Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

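Keywords: Autoencoder, Computer science, Machine learning, Omics, Artificial intelligence, Deep learning, Bioinformatics, Data mining, Biology
Authors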
Hua Chai,Xiang Zhou,Zhongyue Zhang,Jiahua Rao,Huiying Zhao,Yuedong Yang
Source
Journal: Computers in Biology and Medicine [Elsevier BV]
Volume: 134, Article 104481 Citations: 100
Identifier
DOI:10.1016/j.compbiomed.2021.104481
Abstract

Genomic information is now widely used to guide precise cancer treatment. Because any single type of omics data represents only one view and suffers from noise and bias, multiple types of omics data are required for accurate cancer prognosis prediction. However, integrating multi-omics data effectively is challenging because of the large number of redundant variables and the relatively small sample sizes. With recent progress in deep learning, autoencoders have been used to integrate multi-omics data and extract representative features. Nevertheless, the resulting models are sensitive to data noise. Additionally, previous studies usually focused on individual cancer types without comprehensive pan-cancer tests. Here, we employed a denoising autoencoder to obtain a robust representation of the multi-omics data, and then used the learned representative features to estimate patients' risks. Applied to 15 cancers from The Cancer Genome Atlas (TCGA), our method improved C-index values over previous methods by 6.5% on average. Considering the difficulty of obtaining multi-omics data in practice, we further trained XGBoost models to fit the estimated risks using only mRNA data, and found that the models achieved an average C-index of 0.627. As a case study, the breast cancer prognosis prediction model was independently tested on three datasets from the Gene Expression Omnibus (GEO) and was shown to significantly separate high-risk patients from low-risk ones (C-index > 0.6, p-values < 0.05). Based on the risk subgroups defined by our method, we identified nine prognostic markers highly associated with breast cancer, seven of which are supported by the literature. Our comprehensive tests indicate that we have constructed an accurate and robust framework for integrating multi-omics data for cancer prognosis prediction. Moreover, the framework is an effective way to discover cancer prognosis-related genes.
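
The abstract reports its results as C-index values without defining the metric. As a rough illustration only, the following sketch computes Harrell's concordance index in its standard form: the fraction of comparable patient pairs in which the patient with the shorter survival time was assigned the higher risk. The function name, the argument names, and the toy data are hypothetical and are not taken from the paper; the authors' actual implementation may treat tied times and tied risks differently.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times  -- observed follow-up time for each patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with equal times are skipped, and a pair is
        # comparable only if the earlier time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier death received the higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration (not from TCGA or GEO).
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

On this scale, 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the context for reading the reported average of 0.627 and the C-index > 0.6 threshold.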