微生物群
计算机科学
计算生物学
人工智能
数据挖掘
生物
生物信息学
作者
Xinyuan Shi,Fangfang Zhu,Wenwen Min
标识
DOI:10.1089/cmb.2024.0884
摘要
Predicting the survival outcomes and assessing the risk of patients play a pivotal role in comprehending the microbial composition across various stages of cancer. With the ongoing advancements in deep learning, it has been substantiated that deep learning holds the potential to analyze patient survival risks based on microbial data. However, confronting a common challenge in individual cancer datasets involves the limited sample size and the high dimensionality of the feature space. This predicament often leads to overfitting issues in deep learning models, hindering their ability to effectively extract profound data representations and resulting in suboptimal model performance. To overcome these challenges, we advocate the utilization of pretraining and fine-tuning strategies, which have proven effective in addressing the constraint of having a smaller sample size in individual cancer datasets. In this study, we propose a deep learning model that amalgamates Transformer encoder and variational autoencoder (VAE), VTrans, employing both pre-training and fine-tuning strategies to predict the survival risk of cancer patients using microbial data. Furthermore, we highlight the potential of extending VTrans to integrate microbial multi-omics data. Our method is assessed on three distinct cancer datasets from The Cancer Genome Atlas Program, and the research findings demonstrated that (1) VTrans excels in terms of performance compared to conventional machine learning and other deep learning models. (2) The utilization of pretraning significantly enhances its performance. (3) In contrast to positional encoding, employing VAE encoding proves to be more effective in enriching data representation. (4) Using the idea of saliency map, it is possible to observe which microbes have a high contribution to the classification results. These results demonstrate the effectiveness of VTrans in prediting patient survival risk. Source code and all datasets used in this paper are available at https://github.com/wenwenmin/VTrans and https://doi.org/10.5281/zenodo.14166580.
科研通智能强力驱动
Strongly Powered by AbleSci AI