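Overfitting
Metabolomics
Convolutional neural network
Discriminative model
Pattern recognition (psychology)
Dimensionality reduction
Principal component analysis
Feature (linguistics)
Source code
Curse of dimensionality
Machine learning
Deep learning
Support vector machine
Random forest
Encoding (set theory)
Artificial neural network
Artificial intelligence
Bioinformatics
Computer science
Biology
Linguistics
Philosophy
Set (abstract data type)
Programming language
Operating system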
Authors
Yuyang Sha, Weiyu Meng, Gang Luo, Xiaobing Zhai, Henry H.Y. Tong, Yuefei Wang, Kefeng Li
Identifier
DOI:10.1021/acs.analchem.3c04607
Abstract
Clinical metabolomics is becoming an essential tool for precision medicine. However, classical machine learning algorithms struggle to comprehensively encode and analyze metabolomics data because of their high dimensionality and complex intercorrelations. This article introduces a new method called MetDIT, designed to analyze intricate metabolomics data effectively using deep convolutional neural networks (CNNs). MetDIT comprises two components: TransOmics and NetOmics. Since CNN models have difficulty processing one-dimensional (1D) sequence data efficiently, we developed TransOmics, a framework that transforms sequence data into two-dimensional (2D) images while maintaining a one-to-one correspondence between the sequences and the images. NetOmics, the second component, leverages a CNN architecture to extract more discriminative representations from the transformed samples. To mitigate the overfitting caused by small sample sizes and class imbalance, we introduced a feature augmentation module (FAM) and a loss function to improve model performance. Furthermore, we systematically optimized the model backbone and image resolution to balance the number of model parameters against computational cost. To demonstrate the performance of the proposed MetDIT, we conducted extensive experiments on three different clinical metabolomics data sets and achieved better classification performance than classical machine learning methods used in metabolomics, including Random Forest, SVM, XGBoost, and LightGBM. The source code is available on GitHub at https://github.com/Li-OmicsLab/MetDIT, and the WebApp can be found at http://metdit.bioinformatics.vip/.
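The abstract does not spell out how TransOmics maps a sequence to an image, so the following is only an illustrative sketch of the general idea of a reversible 1D-to-2D mapping, not the authors' method. It assumes a simple row-major reshape into the smallest square grid that fits the features, with zero padding. The function names `sequence_to_image` and `image_to_sequence`, the padding value, and the sample feature values are all hypothetical.

```python
import math


def sequence_to_image(values, pad_value=0.0):
    """Lay out a 1D feature vector row by row in a square 2D grid.

    Assumption: a plain row-major reshape with padding. This is a stand-in
    for illustration and is not the TransOmics transformation itself.
    """
    n = len(values)
    side = math.isqrt(n)
    if side * side < n:
        side += 1  # smallest square that can hold all n features
    padded = list(values) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def image_to_sequence(image, length):
    """Invert sequence_to_image by flattening and dropping the padding."""
    flat = [v for row in image for v in row]
    return flat[:length]


# Hypothetical metabolite intensities for a single sample.
features = [0.12, 0.85, 0.33, 0.47, 0.91, 0.05, 0.66]
image = sequence_to_image(features)
restored = image_to_sequence(image, len(features))
```

Because every feature occupies a fixed pixel position and the original length is kept, the mapping can be inverted exactly, which is one simple way to obtain the one-to-one correspondence between sequences and images that the abstract describes.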