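Annotation
Similarity (geometry)
Semantic similarity
Space (punctuation)
Metabolite
Computer science
Natural language processing
Artificial intelligence
Semantic space
Computational biology
Information retrieval
Chemistry
Biology
Biochemistry
Image (mathematics)
Operating system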
Authors
Hongchao Ji,Ran Du,Qinliang Dai,Meifeng Su,Yaqing Lyu,Jianbin Yan
Identifier
DOI:10.1101/2024.05.30.596727
Abstract
Untargeted metabolomics, employing liquid chromatography-mass spectrometry (LC-MS), allows quantification of known and unidentified metabolites within biological systems. However, in practical metabolomic studies, the majority of metabolites often remain unidentified. Here, we developed a novel deep learning-based metabolite annotation approach via semantic similarity analysis of mass spectral language. This approach enables the prediction of structurally related metabolites for unknown compounds. By considering the chemical space, these structurally related metabolites provide valuable information about the potential location of the unknown metabolites and assist in ranking candidates obtained from molecular structure databases. Validated with benchmark datasets, our method consistently demonstrated superior performance compared to existing methods for metabolite annotation. In a case study involving the Qianxi cultivar cherry tomato, our approach reaffirmed well-established biomarkers of ripening processes and identified a set of promising and rational new biomarker metabolites. Overall, the presented method exhibits significant potential in annotating metabolites, particularly in revealing the “dark matter” in untargeted metabolomics.
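The abstract does not describe the model or its similarity measure, so the following is only a minimal sketch of the general idea of retrieving structurally related reference compounds by similarity in an embedding space. It assumes each spectrum has already been mapped to a fixed-length vector by some model, and it uses cosine similarity as a stand-in for the semantic similarity score. The function names, compound labels, and toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_references(query_vec, reference_vecs, top_k=2):
    """Return the top_k (name, score) pairs, most similar to the query first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in reference_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical): in practice these would come from a trained model.
reference_vecs = {
    "ref_A": [1.0, 0.0, 0.0],
    "ref_B": [0.9, 0.1, 0.0],
    "ref_C": [0.0, 1.0, 0.0],
}
query_vec = [1.0, 0.05, 0.0]  # embedding of an unknown metabolite's spectrum

top_hits = rank_references(query_vec, reference_vecs, top_k=2)
for name, score in top_hits:
    print(f"{name}: {score:.4f}")
```

In a workflow of this kind, the top-ranked reference compounds could then serve as anchors in chemical space for scoring candidate structures from a database; how the paper actually does this is not stated in the abstract.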