Dark matter
Metabolomics
Computational biology
Data science
Chemistry
Biology
Physics
Computer science
Bioinformatics
Astronomy
Authors
Ricardo R. da Silva, Pieter C. Dorrestein, Robert A. Quinn
Identifier
DOI:10.1073/pnas.1516878112
Abstract
Despite the more than 100-year history of mass spectrometry, it remains challenging to link the large volume of known chemical structures to the data obtained with mass spectrometers. Presently, only 1.8% of spectra in an untargeted metabolomics experiment can be annotated. This means that the vast majority of information collected by metabolomics is “dark matter,” chemical signatures that remain uncharacterized (Fig. 1). For a genomic comparison, 80% of predicted genes in the Escherichia coli genome are known. In a bacteriophage metagenome, a well-known frontier of biological dark matter, the amount of known genes is 1–30%, depending on the sample (1). Thus, one could argue that we know more about the genetics of uncultured phage than we do about the chemistry within our own bodies. Much of the chemical dark matter may include known structures, but they remain undiscovered because the reference spectra are not available in mass spectrometry databases. The only way to overcome this challenge is through the development of computational solutions. In PNAS, Dührkop et al. describe the development of such a computational tool, called CSI (compound structure identification):FingerID (2). The tool is designed to aid in the annotation of chemistries that can be observed by mass spectrometry. CSI:FingerID uses fragmentation trees to connect tandem MS (MS/MS) data to chemical structures found in public chemistry databases. Tools such as this can allow metabolomics with mass spectrometry to become as commonly used and scientifically productive as sequencing technologies have in the field of genomics.