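Chemistry
Chemical space
Characterization (materials science)
Chemical shift
Nuclear magnetic resonance spectra database
Nuclear magnetic resonance spectroscopy
Set (abstract data type)
Data set
Biological system
Two-dimensional nuclear magnetic resonance spectroscopy
Molecule
Benchmark (surveying)
Computational chemistry
Proton nuclear magnetic resonance
Experimental data
Chemical structure
Nuclear magnetic resonance crystallography
Carbon-13 nuclear magnetic resonance
Space (punctuation)
Chemical physics
Generative model
Quantum chemistry
Nanotechnology
Task (project management)
Spectral line
Small molecule
Combinatorial chemistry
Resonance (particle physics)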
Authors
Xi Xue, Hanyu Sun, Jingying Sun, Luc Patiny, Xiangying Liu, Kai Chen, Jingjie Yan, Liangning Li, Xue Liu, Shu Xu, Dongming Zhang, Yafeng Deng, Yingda Zang, Ya‐Ling Gong, Jie Ma, Xiaojian Wang
Identifier
DOI:10.1021/acs.analchem.5c03783
Abstract
Nuclear magnetic resonance (NMR) data provides rich quantum information on molecular structure, which is closely related to chemical structure and widely used for structural characterization in chemical discovery. Despite substantial advances in spectral analysis techniques, few existing models have demonstrated satisfactory performance in accurate NMR interpretation. Herein, we introduce NMRMind, a Transformer-based generative framework that directly elucidates molecular structures from NMR spectral data. NMRMind was pretrained on a data set comprising 45 million 1D NMR spectra and subsequently fine-tuned on a self-curated benchmark consisting of 2.2 million 1D and 2D NMR spectra. Using a mixed-modality dropout strategy during training, NMRMind achieved excellent performance, attaining a Top-1 accuracy of 92.07% across all input conditions on the structure elucidation task with a speed of <0.05 s per elucidation. Additionally, NMRMind maintained a Top-1 accuracy of 85.10% when only one-dimensional and two-dimensional NMR data were used as input, without considering molecular formulas or fragments. Moreover, the application of NMRMind facilitated the discovery of six previously uncharacterized natural products from Magnolia officinalis and successfully elucidated the structures of six unexpected products resulting from synthetic reactions, thereby expanding the accessible chemical space and providing novel insights into chemical mechanisms. These results demonstrate that NMRMind is a powerful and generalizable platform for chemistry research.