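Computer science
Leverage (statistics)
Artificial intelligence
Inference
Knowledge graph
Scalability
Trustworthiness
Applications of artificial intelligence
Machine learning
Natural language processing
Knowledge base
Scope (computer science)
Data science
Information extraction
Graph
Computational model
Question answering
Data modeling
Knowledge extraction
Language model
Data structure
Computational linguistics
Source code
Expert system
Classification
Data source
Natural language
Knowledge-based systems
Scheme (mathematics)
Information retrieval
Data type
Language construct
Knowledge integration
Topic model
Experimental data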
Authors
Thomas Michael Pruyn, Amro Aswad, Sartaaj Takrim Khan, Ju Huang, Robert Black, Seyed Mohamad Moosavi
Abstract
Artificial intelligence (AI) is transforming research in metal-organic frameworks (MOFs), where models trained on structured computational data routinely predict new materials and optimize their properties. This raises a central question: What if we could leverage the full breadth of MOF knowledge, not just structured data sets, but also the scientific literature? For researchers, the literature remains the primary source of knowledge, yet much of its content, including experimental data and expert insight, remains underutilized by AI systems. We introduce MOF-ChemUnity, a structured, extensible, and scalable knowledge graph that unifies MOF data by linking literature-derived insights to crystal structures and computational data sets. By disambiguating MOF names in the literature and connecting them to crystal structures in the Cambridge Structural Database, MOF-ChemUnity unifies experimental and computational sources and enables cross-document knowledge extraction and linking. We showcase how this enables multiproperty machine learning across simulated and experimental data, compilation of complete synthesis records for individual compounds by aggregating information across multiple publications, and expert-guided materials recommendations via structure-based machine learning descriptors for pore geometry and chemistry. When used as a knowledge source to augment large language models (LLMs), MOF-ChemUnity enables a literature-informed AI assistant that operates over the full scope of MOF knowledge. Expert evaluations show improved accuracy, interpretability, and trustworthiness across tasks such as retrieval, inference of structure-property relationships, and materials recommendation, outperforming standard LLMs. This work lays the foundation for literature-informed materials discovery, enabling both scientists and AI systems to reason over the full existing knowledge in a new way.