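Computer science
Structuring
Field (mathematics)
Information retrieval
Knowledge graph
Semantics (computer science)
Construct (python library)
Domain knowledge
Graph
Data science
Set (abstract data type)
Key (lock)
Artificial intelligence
Theoretical computer science
Economics
Pure mathematics
Programming language
Computer security
Mathematics
Finance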
Authors
Mauro Dalle Lucca Tosi, Júlio Cesar dos Reis
Identifier
DOI:10.1016/j.joi.2020.101109
Abstract
Understanding the structure of a scientific domain and extracting specific information from it is laborious. The large amount of manual effort this requires indicates that the way knowledge has been structured and visualized in software tools to date should be improved. Currently, scientific domains are organized based on citation networks or bag-of-words techniques, disregarding the intrinsic semantics of the concepts presented in the literature. We propose a novel approach to structuring scientific fields, which uses semantic analysis of natural language texts to construct knowledge graphs. Our approach then clusters the knowledge graphs into their main topics and automatically extracts information such as the most relevant concepts in each topic and the concepts that overlap between topics. We evaluate the proposed model on two datasets from distinct areas. The results reach up to 84% accuracy in the task of document classification without using annotated data to segment topics from a set of input documents. Our solution identifies coherent keyphrases and key concepts for the datasets used. The SciKGraph framework contributes by structuring knowledge that might aid researchers in the study of their areas, reducing the effort and time devoted to groundwork.
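
The pipeline described in the abstract (build a graph of concepts, cluster it into topics, rank the concepts in each topic) can be illustrated with a toy sketch. This is not the SciKGraph implementation: it assumes concepts have already been extracted from each document, uses plain co-occurrence counts as edge weights, and swaps in thresholded connected components as a stand-in for the clustering step. All function names, the `min_weight` parameter, and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Return {concept: {neighbor: weight}}; weight = number of shared documents."""
    graph = defaultdict(lambda: defaultdict(int))
    for concepts in docs:
        unique = sorted(set(concepts))
        for c in unique:
            graph[c]  # touch the key so isolated concepts still become nodes
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def cluster_by_components(graph, min_weight=2):
    """Assumed stand-in for topic clustering: connected components
    over edges whose weight is at least min_weight."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor, weight in graph[node].items():
                if weight >= min_weight and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


def key_concepts(graph, cluster, top_n=2):
    """Rank a cluster's concepts by weighted degree inside the cluster."""
    def score(concept):
        return sum(w for nb, w in graph[concept].items() if nb in cluster)
    # Ties are broken alphabetically to keep the output deterministic.
    return sorted(cluster, key=lambda c: (-score(c), c))[:top_n]


# Hypothetical input: one list of already-extracted concepts per document.
docs = [
    ["knowledge graph", "semantic analysis", "clustering"],
    ["knowledge graph", "semantic analysis"],
    ["knowledge graph", "clustering"],
    ["citation network", "bibliometrics"],
    ["citation network", "bibliometrics", "clustering"],
]

graph = build_cooccurrence_graph(docs)
clusters = cluster_by_components(graph, min_weight=2)
for cluster in clusters:
    print(sorted(cluster), "->", key_concepts(graph, cluster))
```

Connected components cannot assign a concept to more than one topic, so this sketch does not reproduce the overlapping-concept extraction the abstract mentions; that would need a clustering method that allows shared membership.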