Measuring the Novelty of Scientific Literature Through Contribution Sentence Analysis Using Deep Learning and Cloud Model

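Novelty, Cloud computing, Sentence, Deep learning, Computer science, Natural language processing, Artificial intelligence, Data science, Linguistics, Psychology, Philosophy, Social psychology, Operating system
Authors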
Zhongyi Wang,Haoxuan Zhang,Jiangping Chen,Haihua Chen
Identifier
DOI:10.2139/ssrn.4360535
Abstract

Ex-ante novelty measurement of scientific literature is an essential tool for academic data mining and scientific communication. It helps researchers and peer experts quickly identify highly creative articles among a large number of papers. This paper proposes a framework for measuring the novelty of scientific literature based on contribution sentence analysis. First, to obtain the best models for contribution sentence identification and classification, we implement eight state-of-the-art deep learning models and compare their performance on each task. The selected contribution sentence identification model achieves the best recall and F1 scores, 0.963 and 0.929, respectively. The best contribution sentence classification model scores 0.897 on Micro F1. Second, to represent each contribution sentence, we generate a contribution sentence cloud using the BERTopic model and the backward normal cloud generator. Third, we calculate the novelty scores of scientific literature using the cloud similarity algorithm. Finally, using a manually constructed gold standard, we perform three comparative experiments against the semantic novelty measurement on the International Conference on Learning Representations (ICLR 2017-2022) dataset. In the correlation analysis, our measurement has a larger correlation coefficient with the gold standard than the semantic novelty measurement (0.805 > 0.580), at a p-value less than 0.0001. In the distribution of differences from the gold standard, 2,584 (79.2%) articles fall within the range of ±1.5 for our measurement, compared to 1,519 (46.6%) articles for the semantic novelty measurement. In the boxplots, the results of our measurement are also closer to the gold standard than those of the semantic novelty measurement. These experimental results show that our measurement is more feasible and effective than the semantic novelty measurement. Our framework benefits several communities, such as researchers, librarians, science evaluation institutions, policymakers, and funding agencies.
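
The abstract names the backward normal cloud generator but does not give its formulas. As a rough illustration only, the sketch below uses the commonly cited moment-based form of that generator, which estimates the three numerical characteristics of a normal cloud (expectation Ex, entropy En, hyper-entropy He) from a set of sample values. The function name `backward_normal_cloud` and the choice of one-dimensional float samples are assumptions made for this example; the paper's actual inputs (derived from BERTopic output) and its exact variant of the generator may differ.

```python
import math
from typing import Sequence, Tuple


def backward_normal_cloud(samples: Sequence[float]) -> Tuple[float, float, float]:
    """Estimate (Ex, En, He) of a normal cloud from 1-D samples.

    Hypothetical helper: a textbook moment-based backward generator,
    not necessarily the variant used in the paper.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    # Ex: sample mean.
    ex = sum(samples) / n

    # En: sqrt(pi/2) times the mean absolute deviation from Ex.
    mean_abs_dev = sum(abs(x - ex) for x in samples) / n
    en = math.sqrt(math.pi / 2.0) * mean_abs_dev

    # He: sqrt(S^2 - En^2), with S^2 the unbiased sample variance.
    # The difference is clamped at zero so the square root stays real.
    variance = sum((x - ex) ** 2 for x in samples) / (n - 1)
    he = math.sqrt(max(variance - en ** 2, 0.0))

    return ex, en, he


if __name__ == "__main__":
    ex, en, he = backward_normal_cloud([1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"Ex={ex:.3f}, En={en:.3f}, He={he:.3f}")
```

Clamping the variance difference at zero is a design choice for this sketch: with small samples the estimated En squared can exceed the sample variance, and other treatments of that case exist.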
