Evaluating the Ability of Open-Source Artificial Intelligence to Predict Accepting-Journal Impact Factor and Eigenfactor Score Using Academic Article Abstracts: Cross-sectional Machine Learning Analysis

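Keywords: Impact factor, Computer science, Citation, Bibliometrics, Punctuation, Artificial intelligence, Information retrieval, Preprocessor, Percentile, Logistic regression, Machine learning, Natural language processing, Medical physics, Data mining, Medicine, World Wide Web, Statistics, Mathematics, Political science, Law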
Authors
Carmelo Macri, Stephen Bacchi, Sheng Chieh Teoh, Wan Yin Lim, Lydia Lam, Sandy Patel, Mark Slee, Robert J. Casson, WengOnn Chan
Source
Journal: Journal of Medical Internet Research, Volume 25: e42789-e42789
Identifier
DOI: 10.2196/42789
Abstract

Background: Strategies to improve the selection of appropriate target journals may reduce delays in disseminating research results. Machine learning is increasingly used in content-based recommender algorithms to guide journal submissions for academic articles.

Objective: We sought to evaluate the performance of open-source artificial intelligence to predict the impact factor or Eigenfactor score tertile using academic article abstracts.

Methods: PubMed-indexed articles published between 2016 and 2021 were identified with the Medical Subject Headings (MeSH) terms “ophthalmology,” “radiology,” and “neurology.” Journals, titles, abstracts, author lists, and MeSH terms were collected. Journal impact factor and Eigenfactor scores were sourced from the 2020 Clarivate Journal Citation Report. The journals included in the study were allocated percentile ranks based on impact factor and Eigenfactor scores, compared with other journals that released publications in the same year. All abstracts were preprocessed, which included the removal of the abstract structure, and combined with titles, authors, and MeSH terms as a single input. The input data underwent preprocessing with the inbuilt ktrain Bidirectional Encoder Representations from Transformers (BERT) preprocessing library before analysis with BERT. Before use for logistic regression and XGBoost models, the input data underwent punctuation removal, negation detection, stemming, and conversion into a term frequency-inverse document frequency array. Following this preprocessing, data were randomly split into training and testing data sets with a 3:1 train:test ratio. Models were developed to predict whether a given article would be published in a first, second, or third tertile journal (0-33rd centile, 34th-66th centile, or 67th-100th centile), as ranked either by impact factor or Eigenfactor score. BERT, XGBoost, and logistic regression models were developed on the training data set before evaluation on the hold-out test data set. The primary outcome was overall classification accuracy for the best-performing model in the prediction of accepting journal impact factor tertile.

Results: There were 10,813 articles from 382 unique journals. The median impact factor and Eigenfactor score were 2.117 (IQR 1.102-2.622) and 0.00247 (IQR 0.00105-0.03), respectively. The BERT model achieved the highest impact factor tertile classification accuracy of 75.0%, followed by an accuracy of 71.6% for XGBoost and 65.4% for logistic regression. Similarly, BERT achieved the highest Eigenfactor score tertile classification accuracy of 73.6%, followed by an accuracy of 71.8% for XGBoost and 65.3% for logistic regression.

Conclusions: Open-source artificial intelligence can predict the impact factor and Eigenfactor score of accepting peer-reviewed journals. Further studies are required to examine the effect on publication success and the time-to-publication of such recommender systems.
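
The labelling and evaluation steps described in the abstract (percentile ranks, tertile assignment, a 3:1 random split, and overall accuracy) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the percentile definition (share of journals with a score at or below the given one), the handling of fractional percentiles at the tertile boundaries, and the fixed random seed are all assumptions made for illustration, and the journal names are hypothetical.

```python
import random
from bisect import bisect_right


def percentile_ranks(scores):
    """Map {journal: score} to {journal: percentile rank in (0, 100]}.

    Assumption: a journal's percentile is the share of journals whose
    score is less than or equal to its own. The abstract does not state
    the exact definition used.
    """
    ordered = sorted(scores.values())
    n = len(ordered)
    # bisect_right counts how many sorted values are <= the given score.
    return {j: 100.0 * bisect_right(ordered, s) / n for j, s in scores.items()}


def tertile(percentile):
    """Return 1, 2, or 3 for the 0-33rd, 34th-66th, or 67th-100th centile.

    Assumption: fractional percentiles below 34 fall in tertile 1 and
    those below 67 fall in tertile 2.
    """
    if percentile < 34:
        return 1
    if percentile < 67:
        return 2
    return 3


def split_3_to_1(items, seed=0):
    """Randomly split items into train and test sets at a 3:1 ratio."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Overall classification accuracy: fraction of matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical journals and impact factors, for illustration only.
    impact = {"A": 0.5, "B": 1.0, "C": 2.0, "D": 3.0, "E": 4.0, "F": 5.0}
    labels = {j: tertile(p) for j, p in percentile_ranks(impact).items()}
    print(labels)
    train, test = split_3_to_1(range(8))
    print(len(train), len(test))
```

In a fuller pipeline, each article would inherit the tertile label of its accepting journal, and a classifier trained on the training portion would be scored with `accuracy` on the held-out portion.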
