Evaluating the Ability of Open-Source Artificial Intelligence to Predict Accepting-Journal Impact Factor and Eigenfactor Score Using Academic Article Abstracts: Cross-sectional Machine Learning Analysis

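Impact factor, Computer science, Citation, Bibliometrics, Punctuation, Artificial intelligence, Information retrieval, Preprocessor, Percentile, Logistic regression, Machine learning, Natural language processing, Medical physics, Data mining, Medicine, World Wide Web, Statistics, Mathematics, Law, Political science
Authors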
Carmelo Macri,Stephen Bacchi,Sheng Chieh Teoh,Wan Yin Lim,Lydia Lam,Sandy Patel,Mark Slee,Robert J. Casson,WengOnn Chan
Source
Journal: Journal of Medical Internet Research [JMIR Publications]
Volume/Issue: 25: e42789-e42789
Identifier
DOI:10.2196/42789
Abstract

Background: Strategies to improve the selection of appropriate target journals may reduce delays in disseminating research results. Machine learning is increasingly used in content-based recommender algorithms to guide journal submissions for academic articles.

Objective: We sought to evaluate the performance of open-source artificial intelligence to predict the impact factor or Eigenfactor score tertile using academic article abstracts.

Methods: PubMed-indexed articles published between 2016 and 2021 were identified with the Medical Subject Headings (MeSH) terms “ophthalmology,” “radiology,” and “neurology.” Journals, titles, abstracts, author lists, and MeSH terms were collected. Journal impact factor and Eigenfactor scores were sourced from the 2020 Clarivate Journal Citation Report. The journals included in the study were allocated percentile ranks based on impact factor and Eigenfactor scores, compared with other journals that released publications in the same year. All abstracts were preprocessed, which included the removal of the abstract structure, and combined with titles, authors, and MeSH terms as a single input. The input data underwent preprocessing with the inbuilt ktrain Bidirectional Encoder Representations from Transformers (BERT) preprocessing library before analysis with BERT. Before use for logistic regression and XGBoost models, the input data underwent punctuation removal, negation detection, stemming, and conversion into a term frequency-inverse document frequency array. Following this preprocessing, data were randomly split into training and testing data sets with a 3:1 train:test ratio. Models were developed to predict whether a given article would be published in a first, second, or third tertile journal (0-33rd centile, 34th-66th centile, or 67th-100th centile), as ranked either by impact factor or Eigenfactor score. BERT, XGBoost, and logistic regression models were developed on the training data set before evaluation on the hold-out test data set. The primary outcome was overall classification accuracy for the best-performing model in the prediction of accepting journal impact factor tertile.

Results: There were 10,813 articles from 382 unique journals. The median impact factor and Eigenfactor score were 2.117 (IQR 1.102-2.622) and 0.00247 (IQR 0.00105-0.03), respectively. The BERT model achieved the highest impact factor tertile classification accuracy of 75.0%, followed by an accuracy of 71.6% for XGBoost and 65.4% for logistic regression. Similarly, BERT achieved the highest Eigenfactor score tertile classification accuracy of 73.6%, followed by an accuracy of 71.8% for XGBoost and 65.3% for logistic regression.

Conclusions: Open-source artificial intelligence can predict the impact factor and Eigenfactor score of accepting peer-reviewed journals. Further studies are required to examine the effect on publication success and the time-to-publication of such recommender systems.
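
The labeling and evaluation steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the journal names, and the impact factor values are hypothetical. The percentile-rank definition (share of journals with a score at or below a given journal's score) and the handling of fractional centiles at the 34 and 67 cut points are assumptions, since the abstract does not specify them.

```python
import random


def percentile_ranks(scores):
    """Assumed definition: percentage of journals scoring at or below each journal."""
    n = len(scores)
    return {
        journal: 100.0 * sum(1 for other in scores.values() if other <= score) / n
        for journal, score in scores.items()
    }


def tertile(pct):
    """Map a percentile rank to tertile 1 (0-33rd), 2 (34th-66th), or 3 (67th-100th)."""
    if pct < 34:
        return 1
    if pct < 67:
        return 2
    return 3


def split_3_to_1(items, seed=0):
    """Random 3:1 train:test split (the seed is an illustrative choice)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Overall classification accuracy: fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical journals and impact factors, for illustration only.
impact_factors = {"Journal A": 1.0, "Journal B": 2.0, "Journal C": 3.0, "Journal D": 4.0}
ranks = percentile_ranks(impact_factors)
labels = {journal: tertile(pct) for journal, pct in ranks.items()}
train, test = split_3_to_1(range(8))
```

In this sketch every article would inherit the tertile label of its accepting journal, and a classifier's predictions on the held-out quarter would be scored with `accuracy`.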
