Evaluating the Ability of Open-Source Artificial Intelligence to Predict Accepting-Journal Impact Factor and Eigenfactor Score Using Academic Article Abstracts: Cross-sectional Machine Learning Analysis

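Impact factor, Computer science, Citation, Bibliometrics, Punctuation, Artificial intelligence, Information retrieval, Preprocessor, Percentile, Logistic regression, Machine learning, Natural language processing, Medical physics, Data mining, Medicine, World Wide Web, Statistics, Mathematics, Political science, Law
Authors
Carmelo Macri, Stephen Bacchi, Sheng Chieh Teoh, Wan Yin Lim, Lydia Lam, Sandy Patel, Mark Slee, Robert J. Casson, WengOnn Chan
Source
Journal: Journal of Medical Internet Research, Volume/Issue: 25: e42789
Identifier
DOI: 10.2196/42789
Abstract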

Background: Strategies to improve the selection of appropriate target journals may reduce delays in disseminating research results. Machine learning is increasingly used in content-based recommender algorithms to guide journal submissions for academic articles.

Objective: We sought to evaluate the performance of open-source artificial intelligence to predict the impact factor or Eigenfactor score tertile using academic article abstracts.

Methods: PubMed-indexed articles published between 2016 and 2021 were identified with the Medical Subject Headings (MeSH) terms “ophthalmology,” “radiology,” and “neurology.” Journals, titles, abstracts, author lists, and MeSH terms were collected. Journal impact factor and Eigenfactor scores were sourced from the 2020 Clarivate Journal Citation Report. The journals included in the study were allocated percentile ranks based on impact factor and Eigenfactor scores, compared with other journals that released publications in the same year. All abstracts were preprocessed, which included the removal of the abstract structure, and combined with titles, authors, and MeSH terms as a single input. The input data underwent preprocessing with the inbuilt ktrain Bidirectional Encoder Representations from Transformers (BERT) preprocessing library before analysis with BERT. Before use for logistic regression and XGBoost models, the input data underwent punctuation removal, negation detection, stemming, and conversion into a term frequency-inverse document frequency array. Following this preprocessing, data were randomly split into training and testing data sets with a 3:1 train:test ratio. Models were developed to predict whether a given article would be published in a first, second, or third tertile journal (0-33rd centile, 34th-66th centile, or 67th-100th centile), as ranked either by impact factor or Eigenfactor score. BERT, XGBoost, and logistic regression models were developed on the training data set before evaluation on the hold-out test data set. The primary outcome was overall classification accuracy for the best-performing model in the prediction of accepting journal impact factor tertile.

Results: There were 10,813 articles from 382 unique journals. The median impact factor and Eigenfactor score were 2.117 (IQR 1.102-2.622) and 0.00247 (IQR 0.00105-0.03), respectively. The BERT model achieved the highest impact factor tertile classification accuracy of 75.0%, followed by an accuracy of 71.6% for XGBoost and 65.4% for logistic regression. Similarly, BERT achieved the highest Eigenfactor score tertile classification accuracy of 73.6%, followed by an accuracy of 71.8% for XGBoost and 65.3% for logistic regression.

Conclusions: Open-source artificial intelligence can predict the impact factor and Eigenfactor score of accepting peer-reviewed journals. Further studies are required to examine the effect on publication success and the time-to-publication of such recommender systems.
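
The labeling and evaluation steps described in the Methods can be illustrated with a small standard-library Python sketch. It is not the authors' code. The journal names and scores are invented, the mid-rank percentile definition is one assumed convention among several, and the helper names (`percentile_ranks`, `tertile`, `train_test_split`, `accuracy`) are hypothetical. It shows only how a percentile rank maps to a tertile label, how a 3:1 random split can be drawn, and how overall classification accuracy is computed.

```python
import random


def percentile_ranks(scores):
    """Mid-rank percentile (0-100) of each journal among all journals.

    Assumption: ties share the midpoint of their rank range. The abstract
    does not say which percentile convention was used.
    """
    values = list(scores.values())
    n = len(values)
    ranks = {}
    for name, score in scores.items():
        below = sum(1 for v in values if v < score)
        equal = sum(1 for v in values if v == score)
        ranks[name] = 100.0 * (below + 0.5 * equal) / n
    return ranks


def tertile(pct):
    """Map a percentile to tertile 1, 2, or 3 (lowest third is tertile 1)."""
    if pct < 100 / 3:
        return 1
    if pct < 200 / 3:
        return 2
    return 3


def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle and split; test_fraction=0.25 gives a 3:1 train:test ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Overall classification accuracy: fraction of exact label matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical journals and impact factors, for illustration only.
impact_factors = {"A": 0.5, "B": 1.0, "C": 2.0, "D": 3.0, "E": 5.0, "F": 8.0}
ranks = percentile_ranks(impact_factors)
labels = {name: tertile(pct) for name, pct in ranks.items()}

train, test = train_test_split(range(8))
```

In a full pipeline, each article would inherit the tertile label of its accepting journal, and a classifier fitted on the training portion would be scored with `accuracy` on the held-out portion.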
