BiGRU attention capsule neural network for Persian text classification

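Persian · Computer science · Artificial intelligence · Artificial neural network · Topic (document) · Deep learning · Computational intelligence · Natural language processing · Key (lock) · Machine learning · Linguistics · World Wide Web · Philosophy · Computer security
Authors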
Amir Kenarang,Mehrdad Farahani,Mohammad Manthouri
Source
Journal: Journal of Ambient Intelligence and Humanized Computing [Springer Science+Business Media]
Volume (issue): 13 (8): 3923-3933 Cited by: 6
Identifier
DOI:10.1007/s12652-022-03742-y
Abstract

Text classification plays a significant role in business applications. In news classification, detecting the subject of an article is an important task that can help identify news trends and junk news. Many deep learning algorithms can be applied to text classification. In this paper, several of them are implemented and compared on the task of identifying the subject of texts in a Persian news corpus. The best results belong to the combination of BiGRU with an attention mechanism and CapsNet (BiGRUACaps). The GRU network outperforms LSTM, which is attributed to its fewer gates and, therefore, fewer parameters. In the GRU, information flow is controlled without a separate memory unit, and the network has been shown to perform better when less data is available. Moreover, because news texts contain long sentences, the attention mechanism gives more weight to important words and addresses the problem of long sequences. The most significant obstacle to classifying Persian texts was the lack of a suitable dataset, so one contribution of this work is a scraped dataset of 20,726 records collected from Persian news websites, described as the best Persian news dataset with category labels. Another difficulty was the lack of appropriate pre-trained Persian models, together with the need to combine various neural networks with such models and to determine the optimal model for identifying the subject of Persian text. The use of CapsNet on Persian data was also investigated, with promising results. The comparison shows an improvement in classification performance on Persian texts. The best result, an F-measure of 0.8608, was obtained by BiGRUACaps.
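
The abstract's remark that a GRU has fewer gates, and therefore fewer parameters, than an LSTM can be illustrated with a small count. The following is a minimal sketch, not the paper's model: it assumes the textbook formulation in which each gate-like block has one input weight matrix, one recurrent weight matrix, and one bias vector (some library implementations use two bias vectors per block, which changes the totals slightly). The embedding and hidden sizes are hypothetical and do not come from the paper.

```python
def recurrent_params(input_dim: int, hidden_dim: int, n_blocks: int) -> int:
    """Count trainable parameters of one recurrent layer.

    Assumes each gate-like block holds an input matrix (hidden x input),
    a recurrent matrix (hidden x hidden), and a single bias vector (hidden).
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return n_blocks * per_block


def lstm_params(input_dim: int, hidden_dim: int) -> int:
    # LSTM: input, forget, and output gates plus the candidate cell state.
    return recurrent_params(input_dim, hidden_dim, n_blocks=4)


def gru_params(input_dim: int, hidden_dim: int) -> int:
    # GRU: update and reset gates plus the candidate hidden state.
    return recurrent_params(input_dim, hidden_dim, n_blocks=3)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    embedding_dim, hidden_dim = 300, 128
    gru = gru_params(embedding_dim, hidden_dim)
    lstm = lstm_params(embedding_dim, hidden_dim)
    # A bidirectional layer runs two independent directions, doubling the count.
    print(f"BiGRU parameters:  {2 * gru}")
    print(f"BiLSTM parameters: {2 * lstm}")
    print(f"GRU / LSTM ratio:  {gru / lstm:.2f}")
```

Under these assumptions the GRU layer always has three quarters of the LSTM layer's parameters, whatever the input and hidden sizes, which is consistent with the abstract's explanation of why the smaller model may do better when training data is limited.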
