BiGRU attention capsule neural network for Persian text classification

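Keywords: Persian, Computer science, Artificial intelligence, Artificial neural network, Subject (documents), Deep learning, Computational intelligence, Natural language processing, Key (lock), Machine learning, Linguistics, World Wide Web, Computer security, Philosophy
Authors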
Amir Kenarang,Mehrdad Farahani,Mohammad Manthouri
Source
Journal: Journal of Ambient Intelligence and Humanized Computing [Springer Nature]
Volume (issue): 13 (8): 3923-3933 Cited by: 6
Identifier
DOI:10.1007/s12652-022-03742-y
Abstract

Text classification plays a significant role in business. In news classification, detecting the subject of an article is an important task that supports the recognition of news trends and junk news. Many deep learning algorithms exist for text classification. In this paper, several of them are implemented and compared for identifying the subject of texts in a Persian news corpus. The best results are obtained by a BiGRU combined with an attention mechanism and CapsNet (BiGRUACaps). The GRU network outperforms LSTM because it has fewer gates and therefore fewer parameters. The GRU controls information flow without a separate memory unit and has been shown to perform better when less data is available. Moreover, because news texts contain long sentences, the attention mechanism gives important words more weight and addresses the difficulty of long sequences. The most significant problem in classifying Persian texts was the lack of a suitable dataset. One contribution of this work is a scraped dataset: 20,726 records collected from Persian news websites, forming the best Persian news dataset with category labels. A further problem was the lack of appropriate pre-trained Persian models, together with the need to combine various neural networks with such models and to determine the optimal model for identifying the subject of Persian text. The use of CapsNet on Persian data is also examined, with exciting results. The comparison shows improved classification performance on Persian texts. The best result was obtained by the BiGRUACaps combination, with an F-measure of 0.8608.
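
As a rough illustration of the parameter argument in the abstract, the sketch below counts trainable weights for a single LSTM layer and a single GRU layer. It assumes the textbook formulation, in which every gate or candidate transform has one input matrix, one recurrent matrix, and one bias vector; framework implementations that keep separate input and recurrent biases give slightly different totals. The function names and the dimensions (300 and 128) are hypothetical and are not taken from the paper.

```python
def recurrent_params(input_dim: int, hidden_dim: int, blocks: int) -> int:
    """Count weights and biases for `blocks` gate/candidate transforms.

    Each transform has an input matrix (hidden x input), a recurrent
    matrix (hidden x hidden), and one bias vector (hidden).
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return blocks * per_block


def lstm_params(input_dim: int, hidden_dim: int) -> int:
    # LSTM: input, forget, and output gates plus the candidate cell state.
    return recurrent_params(input_dim, hidden_dim, blocks=4)


def gru_params(input_dim: int, hidden_dim: int) -> int:
    # GRU: update and reset gates plus the candidate hidden state.
    return recurrent_params(input_dim, hidden_dim, blocks=3)


if __name__ == "__main__":
    d, h = 300, 128  # hypothetical embedding and hidden sizes
    # A bidirectional layer runs two independent copies, so each count doubles.
    print("BiLSTM:", 2 * lstm_params(d, h))
    print("BiGRU: ", 2 * gru_params(d, h))
```

Under these assumptions a GRU layer always has three quarters of the parameters of an LSTM layer of the same size, which is consistent with the abstract's remark that the GRU copes better when less training data is available.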