HateCircle and Unsupervised Hate Speech Detection Incorporating Emotion and Contextual Semantics

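Bengali; Hindi; Computer science; Emotion detection; Lexicon; Natural language processing; Artificial intelligence; Social media; Semantics (computer science); Coding (set theory); Voice activity detection; Speech recognition; Speech processing; Emotion recognition; World Wide Web; Programming language; Set (abstract data type)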
Authors
Sayani Ghosal, Amita Jain
Source
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing, Volume (Issue): 22 (4): 1-28, Cited by: 18
Identifier
DOI:10.1145/3576913
Abstract

The explosive growth of social media has fueled an extensive increase in online freedom of speech. This worldwide platform for the human voice lets users assail others without facing any consequences and flout social etiquette, resulting in an inevitable increase in hate speech. English hate speech detection is now a popular research area, but the prevalence of implicit hate content in regional languages demands effective language-independent models. The proposed research is the first unsupervised Hindi and Bengali hate content detection framework, consisting of three significant concepts: HateCircle, hate tweet classification, and code-switch data preparation algorithms. The novel HateCircle method detects the hate orientation of each term from co-occurrence patterns of words, contextual semantics, and emotion analysis. The multiclass hate tweet classification algorithm uses parts-of-speech tagging, Euclidean distance, and the geometric median. Because detection of hate content is more efficient in the native script than in the Roman script, a transliteration algorithm is also proposed for code-switch data preparation. The experiments evaluate combinations of various lexicons with our enriched hate lexicon, which achieve a maximum F1-score of 0.74 on the Hindi dataset and 0.88 on the Bengali dataset. The HateCircle and hate tweet detection framework is evaluated with our proposed parts-of-speech tagging and geometric median detection methods. Results reveal that the framework also achieves a maximum accuracy of 0.73 on the Hindi dataset and 0.78 on the Bengali dataset. The experimental results signify that contextual-semantic hate speech detection research with language independence offsets the growth of implicit abusive text in social media.
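
The abstract names Euclidean distance and the geometric median as ingredients of the tweet classifier but does not say how they are combined. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: it assumes each class is summarized by the geometric median of some hypothetical term vectors (computed here with Weiszfeld's iteration), and that a tweet vector is assigned to the class whose median is nearest in Euclidean distance. The function names, the two-dimensional vectors, and the class labels are all illustrative.

```python
import math


def geometric_median(points, eps=1e-12, tol=1e-9, max_iter=1000):
    """Approximate the point minimizing the sum of Euclidean distances
    to `points`, using Weiszfeld's iteration (distances clamped at eps)."""
    dim = len(points[0])
    # Start from the centroid (arithmetic mean) of the points.
    current = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(max_iter):
        weight_sum = 0.0
        numer = [0.0] * dim
        for p in points:
            # Clamp the distance to avoid dividing by zero when the
            # estimate lands exactly on a data point.
            w = 1.0 / max(math.dist(p, current), eps)
            weight_sum += w
            for i in range(dim):
                numer[i] += w * p[i]
        updated = [n / weight_sum for n in numer]
        if math.dist(updated, current) < tol:
            return updated
        current = updated
    return current


def nearest_class(vector, class_medians):
    """Return the label whose median is closest to `vector` (Euclidean)."""
    return min(class_medians, key=lambda label: math.dist(vector, class_medians[label]))


# Hypothetical 2-D term vectors per class (illustrative values only).
class_terms = {
    "hate": [(0.9, 0.8), (1.0, 0.9), (0.8, 1.0)],
    "neutral": [(0.1, 0.0), (0.0, 0.2), (0.2, 0.1)],
}
class_medians = {label: geometric_median(vs) for label, vs in class_terms.items()}
print(nearest_class((0.85, 0.9), class_medians))
```

The geometric median is used here instead of the mean because it is less sensitive to outlying vectors; whether the paper relies on it for the same reason is not stated in the abstract.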
