Computer science
Interpretability
Artificial intelligence
Natural language processing
Sentence
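Semantics (computer science)
Graph
Encoder
Hyperlink
Encoding
Information retrieval
World Wide Web
Theoretical computer science
Web page
Biochemistry
Chemistry
Gene
Programming language
Operating system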
Authors
Tong Liu, Ke Yu, Lu Wang, Xuanyu Zhang, Zhou Hao, Xiaofei Wu
Identifier
DOI:10.1016/j.knosys.2022.108605
Abstract
Online social media contains a large amount of clickbait, which uses tricks such as curiosity-arousing words and carefully designed sentence structures to attract users to click on hyperlinks for unknown benefits. Clickbait detection aims to identify these hyperlinks with automated algorithms. Previous research has usually focused on the semantic information of English clickbait corpora. In this paper, we construct a Chinese WeChat clickbait dataset and propose an effective deep learning method, multiple features for WeChat clickbait detection (MFWCD), which integrates semantic, syntactic and auxiliary information. Based on the MFWCD framework, we propose two models with different parameter scales, MFWCD-BERT and MFWCD-BiLSTM, which encode title semantics with Bidirectional Encoder Representations from Transformers (BERT) and a lightweight Bidirectional Long Short-Term Memory (Bi-LSTM) network with an attention mechanism, respectively. In addition, we propose an improved Graph Attention Network (GAT) to aggregate the local syntactic structures of titles and use an attention mechanism to capture valuable structures. Finally, an auxiliary feature related to user reading behavior is introduced to obtain a richer title representation. Extensive experiments demonstrate the effectiveness and interpretability of MFWCD for clickbait detection, and it outperforms the compared baseline methods.
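The abstract mentions an attention mechanism over encoded title tokens. As a rough illustration of that general idea only, the sketch below pools a sequence of token vectors into one title vector using softmax-normalized dot-product scores. It is not the authors' MFWCD implementation: the function names, the dot-product scoring and the toy vectors are all assumptions made for this example, and a real model would learn the query vector and take its token states from a Bi-LSTM or BERT encoder.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    states: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Pool token vectors into one vector (hypothetical helper).

    Each token state is scored by its dot product with `query`
    (an assumed stand-in for a learned parameter); the scores are
    normalized with softmax and used as weights in a weighted sum.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, states))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    # Toy 2-dimensional "token states" for a three-token title.
    toy_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    toy_query = [1.0, 1.0]
    pooled, weights = attention_pool(toy_states, toy_query)
    print("weights:", weights)
    print("pooled:", pooled)
```

The returned weights show which tokens contribute most to the pooled vector, which is the kind of signal attention-based models typically inspect for interpretability.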