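Baseline (sea)
Benchmark (surveying)
Computer science
Artificial intelligence
Natural language processing
Machine learning
Geodesy
Oceanography
Geology
Geography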
Authors
Insik Chung, Chuan-Jie Lin
Identifier
DOI:10.1109/iri51335.2021.00069
Abstract
This paper introduces TOCAB, a larger dataset for Chinese abusive language detection and classification. The dataset contains 121,344 real sentences collected from a social media site. Several baseline systems built with machine learning or deep learning methods are proposed to evaluate this benchmark. BERT is the best baseline system, achieving F1-scores of 0.886 in detection and 0.781 in classification. A bootstrap aggregating BERT model, a state-of-the-art system, outperforms our BERT baseline, with F1-scores of 0.893 in detection and 0.782 in classification.
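For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of bootstrap aggregating (bagging) and of an F1-score, written in plain Python. It is not the authors' implementation: the function names, the toy labels, and the stand-in "model" predictions are all hypothetical, and the abstract does not say which F1 variant (binary, macro, or micro) was reported, so the binary form is assumed here.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bootstrap replicate)."""
    return rng.choices(data, k=len(data))


def majority_vote(predictions_per_model):
    """Aggregate per-model label lists into one list by majority vote."""
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the top label
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall (assumed variant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: 1 = abusive, 0 = not abusive (hypothetical labels).
y_true = [1, 0, 1, 1, 0]

# Each training set would be a bootstrap replicate of the original data;
# in a real system one model would be fine-tuned per replicate.
rng = random.Random(0)
replicate = bootstrap_sample(list(range(len(y_true))), rng)

# Stand-in predictions from three hypothetical models on the test set.
model_predictions = [
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

ensemble = majority_vote(model_predictions)
single_f1 = f1_score(y_true, model_predictions[0])
ensemble_f1 = f1_score(y_true, ensemble)
```

Using an odd number of models avoids ties in the two-class detection setting; a multi-class classification setting would need an explicit tie-breaking rule, which this sketch does not define.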