Authors
Chengzhe Yuan,Zekai Zhou,Feiyi Tang,Ronghua Lin,Chengjie Mao,Luyao Teng
Identifier
DOI:10.1007/978-981-99-7254-8_3
Abstract
In the Semi-Supervised Text Classification (SSTC) task, the performance of SSTC-based models relies heavily on the accuracy of the pseudo-labels for unlabeled data, which is difficult to guarantee in real-world scenarios. Prompt learning has recently proved effective in alleviating the low-accuracy problem caused by limited labeled data in SSTC. In this paper, we present a Pattern Exploiting Training with Unsupervised Data Augmentation (PETUDA) method to address SSTC under the limited-label setting. We first exploit the potential of pre-trained language models (PLMs) using prompt learning: we convert the text classification task into a cloze-style task and use the masked-prediction ability of the PLMs to predict the categories. Then, we use a variety of data augmentation methods to enhance model performance with unlabeled data, and introduce a consistency loss into the training process to make full use of the unlabeled data. Finally, we conduct extensive experiments on three text classification benchmark datasets. Empirical results show that PETUDA consistently outperforms the baselines in all cases.
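The recipe in the abstract (a cloze-style prompt for the PLM, plus a consistency loss between predictions on original and augmented unlabeled text) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the template, label words, KL-based consistency term, and loss weighting `lam` are all assumptions for the sake of the example.

```python
import numpy as np

# Hypothetical cloze template and verbalizer: the PLM fills [MASK],
# and each label word maps the masked prediction back to a class id.
def to_cloze(text):
    return f"{text} Topic: [MASK]."

VERBALIZER = {"sports": 0, "politics": 1}  # label word -> class id (illustrative)

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # mean negative log-likelihood of the gold labels (supervised term)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def kl_consistency(p_orig, p_aug):
    # KL(p_orig || p_aug) averaged over the unlabeled batch:
    # pushes predictions on augmented text toward those on the original text
    return np.mean(
        np.sum(p_orig * (np.log(p_orig + 1e-12) - np.log(p_aug + 1e-12)), axis=-1)
    )

def petuda_style_loss(labeled_logits, labels, unlab_logits, unlab_aug_logits, lam=1.0):
    # total loss = supervised cross-entropy + lam * unsupervised consistency
    sup = cross_entropy(softmax(labeled_logits), labels)
    cons = kl_consistency(softmax(unlab_logits), softmax(unlab_aug_logits))
    return sup + lam * cons
```

When the model's predictions on an augmented example match those on the original, the consistency term vanishes and only the supervised term remains; any disagreement introduced by augmentation adds a penalty, which is how the unlabeled data contributes to training.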