Tibetan Text Classification based on Prompt Learning and Ensemble Learning
Ensemble Learning
Computer Science
Artificial Intelligence
Machine Learning
Authors
Chao Tang, Zheng-Hua Tan, Xiaobing Zhao
Source
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Date: 2025-01-21
Identifier
DOI:10.1145/3711827
Abstract
With the advancement of pre-trained language models, prompt learning has emerged as a trend for text classification. It offers several advantages over traditional machine learning methods, particularly for low-resource natural language processing tasks. Prompt learning enables fine-tuning of pre-trained language models on relatively small datasets, eliminating the need for a large number of expensive labeled samples. This paper proposes an effective approach that combines prompt learning and ensemble learning, aiming to enhance the performance of individual language models on Tibetan text classification tasks. The approach takes a different perspective by recasting the traditional text classification problem as an entailment-relationship exploration. Instead of directly assigning categories to sentences, the model judges a sentence's category by whether the sentence is entailed by a given prompt. This method allows the strengths of prompt learning and ensemble learning to be leveraged simultaneously, resulting in improved classification accuracy. To evaluate the effectiveness of the approach, this paper conducts extensive experiments on two public datasets, TNCC and WCM. The experimental results demonstrate the effectiveness of the method, achieving weighted F1 scores of 72.72% and 78% on TNCC and WCM, respectively. Compared to traditional machine learning methods, this prompt learning approach exhibits a significant 10% performance gain in low-resource natural language processing tasks.
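The core idea described in the abstract can be sketched in a few lines: each candidate label is turned into a prompt, an entailment scorer rates how well the sentence supports that prompt, several scorers are ensembled by soft voting, and the highest-scoring label wins. The sketch below is illustrative only; the template, the toy keyword scorer, and all names are assumptions, standing in for the paper's actual Tibetan prompt templates and fine-tuned pre-trained models.

```python
# Sketch: text classification recast as entailment, with a soft-voting
# ensemble of scorers. The prompt template and the keyword-based scorer
# are hypothetical stand-ins for the paper's real models.

def build_prompt(sentence: str, label: str) -> str:
    # Hypothetical English template; the paper uses Tibetan prompts.
    return f"{sentence} This text is about {label}."

def classify(sentence, labels, scorers):
    """Pick the label whose prompt gets the highest mean entailment score."""
    avg = {
        label: sum(s(sentence, label) for s in scorers) / len(scorers)
        for label in labels
    }
    return max(avg, key=avg.get)

# Toy stand-in for an NLI model's entailment probability: high score when
# a keyword associated with the label appears in the sentence.
TOPIC_KEYWORDS = {
    "sports": ["match", "team", "score"],
    "politics": ["election", "vote", "policy"],
}

def keyword_scorer(sentence, label):
    hit = any(k in sentence.lower() for k in TOPIC_KEYWORDS[label])
    return 1.0 if hit else 0.1

label = classify(
    "The team won the match in the final minute.",
    ["sports", "politics"],
    [keyword_scorer],  # a real ensemble would list several fine-tuned models
)
print(label)  # -> sports
```

In the actual method, each scorer would be a separate fine-tuned pre-trained language model producing an entailment probability, and the ensemble averages (or votes over) those probabilities across models.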