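Computer science
Lexical analysis
Sentiment analysis
Natural language processing
Artificial intelligence
Encoder
Stop words
Slang
Lexicographic order
Preprocessor
Linguistics
Philosophy
Mathematics
Combinatorics
Operating system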
Authors
Gokul Yenduri, B. R. Rajakumar, K. Praghash, D. Binu
Identifier
DOI: 10.1142/s1469026821500152
Abstract
The identification of opinions and sentiments from tweets is termed “Twitter Sentiment Analysis (TSA)”. The main task of TSA is to determine the sentiment or polarity of a tweet and then classify it as negative or positive. Several methods have been introduced for TSA; however, it remains challenging due to slang words, modern accents, grammatical and spelling mistakes, and other issues that existing techniques have not solved. This work develops a novel customized BERT-oriented sentiment classification that encompasses two main phases: pre-processing and tokenization, and a “Customized Bidirectional Encoder Representations from Transformers (BERT)”-based classification. First, the gathered raw tweets are pre-processed by stop-word removal, stemming and blank space removal. After pre-processing, the semantic words are obtained, from which the meaningful words (tokens) are extracted in the tokenization phase. These extracted tokens are then classified via optimized BERT, whose biases and weights are tuned optimally by Particle-Assisted Circle Updating Position (PA-CUP). Moreover, the maximal sequence length of the BERT encoder is updated using standard PA-CUP. Finally, a performance analysis is carried out to substantiate the improvement offered by the proposed model.
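The pre-processing phase named in the abstract (stop-word removal, stemming, blank space removal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the suffix-stripping `naive_stem` helper, and the `preprocess` function are all hypothetical stand-ins chosen for illustration, and the BERT classification and PA-CUP tuning stages are not shown.

```python
import re

# Tiny illustrative stop-word list (an assumption; the abstract does not give one).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "it", "this", "i"}


def naive_stem(word):
    """Strip a few common English suffixes.

    A crude stand-in for a real stemmer; the abstract does not say
    which stemming algorithm is used.
    """
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweet):
    """Return a list of stemmed, non-stop-word tokens for one tweet."""
    text = tweet.lower()
    text = re.sub(r"[^a-z\s]", " ", text)       # drop digits and punctuation
    text = re.sub(r"\s+", " ", text).strip()    # blank space removal
    words = [w for w in text.split(" ") if w and w not in STOP_WORDS]
    return [naive_stem(w) for w in words]       # stemming


if __name__ == "__main__":
    print(preprocess("The   movie was  amazing, I loved it!"))
```

In a fuller pipeline the resulting tokens would be passed to a BERT tokenizer and classifier; that step is omitted here because the abstract does not specify its configuration.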