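Categories
Computer science
Artificial intelligence
Feature selection
Natural language processing
Preprocessor
Set (abstract data type)
Support vector machine
Feature (linguistics)
Text categorization
Information retrieval
Sentiment analysis
Text mining
Rough set
Linguistics
Philosophy
Programming language
Authors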
Saif Ali Abd Alradha Alsaidi,Ahmed T. Sadiq,Hasanen S. Abdullah
Source
Journal: Bulletin of Electrical Engineering and Informatics
[Institute of Advanced Engineering and Science]
Date: 2020-08-01
Volume (issue): 9 (4): 1701-1710
Citations: 12
Identifier
DOI:10.11591/eei.v9i4.1898
Abstract
In recent years, text mining has become an important topic because of the growth of digital text data from many sources, such as government documents, email, social media, and websites. English poems are one such kind of text data, and categorizing them calls for text categorization, a method that classifies documents into one or more predefined categories based on their text content. In this paper we address the problem of assigning an English poem to one of the English poem categories by using text mining techniques and a machine learning algorithm. Our data set consists of seven poem categories and is divided into two parts: training (learning) data and testing data. In the proposed model we apply text preprocessing to the document files to reduce the number of features and the dimensionality. The preprocessing converts each poem's text to features and removes irrelevant features using text mining steps (tokenization, stop-word removal, and stemming). To reduce the feature vector of the remaining features we use two feature selection methods, and we use rough set theory as the machine learning algorithm to perform the categorization. The proposed model achieves 88% classification success.
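
The three preprocessing steps named in the abstract (tokenization, stop-word removal, stemming) can be illustrated with a minimal Python sketch. The abstract does not give the authors' stop-word list or stemmer, so `STOP_WORDS`, `SUFFIXES`, and all function names here are assumptions chosen for illustration; the suffix stripper is a toy stand-in, not an established stemming algorithm such as Porter's.

```python
import re

# Illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to",
              "is", "are", "was", "were"}

# Hypothetical suffixes for a toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop-word removal, and stemming in sequence."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The birds were singing and the flowers opened"))
# → ['bird', 'sing', 'flower', 'open']
```

The resulting stems would then feed the feature selection and rough set classification stages, which the abstract does not describe in enough detail to sketch.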