A fuzzy C-means algorithm for optimizing data clustering

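Computer science; Cluster analysis; Canopy clustering algorithm; CURE data clustering algorithm; Fuzzy clustering; Data mining; Initialization; Big data; Correlation clustering; Correctness; Data stream clustering; Artificial intelligence; Algorithm; Programming language
Authors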
Seyed Emadedin Hashemi,Fatemeh Gholian-Jouybari,Mostafa Hajiaghaei–Keshteli
Source
Journal: Expert Systems With Applications [Elsevier BV]
Volume: 227, article 120377; cited by: 78
Identifier
DOI:10.1016/j.eswa.2023.120377
Abstract

Big data has increasingly become predominant in many research fields affecting human knowledge, including medicine and engineering. Cluster analysis, or clustering, is widely recognized as one of the most effective processes to deal with various types of data, especially big data. There has been considerable interest in Fuzzy C-Means (FCM) as a method for clustering data using a short-distance approach in data mining. However, despite its simplicity, this method is not suitable for clustering large data sets due to their complex structure. In particular, FCM is sensitive to cluster center initialization, and an improper initialization can result in slow or non-optimal convergence. In order to solve the FCM convergence problem and find more appropriate cluster centers, optimization methods are generally used. In this study, a whale optimization algorithm is applied to solve the problem. As a solution to the problem of big data clustering, random sampling, clustering on samples, and extending the clustering results to all data are proposed. The proposed algorithm is implemented on several large data sets, both artificial and real, with many features after normalization and standardization. To verify the validity and correctness of the performance of the proposed algorithm, the same data sets have been clustered with other known algorithms, and the results compared using several valid fuzzy indices. Based on the comparison results, it can be concluded that the proposed algorithm is more powerful and efficient than other algorithms and, hence, can be used to effectively cluster large data sets. Our study can benefit organizations and managers who have a large amount of data and are unable to classify or make use of them properly. Using big data takes a lot of time. The features of the proposed algorithm would be of great help to managers allowing them to make better decisions and improve the quality of their work.
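
The sample-then-extend pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard textbook FCM update rules, and the whale optimization step is not reproduced. Starting centers are either drawn at random from the sample or supplied through a hypothetical `init` parameter, which marks where an optimizer's output could be plugged in. The abstract does not say how results are extended to all data; the sketch assumes memberships are computed against the centers found on the sample. All function and parameter names are illustrative.

```python
import random


def memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster (fuzzifier m > 1)."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, ctr)) for ctr in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # u_i = (1/d_i^2)^(1/(m-1)) / sum_j (1/d_j^2)^(1/(m-1))
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def update_centers(data, u, m=2.0):
    """Each center is the mean of the data weighted by membership^m."""
    dim = len(data[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(
            [sum(wk * x[j] for wk, x in zip(w, data)) / total for j in range(dim)]
        )
    return centers


def fcm(data, c, m=2.0, init=None, tol=1e-6, max_iter=200, seed=0):
    """Plain FCM. Assumes data holds at least c distinct points.

    init is a hypothetical hook for externally chosen starting centers
    (for example from an optimizer); by default centers are random data points.
    """
    rng = random.Random(seed)
    if init is not None:
        centers = [list(p) for p in init]
    else:
        centers = [list(p) for p in rng.sample(data, c)]
    for _ in range(max_iter):
        u = [memberships(x, centers, m) for x in data]
        new_centers = update_centers(data, u, m)
        shift = max(
            abs(a - b)
            for old, new in zip(centers, new_centers)
            for a, b in zip(old, new)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers


def cluster_by_sampling(data, c, sample_size, m=2.0, init=None, seed=0):
    """Cluster a random sample, then extend memberships to every point."""
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    centers = fcm(sample, c, m=m, init=init, seed=seed)
    # Assumed extension step: memberships against the sample-derived centers.
    u_all = [memberships(x, centers, m) for x in data]
    return centers, u_all


if __name__ == "__main__":
    demo_rng = random.Random(42)
    blob_a = [[demo_rng.gauss(0, 0.5), demo_rng.gauss(0, 0.5)] for _ in range(500)]
    blob_b = [[demo_rng.gauss(5, 0.5), demo_rng.gauss(5, 0.5)] for _ in range(500)]
    centers, u_all = cluster_by_sampling(blob_a + blob_b, c=2, sample_size=100)
    print(sorted(centers))
```

Only the sample goes through the iterative loop, so the cost of each iteration depends on the sample size rather than the full data set; the full data set is touched once, in the final membership pass.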