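Cluster analysis
Hamming distance
Similarity (geometry)
Computer science
Object (grammar)
k-medians clustering
Data mining
Cluster (spacecraft)
Single-linkage clustering
Selection (genetic algorithm)
Complete-linkage clustering
Pattern recognition (psychology)
Artificial intelligence
Fuzzy clustering
CURE data clustering algorithm
Algorithm
Image (mathematics)
Programming language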
Authors
Hassan Rezaei,Negin Daneshpour
Identifier
DOI:10.1016/j.patcog.2023.109815
Abstract
Measuring the degree of similarity between objects is one of the challenges of mixed data clustering. This article attempts to design a more efficient method by innovating in three important parts of clustering. In the general method, the assignment of data objects to a cluster takes into account the "number of similar features" in addition to the distance. When an object's distances to several cluster centers are equal or close, the cluster center with the highest number of features similar to that object is considered the appropriate one. This method is more accurate than the Hamming distance. To determine the cluster centers, a more suitable object is identified with a distance-based method instead of random selection. On three datasets, the proposed algorithm achieved an accuracy at least two percent higher than the other algorithms.
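The assignment rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance function, the closeness threshold, or how "similar" is defined for numeric features, so `mixed_distance`, `similar_features`, `eps`, and `tol` below are all hypothetical choices made for illustration (squared Euclidean distance on numeric features plus a mismatch count on categorical ones).

```python
from typing import List, Sequence


def mixed_distance(x: Sequence, c: Sequence,
                   num_idx: List[int], cat_idx: List[int]) -> float:
    """Placeholder mixed distance (an assumption, not the paper's measure):
    squared Euclidean on numeric features plus mismatch count on categorical ones."""
    numeric = sum((x[i] - c[i]) ** 2 for i in num_idx)
    categorical = sum(1 for i in cat_idx if x[i] != c[i])
    return numeric + categorical


def similar_features(x: Sequence, c: Sequence,
                     num_idx: List[int], cat_idx: List[int],
                     tol: float = 0.1) -> int:
    """Count features on which x and c agree. Categorical: exact match.
    Numeric: within `tol` (a hypothetical definition of 'similar')."""
    count = sum(1 for i in num_idx if abs(x[i] - c[i]) <= tol)
    count += sum(1 for i in cat_idx if x[i] == c[i])
    return count


def assign(x: Sequence, centers: List[Sequence],
           num_idx: List[int], cat_idx: List[int],
           eps: float = 0.05, tol: float = 0.1) -> int:
    """Return the index of the cluster center for x.
    Centers whose distance is within `eps` of the smallest distance are treated
    as 'equal or close'; among them, the one sharing the most similar features wins."""
    dists = [mixed_distance(x, c, num_idx, cat_idx) for c in centers]
    best = min(dists)
    candidates = [k for k, d in enumerate(dists) if d - best <= eps]
    # Prefer more similar features, then smaller distance, then lower index.
    return max(
        candidates,
        key=lambda k: (similar_features(x, centers[k], num_idx, cat_idx, tol),
                       -dists[k], -k),
    )


# Toy data: two numeric features followed by two categorical features.
NUM, CAT = [0, 1], [2, 3]
x = (0.0, 0.0, "red", "small")
centers = [
    (0.7, 0.7, "red", "small"),   # slightly closer, but only 2 similar features
    (1.0, 0.0, "red", "small"),   # slightly farther, but 3 similar features
    (5.0, 5.0, "blue", "large"),  # far away, never a candidate
]
chosen = assign(x, centers, NUM, CAT)
```

In this toy example the first two centers are nearly equidistant from `x`, so the tie-break selects the second one because it shares more features with the object; with `eps=0.0` the rule reduces to plain nearest-center assignment.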