A novel method for K-Means clustering algorithm

Jinguo Zhao

School of Computer and Information Science, Hunan Institute of Technology, Hunan, 421002, China

This paper investigates the K-means algorithm, a well-known clustering algorithm. K-means has several shortcomings, and this study addresses one of them: the clusters it produces do not always contain all the correct members, because errors arise during the clustering process. The purpose of this research was to decrease the error rate of the K-means clustering algorithm and to reduce the number of iterations it needs to run. A novel method is proposed for calculating the distance between cluster members and the cluster centre. To evaluate the proposed algorithm, seven well-known data sets were used: Balance, Blood, Breast, Glass, Iris, Pima and Wine. The evaluation showed that the proposed method improved the performance of K-means, produced valid clusters, and reduced the error rate, run time and number of iterations.
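For orientation, the quantities named in the abstract (member-to-centre distance, iteration count, error rate) can be illustrated with a minimal sketch of the standard K-means baseline. This is not the method proposed in the paper: the sketch assumes plain Euclidean distance, caller-supplied initial centres, and an error rate defined by majority-class voting per cluster, none of which are specified in the abstract. The names `euclidean`, `kmeans` and `error_rate` and the toy data are hypothetical.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between a cluster member p and a centre q (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, init_centres, max_iter=100):
    """Standard (Lloyd-style) K-means baseline.

    Returns (centres, labels, iterations), where iterations counts the
    assignment passes performed until the labels stop changing.
    """
    centres = [list(c) for c in init_centres]
    k = len(centres)
    labels = [None] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centres[c])) for p in points
        ]
        if new_labels == labels:
            return centres, labels, iteration
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels, max_iter


def error_rate(labels, truth):
    """Fraction of points that disagree with the majority class of their cluster.

    Assumed definition for illustration; the abstract does not define it.
    """
    correct = 0
    for c in set(labels):
        classes = [t for lab, t in zip(labels, truth) if lab == c]
        correct += Counter(classes).most_common(1)[0][1]
    return 1 - correct / len(labels)


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.9, 1.1), (4.9, 5.1)]
truth = ["a", "b", "a", "b", "a", "b"]
centres, labels, iterations = kmeans(points, init_centres=points[:2])
print(labels, iterations, error_rate(labels, truth))
```

With a baseline of this kind, a modified distance function could be swapped in for `euclidean` and the resulting iteration count and error rate compared against the standard version on the same data.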
