Research Article

Effect of Distance Functions on Simple K-means Clustering Algorithm

by Richa Loohach, Kanwal Garg
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 49 - Issue 6
Published: July 2012
Authors: Richa Loohach, Kanwal Garg
10.5120/7629-0698

Richa Loohach, Kanwal Garg. Effect of Distance Functions on Simple K-means Clustering Algorithm. International Journal of Computer Applications. 49, 6 (July 2012), 7-9. DOI=10.5120/7629-0698

@article{10.5120/7629-0698,
  author    = {Richa Loohach and Kanwal Garg},
  title     = {Effect of Distance Functions on Simple K-means Clustering Algorithm},
  journal   = {International Journal of Computer Applications},
  year      = {2012},
  volume    = {49},
  number    = {6},
  pages     = {7-9},
  doi       = {10.5120/7629-0698},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2012
%A Richa Loohach
%A Kanwal Garg
%T Effect of Distance Functions on Simple K-means Clustering Algorithm
%J International Journal of Computer Applications
%V 49
%N 6
%P 7-9
%R 10.5120/7629-0698
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Cluster analysis is an important step in data mining. This paper discusses the k-means clustering algorithm and the distance functions used with it, such as the Euclidean distance function and the Manhattan distance function. Experimental results show the effect of the Manhattan and Euclidean distance functions on the k-means clustering algorithm. The results also show that the choice of distance function affects the size of the clusters formed.
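
As a rough illustration of the comparison described in the abstract, the following sketch runs a simple k-means loop with a pluggable distance function and reports the resulting cluster sizes. It is not the authors' implementation or experimental setup: the function names (euclidean, manhattan, kmeans, cluster_sizes), the random initialisation, the seed, and the toy data are all assumptions made for this example. It also assumes the centroid is updated as the mean under both distances; a median update is often paired with the Manhattan distance instead. For points a and b, the Euclidean distance is the square root of the sum of squared coordinate differences, and the Manhattan distance is the sum of absolute coordinate differences.

```python
import random


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan (L1, city-block) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(points, k, distance, max_iter=100, seed=0):
    """Simple k-means sketch: assign by `distance`, update centroids as means.

    Assumption: initial centroids are k distinct points drawn at random.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def cluster_sizes(labels, k):
    """Number of points assigned to each of the k clusters."""
    return [labels.count(j) for j in range(k)]


if __name__ == "__main__":
    # Toy data (hypothetical): two small, well-separated groups.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for name, dist in (("Euclidean", euclidean), ("Manhattan", manhattan)):
        labels, _ = kmeans(data, 2, dist)
        print(name, cluster_sizes(labels, 2))
```

On real data the two distance functions can assign border points to different clusters, which is how the cluster sizes can come to differ between runs that are otherwise identical.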

Index Terms
Computer Science
Information Sciences
Keywords

K-means clustering, distance functions, clustering, Euclidean distance function, Manhattan distance function
