International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 153 - Issue 3
Published: Nov 2016
Authors: Anju, Preeti Gulia
Anju, Preeti Gulia. Clustering in Big Data: A Review. International Journal of Computer Applications. 153, 3 (Nov 2016), 44-47. DOI=10.5120/ijca2016911994
@article{ 10.5120/ijca2016911994,
  author = { Anju and Preeti Gulia },
  title = { Clustering in Big Data: A Review },
  journal = { International Journal of Computer Applications },
  year = { 2016 },
  volume = { 153 },
  number = { 3 },
  pages = { 44-47 },
  doi = { 10.5120/ijca2016911994 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2016
%A Anju
%A Preeti Gulia
%T Clustering in Big Data: A Review
%J International Journal of Computer Applications
%V 153
%N 3
%P 44-47
%R 10.5120/ijca2016911994
%I Foundation of Computer Science (FCS), NY, USA
Big data [1] is a term for data sets that are so large or complex that traditional data processing [4] applications are inadequate. Accuracy in big data may lead to more confident decision making, and better decisions can result in greater operational efficiency, cost reduction, and reduced risk. Various algorithms and techniques, such as classification, clustering, regression, artificial intelligence, neural networks, association rules, decision trees, genetic algorithms, and the nearest neighbor method, are used for knowledge discovery from databases. A cluster is a group of objects that belong to the same class. In other words, similar objects are grouped in one cluster and dissimilar objects are grouped in other clusters. Clustering methods can be classified into partitioning, hierarchical, and density-based methods. Cluster analysis is used in several applications, such as market research, pattern recognition, and data analysis. K-means clustering is a well-known partitioning method, but it suffers from the empty-cluster problem, in which a cluster ends up with no objects assigned to it. The problems with existing systems [6] are analysis, capture, search, sharing, storage, transfer, visualization, and querying and updating. These problems can be reduced by using the proposed algorithm. This paper discusses clustering and the proposed algorithm.
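
The empty-cluster problem of K-means can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, which the abstract does not describe. The function names `nearest` and `kmeans`, the sample data, and the reseeding remedy (moving the centroid of an empty cluster to the point that lies farthest from its currently assigned centroid) are all assumptions chosen for illustration.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to `point` (ties go to the lowest index)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd-style K-means on tuples of numbers.

    Returns (centroids, labels, reseeds). `reseeds` counts how many times a
    cluster received no points and had its centroid moved.
    """
    centroids = [tuple(c) for c in init_centroids]
    reseeds = 0
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]

        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Empty cluster: the mean is undefined. As an illustrative
                # remedy (an assumption, not the paper's method), reseed the
                # centroid at the point farthest from its assigned centroid.
                far = max(
                    range(len(points)),
                    key=lambda i: math.dist(points[i], centroids[labels[i]]),
                )
                updated.append(tuple(points[far]))
                reseeds += 1

        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated

    # Final labels are recomputed so they match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels, reseeds


# Hypothetical data: two tight groups, plus one badly placed initial centroid
# at (100, 100) that attracts no points in the first iteration.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, reseeds = kmeans(points, [(0, 0), (10, 10), (100, 100)])
print("labels:", labels, "reseeds:", reseeds)
```

With the badly placed third centroid, the first update step finds a cluster with no members, which is the situation the abstract refers to. Other remedies exist, such as dropping the empty cluster or rerunning with a different initialization; the reseeding used here is only one simple option.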