International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 61 - Issue 18
Published: January 2013
Authors: Shailendra Kumar Shrivastava, J. L. Rana, R. C. Jain
DOI: 10.5120/10032-5077
Shailendra Kumar Shrivastava, J. L. Rana, R. C. Jain. Text Document Clustering based on Phrase Similarity using Affinity Propagation. International Journal of Computer Applications. 61, 18 (January 2013), 38-44. DOI=10.5120/10032-5077
@article{ 10.5120/10032-5077,
author = { Shailendra Kumar Shrivastava and J. L. Rana and R. C. Jain },
title = { Text Document Clustering based on Phrase Similarity using Affinity Propagation },
journal = { International Journal of Computer Applications },
year = { 2013 },
volume = { 61 },
number = { 18 },
pages = { 38-44 },
doi = { 10.5120/10032-5077 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2013
%A Shailendra Kumar Shrivastava
%A J. L. Rana
%A R. C. Jain
%T Text Document Clustering based on Phrase Similarity using Affinity Propagation
%J International Journal of Computer Applications
%V 61
%N 18
%P 38-44
%R 10.5120/10032-5077
%I Foundation of Computer Science (FCS), NY, USA
Affinity propagation (AP) was recently introduced as an unsupervised learning algorithm for exemplar-based clustering. This paper presents a novel text document clustering algorithm, called Phrase Affinity Clustering (PAC), based on the vector space model, phrases, and the affinity propagation clustering algorithm. PAC first finds the phrases using Ukkonen's suffix tree construction algorithm. Second, it builds the vector space model using the tf-idf weighting scheme of the phrases. Third, it calculates the similarity matrix from this vector space representation (VSD) using cosine similarity. Finally, the affinity propagation algorithm generates the clusters. The F-measure, purity, and entropy of the proposed algorithm are better than those of GAHC, ST-GAHC, and ST-KNN on the OHSUMED, RCV1, and News group data sets.
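The four steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: word n-grams stand in for the phrases that PAC extracts with Ukkonen's suffix tree, the idf term is taken as log(N/df), the affinity propagation preference defaults to the median off-diagonal similarity, and the damping factor is 0.5. All function names (`phrases`, `tfidf_vectors`, `similarity_matrix`, `affinity_propagation`) are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter
from statistics import median


def phrases(text, max_len=3):
    # Assumption: word n-grams (1..max_len words) stand in for suffix-tree phrases.
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]


def tfidf_vectors(docs):
    # Sparse phrase vectors weighted by tf * log(N / df) (assumed idf variant).
    counts = [Counter(phrases(d)) for d in docs]
    df = Counter(p for c in counts for p in c)  # number of documents containing p
    n = len(docs)
    return [{p: tf * math.log(n / df[p]) for p, tf in c.items()} for c in counts]


def cosine(u, v):
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(vectors):
    return [[cosine(u, v) for v in vectors] for u in vectors]


def affinity_propagation(sim, preference=None, damping=0.5, iters=200):
    # Standard responsibility/availability message passing; needs >= 2 points.
    n = len(sim)
    if preference is None:
        # Assumption: median off-diagonal similarity as the shared preference.
        preference = median(sim[i][k] for i in range(n) for k in range(n) if i != k)
    S = [[preference if i == k else sim[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else vals[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            others = sum(pos) - pos[k]
            for i in range(n):
                new = others if i == k else min(0.0, R[k][k] + others - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        # No exemplar emerged: fall back to one cluster per point.
        return list(range(n))
    # Each point is labeled with the index of its most similar exemplar.
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the cat sat on the sofa",
            "stock prices fell on monday", "stock prices rose on monday"]
    sim = similarity_matrix(tfidf_vectors(docs))
    print(affinity_propagation(sim))
```

Each returned label is the index of the exemplar document chosen for that point. Perfectly symmetric similarities, as in very small toy collections, can leave affinity propagation without a clear exemplar, so a practical run would typically use a larger corpus and a tuned preference value.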