International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 75 - Issue 3
Published: August 2013
Authors: Mohammed M. Abu Tair, Rebhi S. Baraka
Mohammed M. Abu Tair, Rebhi S. Baraka. Design and Evaluation of a Parallel Classifier for Large-Scale Arabic Text. International Journal of Computer Applications. 75, 3 (August 2013), 13-20. DOI=10.5120/13090-0370
Text classification has become one of the most important techniques in text mining, and a number of machine learning algorithms have been introduced for automatic text classification. One of the most common is the k-NN algorithm, which is known to be among the best classifiers for many languages, including Arabic. However, k-NN is inefficient because it requires a large amount of computational power. This drawback makes it unsuitable for handling large volumes of high-dimensional text documents, particularly in Arabic. This paper introduces a high-performance parallel classifier for large-scale Arabic text that achieves improved speedup, scalability, and accuracy. The parallel classifier is based on the sequential k-NN algorithm. It has been tested on the OSAC corpus, and its performance has been studied on a multicomputer cluster. The results indicate that the parallel classifier has very good speedup and scalability and is capable of handling large document collections with improved classification results.
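The abstract does not describe how the k-NN computation is distributed, so the following is only a minimal sketch of one common way to parallelize k-NN, not the authors' design. It assumes the training set is split into partitions, each worker finds its local top-k neighbours by cosine similarity, and a merge step takes the global top k and votes. The function names (`cosine`, `local_top_k`, `parallel_knn`), the sparse-dictionary document representation, and the use of a thread pool on a single machine in place of a multicomputer cluster are all illustrative assumptions.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def local_top_k(partition, query, k):
    """Return the k most similar (similarity, label) pairs in one partition."""
    scored = ((cosine(vec, query), label) for vec, label in partition)
    return heapq.nlargest(k, scored, key=lambda s: s[0])


def parallel_knn(train, query, k=3, workers=2):
    """Classify `query` by majority vote over its k nearest training documents.

    `train` is a list of (term-weight dict, label) pairs. The thread pool
    stands in for cluster nodes (an assumption made for this sketch).
    """
    # Split the training set into roughly equal partitions, one per worker.
    parts = [train[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda p: local_top_k(p, query, k), parts)
        candidates = [s for part in partial for s in part]
    # Merge: the global top k is contained in the union of the local top-k lists.
    nearest = heapq.nlargest(k, candidates, key=lambda s: s[0])
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Because each partition is scored independently, the similarity computation, which dominates the cost of k-NN, divides across workers, while the merge step only handles at most k candidates per worker. On an actual cluster the partitions would live on separate nodes and the merge would be a gather operation; that mapping is likewise an assumption here.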