International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 75 - Issue 3
Published: August 2013
Authors: Mohammed M. Abu Tair, Rebhi S. Baraka
10.5120/13090-0370
Mohammed M. Abu Tair, Rebhi S. Baraka. Design and Evaluation of a Parallel Classifier for Large-Scale Arabic Text. International Journal of Computer Applications. 75, 3 (August 2013), 13-20. DOI=10.5120/13090-0370
@article{ 10.5120/13090-0370,
author = { Mohammed M. Abu Tair and Rebhi S. Baraka },
title = { Design and Evaluation of a Parallel Classifier for Large-Scale Arabic Text },
journal = { International Journal of Computer Applications },
year = { 2013 },
volume = { 75 },
number = { 3 },
pages = { 13--20 },
doi = { 10.5120/13090-0370 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2013
%A Mohammed M. Abu Tair
%A Rebhi S. Baraka
%T Design and Evaluation of a Parallel Classifier for Large-Scale Arabic Text
%J International Journal of Computer Applications
%V 75
%N 3
%P 13-20
%R 10.5120/13090-0370
%I Foundation of Computer Science (FCS), NY, USA
Text classification has become one of the most important techniques in text mining, and a number of machine learning algorithms have been introduced for automatic text classification. One of the most common is the k-NN algorithm, which is known to be among the best classifiers for many languages, including Arabic. However, k-NN is inefficient because it requires a large amount of computational power. This drawback makes it unsuitable for large volumes of high-dimensional text documents, particularly in Arabic. This paper introduces a high-performance parallel classifier for large-scale Arabic text that achieves improved speedup, scalability, and accuracy. The parallel classifier is based on the sequential k-NN algorithm. It has been tested using the OSAC corpus, and its performance has been studied on a multicomputer cluster. The results indicate that the parallel classifier has very good speedup and scalability and is capable of handling large document collections with better classification results.
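As a rough illustration of how a k-NN classifier can be parallelized, the following Python sketch partitions the training documents, finds the k nearest neighbours within each partition concurrently, and merges the local candidates before a majority vote. It is not the authors' implementation: the function names (`cosine`, `local_top_k`, `parallel_knn`), the sparse term-weight dictionaries, the cosine similarity measure, the toy data, and the use of a thread pool in place of a multicomputer cluster are all assumptions made for illustration.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed representation)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def local_top_k(partition, query, k):
    """Return the k most similar (similarity, label) pairs within one partition."""
    scored = ((cosine(doc, query), label) for doc, label in partition)
    return heapq.nlargest(k, scored)


def parallel_knn(train, query, k=3, workers=2):
    """Classify query by majority vote over the global k nearest neighbours."""
    # Split the training set into roughly equal partitions, one per worker.
    parts = [train[i::workers] for i in range(workers)]
    # A thread pool stands in for cluster nodes here; on a real cluster each
    # partition would live on a separate machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda p: local_top_k(p, query, k), parts)
        candidates = [pair for chunk in partial for pair in chunk]
    # The global top k is always contained in the union of the local top k.
    nearest = heapq.nlargest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (term-weight dict, category label).
TOY_TRAIN = [
    ({"goal": 1.0, "match": 1.0}, "sports"),
    ({"team": 1.0, "goal": 1.0}, "sports"),
    ({"market": 1.0, "bank": 1.0}, "economy"),
    ({"bank": 1.0, "loan": 1.0}, "economy"),
]
```

Calling `parallel_knn(TOY_TRAIN, {"goal": 1.0, "team": 1.0})` classifies the query as `sports`. Because each worker only returns its local top k, the merge step handles at most `k * workers` candidates, which is what keeps the communication cost small relative to the similarity computation in this kind of data-parallel design.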