International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 93 - Issue 12
Published: May 2014
Authors: Riadh Ouersighni
10.5120/16269-6001
Riadh Ouersighni. Robust Rule-based Approach in Arabic Processing. International Journal of Computer Applications. 93, 12 (May 2014), 31-37. DOI=10.5120/16269-6001
@article{ 10.5120/16269-6001,
author = { Riadh Ouersighni },
title = { Robust Rule-based Approach in Arabic Processing },
journal = { International Journal of Computer Applications },
year = { 2014 },
volume = { 93 },
number = { 12 },
pages = { 31-37 },
doi = { 10.5120/16269-6001 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2014
%A Riadh Ouersighni
%T Robust Rule-based Approach in Arabic Processing
%J International Journal of Computer Applications
%V 93
%N 12
%P 31-37
%R 10.5120/16269-6001
%I Foundation of Computer Science (FCS), NY, USA
A parsing system is a key element of many computer applications, such as information retrieval, knowledge extraction and automatic translation. This paper presents a robust, large-scale parser for Arabic sentences. From a practical point of view, the system is able to analyze real-world sentences thanks to the wide coverage of its linguistic knowledge, which was developed within the DIINAR-MBC European project. The parser is designed to be robust against difficult input that cannot be parsed correctly with the standard grammar rules of the system, whether that input is extra-grammatical, ill-formed or simply unexpected. Most systems take an algorithmic approach to robustness, in which the parsing program is extended with heuristics to handle defective cases. This study adopts a different, grammar-based approach: robust rules are introduced into the grammar itself, and constraints are relaxed when necessary. The parser was evaluated on real-world sentences, with very encouraging results: it achieves 95% coverage.
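
The idea of relaxing constraints inside the grammar, rather than patching the parsing program with heuristics, can be illustrated with a minimal Python sketch. It is not the parser described in the paper: the `Token` class, the `CONSTRAINTS` table, the `parse_noun_phrase` function and the `max_relaxed` parameter are all hypothetical names chosen for illustration, and the single noun-adjective rule with three agreement constraints is a toy assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A word with a part-of-speech category and a feature bundle (toy model)."""
    form: str
    category: str
    features: dict = field(default_factory=dict)


# Hypothetical agreement constraints attached to one grammar rule (NP -> NOUN ADJ).
# Each constraint is a named predicate, so it can be checked and relaxed on its own.
CONSTRAINTS = {
    "gender": lambda n, a: n.features.get("gender") == a.features.get("gender"),
    "number": lambda n, a: n.features.get("number") == a.features.get("number"),
    "definite": lambda n, a: n.features.get("definite") == a.features.get("definite"),
}


def parse_noun_phrase(noun, adj, max_relaxed=1):
    """Apply the rule NP -> NOUN ADJ, relaxing at most `max_relaxed` constraints.

    Returns a dict describing the phrase, including the list of constraints
    that had to be relaxed (empty for a strictly grammatical phrase), or None
    if the rule does not apply even after relaxation.
    """
    # The category pattern itself is never relaxed in this sketch.
    if noun.category != "NOUN" or adj.category != "ADJ":
        return None
    violated = [name for name, holds in CONSTRAINTS.items() if not holds(noun, adj)]
    if len(violated) > max_relaxed:
        return None
    return {"label": "NP", "children": (noun.form, adj.form), "relaxed": violated}


# Toy transliterated input; the feature values are assumptions for the example.
kitab = Token("kitab", "NOUN", {"gender": "m", "number": "sg", "definite": False})
kabir = Token("kabir", "ADJ", {"gender": "m", "number": "sg", "definite": False})
kabira = Token("kabira", "ADJ", {"gender": "f", "number": "sg", "definite": False})

strict = parse_noun_phrase(kitab, kabir)    # all constraints hold
relaxed = parse_noun_phrase(kitab, kabira)  # gender agreement is relaxed
```

In this sketch an ill-formed phrase still receives an analysis, and the `relaxed` list records which constraint was violated, so a downstream component could report or rank the result accordingly. Setting `max_relaxed=0` gives back the strict grammar.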