International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 59 - Issue 20
Published: December 2012
Authors: D. Sasirekha, E. Chandra
D. Sasirekha, E. Chandra. Text Recognition from PDF Files using BPNN and SVM. International Journal of Computer Applications. 59, 20 (December 2012), 18-22. DOI=10.5120/9819-4417
@article{ 10.5120/9819-4417, author = { D. Sasirekha and E. Chandra }, title = { Text Recognition from PDF Files using BPNN and SVM }, journal = { International Journal of Computer Applications }, year = { 2012 }, volume = { 59 }, number = { 20 }, pages = { 18-22 }, doi = { 10.5120/9819-4417 }, publisher = { Foundation of Computer Science (FCS), NY, USA } }
%0 Journal Article %D 2012 %A D. Sasirekha %A E. Chandra %T Text Recognition from PDF Files using BPNN and SVM %J International Journal of Computer Applications %V 59 %N 20 %P 18-22 %R 10.5120/9819-4417 %I Foundation of Computer Science (FCS), NY, USA
Optical Character Recognition (OCR) is the electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. OCR systems are receiving increasing attention. PDF files consist of text, images, and graphs. The Mixed Raster Content (MRC) technique separates the text regions of a PDF file from the non-text regions, and only the text part is extracted. The Artificial Neural Network (ANN) is a standard pattern classifier applicable to a wide range of problems; here it is trained with the backpropagation learning algorithm, which is well suited to image processing. The Support Vector Machine (SVM) is a classifier that seeks an optimal decision boundary between classes. This research therefore applies both a Backpropagation Neural Network (BPNN) and an SVM to recognize characters from the extracted text, using features computed from it. One hundred PDF files in different formats were tested, and the recognition performance of the two techniques is tabulated and compared.
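To make the backpropagation step concrete, the following is a minimal sketch of a one-hidden-layer network trained to classify character bitmaps. It is an illustration under stated assumptions, not the authors' implementation: the 5x3 glyphs, the layer size, the learning rate, and the function names (`train`, `predict`, `init_layer`) are all hypothetical, and raw pixels stand in for whatever features the paper actually extracts.

```python
import math
import random

# Hypothetical 5x3 binary glyphs (row-major); toy data, not from the paper.
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
}
LABELS = sorted(GLYPHS)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    # One weight row per unit; the last weight in each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(layer, inputs):
    # Append a constant 1.0 so the bias is handled like any other weight.
    return [sigmoid(sum(w * v for w, v in zip(unit, inputs + [1.0])))
            for unit in layer]


def train(samples, n_hidden=6, epochs=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    hidden = init_layer(n_in, n_hidden, rng)
    output = init_layer(n_hidden, n_out, rng)
    for _ in range(epochs):
        for x, target in samples:
            h = forward(hidden, x)
            y = forward(output, h)
            # Output-layer error terms for squared error with sigmoid units.
            d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
            # Propagate the error back through the (not yet updated) weights.
            d_hid = [hj * (1 - hj) *
                     sum(d_out[k] * output[k][j] for k in range(n_out))
                     for j, hj in enumerate(h)]
            # Gradient-descent updates for both layers.
            for k, unit in enumerate(output):
                for j, v in enumerate(h + [1.0]):
                    unit[j] -= lr * d_out[k] * v
            for j, unit in enumerate(hidden):
                for i, v in enumerate(x + [1.0]):
                    unit[i] -= lr * d_hid[j] * v
    return hidden, output


def predict(model, x):
    hidden, output = model
    y = forward(output, forward(hidden, x))
    return LABELS[y.index(max(y))]


# One-hot targets, one training sample per glyph.
samples = [(GLYPHS[c], [1.0 if c == l else 0.0 for l in LABELS])
           for c in LABELS]
model = train(samples)
```

Calling `predict(model, GLYPHS["T"])` returns the label the network assigns to that bitmap. An SVM could be trained on the same feature vectors and its accuracy tabulated alongside the network's, which is the kind of comparison the abstract describes.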