International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 42 - Issue 12
Published: March 2012
Authors: T. B. Adam, Md Salam
T. B. Adam, Md Salam. Spoken English Alphabet Recognition with Mel Frequency Cepstral Coefficients and Back Propagation Neural Networks. International Journal of Computer Applications. 42, 12 (March 2012), 21-27. DOI=10.5120/5745-7946
Spoken alphabet recognition, a subset of speech recognition and pattern recognition, has many applications. It is not a simple task, however, because the English alphabet contains highly confusable sets of letters. The strong acoustic similarities behind this confusability can reduce the accuracy of speech recognition systems. One such confusable set is the E-set, which consists of the letters B, C, D, E, G, P, T, V and Z. In this study, we present an investigation of an isolated alphabet speech recognition system using Mel Frequency Cepstral Coefficients (MFCC) and a Back-propagation Neural Network (BPNN) for the E-set and for all 26 letters of the English alphabet. The learning rate and momentum rate of the BPNN are varied in order to achieve the best recognition rate for the E-set and for all 26 letters. By adjusting these parameters, we achieved recognition rates of 62.28% and 70.49% for E-set recognition under speaker-independent and speaker-dependent conditions, respectively.
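To illustrate what varying the learning rate and momentum rate of a BPNN involves, the following is a minimal sketch, not the configuration used in this study. It assumes a single hidden layer of sigmoid units trained on squared error, and it uses a four-sample toy data set in place of real MFCC feature vectors. The names `train_bpnn` and `sweep`, the hidden-layer size, and the parameter values are all hypothetical.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_bpnn(samples, n_hidden, lr, momentum, epochs, seed=0):
    """Train a one-hidden-layer network with back-propagation and momentum.

    samples: list of (features, target) pairs with target in [0, 1].
    Returns the mean squared error measured during each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each row holds the weights of one unit; the last entry is the bias.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    v_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    v_out = [0.0] * (n_hidden + 1)
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in samples:
            # Forward pass (a constant 1.0 is appended for the bias).
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * xi for w, xi in zip(row, xb)))
                 for row in w_hid]
            hb = h + [1.0]
            y = sigmoid(sum(w * hi for w, hi in zip(w_out, hb)))
            sq_err += (t - y) ** 2
            # Backward pass: error terms for sigmoid units.
            d_out = (y - t) * y * (1.0 - y)
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j])
                     for j in range(n_hidden)]
            # Momentum update: velocity = momentum * velocity - lr * gradient.
            for j in range(n_hidden + 1):
                v_out[j] = momentum * v_out[j] - lr * d_out * hb[j]
                w_out[j] += v_out[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    v_hid[j][i] = momentum * v_hid[j][i] - lr * d_hid[j] * xb[i]
                    w_hid[j][i] += v_hid[j][i]
        history.append(sq_err / len(samples))
    return history


def sweep(samples, learning_rates, momentum_rates, n_hidden=3, epochs=300):
    """Return {(lr, momentum): final-epoch error} for every combination."""
    return {
        (lr, mu): train_bpnn(samples, n_hidden, lr, mu, epochs)[-1]
        for lr in learning_rates
        for mu in momentum_rates
    }


if __name__ == "__main__":
    # Toy stand-in for feature vectors: the target equals the first input.
    toy = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
    results = sweep(toy, [0.1, 0.5], [0.0, 0.5, 0.9])
    for (lr, mu), err in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"lr={lr} momentum={mu} mse={err:.4f}")
```

In a real recognizer the inputs would be MFCC vectors extracted from each utterance, the output layer would have one unit per letter, and each parameter pair would be ranked by recognition rate on held-out speakers rather than by training error.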