Research Article

A Comparative Study of Phoneme Recognition using GMM-HMM and ANN based Acoustic Modeling

by  Farheen Fauziya, Geeta Nijhawan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 98 - Issue 6
Published: July 2014
Authors: Farheen Fauziya, Geeta Nijhawan
10.5120/17186-7366

Farheen Fauziya, Geeta Nijhawan. A Comparative Study of Phoneme Recognition using GMM-HMM and ANN based Acoustic Modeling. International Journal of Computer Applications. 98, 6 (July 2014), 12-16. DOI: 10.5120/17186-7366

Abstract

A phoneme is the smallest unit of sound that creates a meaningful contrast between utterances. This paper measures the accuracy and performance of phoneme recognition systems built with Hidden Markov Model (HMM), Gaussian Mixture Model (GMM), and Artificial Neural Network (ANN) acoustic models, using the HTK, Sphinx3, and Quicknet toolkits, which are freely available for academic work. The accuracy of the resulting automatic speech recognition (ASR) systems is compared using the TIMIT database.
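
The abstract does not state how accuracy is computed. One common definition in phoneme recognition, and the one reported by HTK's HResults tool, is Accuracy = (N - S - D - I) / N x 100%, where N is the number of reference phonemes and S, D, and I are the substitutions, deletions, and insertions found by aligning the recognized sequence to the reference. The sketch below illustrates that formula only; it is not taken from the paper. The function names `align_counts` and `phoneme_accuracy` and the example phone labels are hypothetical, and the alignment assumes unit edit costs, whereas real scoring tools may weight edits differently.

```python
def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) for hyp against ref.

    Uses a minimum-edit-distance alignment with unit costs (an assumption).
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every reference symbol
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every hypothesis symbol
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the end of both sequences to count each edit type.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def phoneme_accuracy(ref, hyp):
    """Accuracy in percent: (N - S - D - I) / N * 100."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    _, subs, dels, ins = align_counts(ref, hyp)
    n = len(ref)
    return 100.0 * (n - subs - dels - ins) / n


# Illustrative labels only: one substitution (ae -> ah) and one inserted "t".
reference = ["sil", "k", "ae", "t", "sil"]
recognized = ["sil", "k", "ah", "t", "t", "sil"]
print(phoneme_accuracy(reference, recognized))
```

Because insertions are penalized, this measure can fall below the plain percentage of correctly recognized phonemes and can even be negative for a recognizer that inserts heavily.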

Index Terms
Computer Science
Information Sciences
Keywords

Automatic Speech Recognition, MFCC, Hidden Markov Model
