A Review on Speech Recognition Technique

Santosh K. Gaikwad; Bharti W. Gawali; Pravin Yannawar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 10 - Issue 3

Published: November 2010

10.5120/1462-1976

Santosh K. Gaikwad, Bharti W. Gawali, Pravin Yannawar. A Review on Speech Recognition Technique. International Journal of Computer Applications. 10, 3 (November 2010), 16-24. DOI=10.5120/1462-1976

@article{10.5120/1462-1976,
  author    = {Santosh K. Gaikwad and Bharti W. Gawali and Pravin Yannawar},
  title     = {A Review on Speech Recognition Technique},
  journal   = {International Journal of Computer Applications},
  year      = {2010},
  volume    = {10},
  number    = {3},
  pages     = {16--24},
  doi       = {10.5120/1462-1976},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2010
%A Santosh K. Gaikwad
%A Bharti W. Gawali
%A Pravin Yannawar
%T A Review on Speech Recognition Technique
%J International Journal of Computer Applications
%V 10
%N 3
%P 16-24
%R 10.5120/1462-1976
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech is the most prominent and primary mode of communication among human beings. Communication between humans and computers takes place through a human-computer interface, and speech has the potential to become an important mode of interaction with computers. This paper gives an overview of the major technological perspectives and an appreciation of the fundamental progress of speech recognition, along with an overview of the techniques developed at each stage of speech recognition. It helps in choosing a technique by setting out the relative merits and demerits of each, and presents a comparative study of the different techniques, stage by stage. The paper concludes with a decision on the future direction for developing techniques in a human-computer interface system for the Marathi language.

Index Terms

Computer Science

Information Sciences

Keywords

Analysis, Feature extraction, Modeling, Testing, Speech processing, HCI