Noisy Speech Recognition by Mel-LPC based AR-HMM with Power and Time Derivative Parameters

M. Babul Islam

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 180 - Issue 42

Published: May 2018

DOI: 10.5120/ijca2018917149

M. Babul Islam. Noisy Speech Recognition by Mel-LPC based AR-HMM with Power and Time Derivative Parameters. International Journal of Computer Applications. 180, 42 (May 2018), 1-5. DOI=10.5120/ijca2018917149

@article{10.5120/ijca2018917149,
  author    = {M. Babul Islam},
  title     = {Noisy Speech Recognition by Mel-LPC based AR-HMM with Power and Time Derivative Parameters},
  journal   = {International Journal of Computer Applications},
  year      = {2018},
  volume    = {180},
  number    = {42},
  pages     = {1--5},
  doi       = {10.5120/ijca2018917149},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2018
%A M. Babul Islam
%T Noisy Speech Recognition by Mel-LPC based AR-HMM with Power and Time Derivative Parameters
%J International Journal of Computer Applications
%V 180
%N 42
%P 1-5
%R 10.5120/ijca2018917149
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper presents an autoregressive hidden Markov model (AR-HMM) on the mel scale, augmented with power and Mel-LPC based time derivative parameters, for noisy speech recognition. The mel-scaled AR coefficients and the mel-prediction coefficients for Mel-LPC are calculated directly from the speech signal on the linear frequency scale, without applying a bilinear transformation. This is done by replacing the unit delay with a first-order all-pass filter. In addition, a Mel-Wiener filter is applied to the system to improve recognition accuracy in the presence of additive noise. The proposed system is evaluated on the Aurora 2 database, and the overall recognition accuracy is found to be 80.02% on average.
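The idea of replacing the unit delay with a first-order all-pass filter can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (allpass, warped_autocorr, levinson_durbin), the warping factor alpha = 0.35, and the random test frame are all assumptions made here for illustration. The sketch computes a warped autocorrelation by passing the frame repeatedly through the all-pass section, then solves for prediction coefficients with the standard Levinson-Durbin recursion. With alpha = 0 the all-pass section reduces to a unit delay and the result is ordinary LPC.

```python
import numpy as np

def allpass(x, alpha):
    """First-order all-pass section (z^-1 - alpha) / (1 - alpha * z^-1)."""
    y = np.zeros(len(x))
    prev_x = 0.0
    prev_y = 0.0
    for n, xn in enumerate(x):
        # y[n] = -alpha * x[n] + x[n-1] + alpha * y[n-1]
        y[n] = -alpha * xn + prev_x + alpha * prev_y
        prev_x, prev_y = xn, y[n]
    return y

def warped_autocorr(x, order, alpha):
    """r[m] = sum_n x[n] * y_m[n], where y_m is x passed through m all-pass sections."""
    x = np.asarray(x, dtype=float)
    r = np.empty(order + 1)
    y = x.copy()
    for m in range(order + 1):
        r[m] = np.dot(x, y)
        y = allpass(y, alpha)  # replaces the unit delay used in ordinary LPC
    return r

def levinson_durbin(r):
    """Solve the normal equations; returns a with a[0] = 1 and the residual energy."""
    order = len(r) - 1
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]  # update lower-order coefficients
        a[i] = k
        err *= 1.0 - k * k
    return a, err

if __name__ == "__main__":
    # Hypothetical windowed frame; real use would take a frame of speech samples.
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(256) * np.hamming(256)
    alpha = 0.35  # illustrative warping factor, not a value taken from the paper
    r = warped_autocorr(frame, order=12, alpha=alpha)
    coeffs, residual = levinson_durbin(r)
    print(coeffs)
    print(residual)
```

Because the whole computation stays in the time domain, no explicit frequency-axis transformation of the spectrum is needed; the warping is carried entirely by the all-pass filter coefficient.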

Index Terms

Computer Science

Information Sciences

Keywords

AR-HMM, Mel-LPC, Mel-Wiener filter, Aurora 2 database