Voice Recognition for Gujarati Dialects: An in-depth Survey

Meera M. Shah; Hiren R. Kavathiya

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 186 - Issue 5

Published: Jan 2024

10.5120/ijca2024923112

Meera M. Shah, Hiren R. Kavathiya. Voice Recognition for Gujarati Dialects: An in-depth Survey. International Journal of Computer Applications. 186, 5 (Jan 2024), 1-4. DOI=10.5120/ijca2024923112

                        @article{ 10.5120/ijca2024923112,
                        author  = { Meera M. Shah and Hiren R. Kavathiya },
                        title   = { Voice Recognition for Gujarati Dialects: An in-depth Survey },
                        journal = { International Journal of Computer Applications },
                        year    = { 2024 },
                        volume  = { 186 },
                        number  = { 5 },
                        pages   = { 1-4 },
                        doi     = { 10.5120/ijca2024923112 },
                        publisher = { Foundation of Computer Science (FCS), NY, USA }
                        }

                        %0 Journal Article
                        %D 2024
                        %A Meera M. Shah
                        %A Hiren R. Kavathiya
                        %T Voice Recognition for Gujarati Dialects: An in-depth Survey
                        %J International Journal of Computer Applications
                        %V 186
                        %N 5
                        %P 1-4
                        %R 10.5120/ijca2024923112
                        %I Foundation of Computer Science (FCS), NY, USA

Abstract

Voice recognition technology is gaining importance, and considerable work has been done on it for languages such as English, Arabic, Hindi, and Chinese. For Gujarati, however, comparatively little work exists. This paper examines the process of voice recognition in Gujarati and presents a systematic literature review of voice recognition. It focuses mainly on the problems encountered in voice recognition systems for Gujarati.

Index Terms

Computer Science

Information Sciences

Keywords

Voice Recognition, Speech Processing, Gujarati, Feature Extraction, MFCC, HMM