Research Article

Voice Recognition for Gujarati Dialects: An in-depth Survey

by  Meera M. Shah, Hiren R. Kavathiya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Issue 5
Published: Jan 2024
10.5120/ijca2024923112

Meera M. Shah, Hiren R. Kavathiya. Voice Recognition for Gujarati Dialects: An in-depth Survey. International Journal of Computer Applications. 186, 5 (Jan 2024), 1-4. DOI=10.5120/ijca2024923112

@article{10.5120/ijca2024923112,
  author    = {Meera M. Shah and Hiren R. Kavathiya},
  title     = {Voice Recognition for Gujarati Dialects: An in-depth Survey},
  journal   = {International Journal of Computer Applications},
  year      = {2024},
  volume    = {186},
  number    = {5},
  pages     = {1-4},
  doi       = {10.5120/ijca2024923112},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2024
%A Meera M. Shah
%A Hiren R. Kavathiya
%T Voice Recognition for Gujarati Dialects: An in-depth Survey
%J International Journal of Computer Applications
%V 186
%N 5
%P 1-4
%R 10.5120/ijca2024923112
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Voice recognition technology is gaining importance, and considerable work has been done on it for languages such as English, Arabic, Hindi, and Chinese. For Gujarati, however, comparatively little work exists. This paper examines the process of voice recognition in Gujarati and presents a systematic literature review of voice recognition. It focuses mainly on the problems found in voice recognition systems for Gujarati.

Index Terms
Computer Science
Information Sciences
Keywords

Voice Recognition, Speech Processing, Gujarati, Feature Extraction, MFCC, HMM
