Research Article

Survey on Various Methods of Text to Speech Synthesis

by Desai Siddhi, Jashin M. Verghese, Desai Bhavik
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 165 - Issue 6
Published: May 2017
10.5120/ijca2017913891

Desai Siddhi, Jashin M. Verghese, Desai Bhavik. Survey on Various Methods of Text to Speech Synthesis. International Journal of Computer Applications. 165, 6 (May 2017), 26-30. DOI=10.5120/ijca2017913891

@article{10.5120/ijca2017913891,
  author    = {Desai Siddhi and Jashin M. Verghese and Desai Bhavik},
  title     = {Survey on Various Methods of Text to Speech Synthesis},
  journal   = {International Journal of Computer Applications},
  year      = {2017},
  volume    = {165},
  number    = {6},
  pages     = {26-30},
  doi       = {10.5120/ijca2017913891},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}
%0 Journal Article
%D 2017
%A Desai Siddhi
%A Jashin M. Verghese
%A Desai Bhavik
%T Survey on Various Methods of Text to Speech Synthesis
%J International Journal of Computer Applications
%V 165
%N 6
%P 26-30
%R 10.5120/ijca2017913891
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The primary objective of this paper is to provide an overview of existing text-to-speech synthesis techniques. Text-to-speech synthesis can be broadly divided into three categories: formant-based, concatenative, and articulatory. Formant-based speech synthesis relies on techniques such as the cascade, parallel, Klatt, and PARCAS models. Concatenative speech synthesis can in turn be divided into three categories (diphone-based, corpus-based, and hybrid), whereas articulatory synthesis involves vocal tract models, acoustic models, glottis models, and noise source models. In this paper, each of these text-to-speech synthesis methods is explained along with its pros and cons.
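
To make the cascade formant model mentioned above more concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes a plain impulse train as the voiced source and uses the standard second-order digital resonator, one per formant, connected in series. The function names, the formant frequencies for an /a/-like vowel, and the bandwidths are illustrative assumptions.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Second-order resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, sample_rate):
    """Crude voiced source (an assumption): one unit impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    n = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def cascade_formant_synth(formants, f0_hz=120.0, duration_s=0.25,
                          sample_rate=16000):
    """Cascade model: the source passes through each formant resonator in series."""
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


# Illustrative (frequency, bandwidth) pairs in Hz for an /a/-like vowel.
VOWEL_A = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]
samples = cascade_formant_synth(VOWEL_A)
```

A parallel formant model would instead feed the same source into each resonator separately and sum the outputs with per-formant gains, which gives direct control over individual formant amplitudes at the cost of more parameters.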

References
  • Sami Lemmetty. Review of Speech Synthesis Technology. Helsinki University of Technology Department of Electrical and Communications Engineering. March 30, 1999.
  • Rubeena A. Khan, J. S. Chitode. Concatenative Speech Synthesis: A Review. International Journal of Computer Applications (0975 – 8887), vol. 136, no. 3, February 2016, pp. 1-4.
  • Raitio, Tuomo, et al. "HMM-based speech synthesis utilizing glottal inverse filtering." IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 1, 2011, pp. 153-165.
  • Heiga Zen, Keiichi Tokuda, Alan W. Black. "Statistical parametric speech synthesis." Speech Communication, vol. 51, no. 11, 2009, pp. 1039–1064.
  • Stas Tiomkin, David Malah, Slava Shechtman, and Zvi Kons. "A hybrid text-to-speech system that combines concatenative and statistical synthesis units." IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 5, July 2011, pp. 1278-1288.
  • Pertti Palo. A Review of Articulatory Speech Synthesis. Espoo, June 5, 2006
  • Bernd J. Kröger, Peter Birkholz. Articulatory Synthesis of Speech and Singing: State of the Art and Suggestions for Future Research. Multimodal Signals: Cognitive and Algorithmic Issues, pp. 306-319.
  • Birkholz P, Martin L, Willmes K, Kröger BJ, Neuschaefer-Rube C (2015) The contribution of phonation type to the perception of vocal emotions in German: An articulatory synthesis study. Journal of the Acoustical Society of America 137:1503-1512
  • Louis Goldstein and Carol A. Fowler. Articulatory Phonology: A phonology for public language use
  • Richard S. McGowan and Alice Faber. Introduction to papers on speech recognition and perception from an articulatory point of view.
  • Shuangyu Chang. A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition. September 2002.
  • Kelly and Lochbaum 1962; Liljencrants 1985; Meyer et al. 1989; Kröger 1998; Flanagan 1975; Maeda 1982; Birkholz et al. 2007.
Index Terms
Computer Science
Information Sciences
Keywords

Text to speech synthesis, Formant speech synthesis, Concatenative speech synthesis, Articulatory speech synthesis
