Survey on Various Methods of Text to Speech Synthesis

Desai Siddhi; Jashin M. Verghese; Desai Bhavik

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 165 - Issue 6

Published: May 2017

10.5120/ijca2017913891

Desai Siddhi, Jashin M. Verghese, Desai Bhavik. Survey on Various Methods of Text to Speech Synthesis. International Journal of Computer Applications. 165, 6 (May 2017), 26-30. DOI=10.5120/ijca2017913891

                        @article{ 10.5120/ijca2017913891,
                        author  = { Desai Siddhi and Jashin M. Verghese and Desai Bhavik },
                        title   = { Survey on Various Methods of Text to Speech Synthesis },
                        journal = { International Journal of Computer Applications },
                        year    = { 2017 },
                        volume  = { 165 },
                        number  = { 6 },
                        pages   = { 26-30 },
                        doi     = { 10.5120/ijca2017913891 },
                        publisher = { Foundation of Computer Science (FCS), NY, USA }
                        }

                        %0 Journal Article
                        %D 2017
                        %A Desai Siddhi
                        %A Jashin M. Verghese
                        %A Desai Bhavik
                        %T Survey on Various Methods of Text to Speech Synthesis
                        %J International Journal of Computer Applications
                        %V 165
                        %N 6
                        %P 26-30
                        %R 10.5120/ijca2017913891
                        %I Foundation of Computer Science (FCS), NY, USA

Abstract

The primary objective of this paper is to provide an overview of existing Text-To-Speech synthesis techniques. Text-to-speech synthesis can be broadly divided into three categories: formant-based, concatenative, and articulatory. Formant-based speech synthesis relies on techniques such as the cascade, parallel, Klatt, and PARCAS models. Concatenative speech synthesis can in turn be divided into three categories: diphone-based, corpus-based, and hybrid. Articulatory synthesis involves vocal tract models, acoustic models, glottis models, and noise source models. In this paper, all of these text-to-speech synthesis methods are explained along with their pros and cons.
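As a rough illustration of the cascade formant approach named in the abstract, the sketch below passes a simple impulse-train source through a series of second-order digital resonators, one per formant. It is not taken from the paper: the function names, the impulse-train source, the sample rate, and the bandwidth values are all assumptions chosen for illustration, and the formant frequencies are only approximate values for an /a/-like vowel.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of a second-order digital resonator with unity gain at DC:
    y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c

def resonate(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter a list of samples through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def impulse_train(f0_hz, duration_s, sample_rate):
    """Very crude voiced source: one unit impulse per pitch period."""
    n = int(round(duration_s * sample_rate))
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def cascade_formant_synth(formants, f0_hz=120.0, duration_s=0.3, sample_rate=16000):
    """Cascade structure: the output of each resonator feeds the next one."""
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal

# Illustrative (frequency, bandwidth) pairs in Hz for an /a/-like vowel.
# The bandwidths are assumed values, not measurements.
VOWEL_A = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]

samples = cascade_formant_synth(VOWEL_A)
```

In a parallel structure, by contrast, each resonator would filter the source separately and the outputs would be summed with individual gains; the cascade form shown here needs no per-formant gain control, which is one reason it is often described as well suited to vowels.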

Index Terms

Computer Science

Information Sciences

Keywords

Text to speech synthesis; Formant speech synthesis; Concatenative speech synthesis; Articulatory speech synthesis