International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Issue 37
Published: August 2024
Authors: Purnima Chandrasekar, Shailendra Pratap Shastri
Purnima Chandrasekar, Shailendra Pratap Shastri. Step-by-step Approach to Automatic Speech Emotion Recognition. International Journal of Computer Applications. 186, 37 (August 2024), 37-43. DOI=10.5120/ijca2024923947
@article{ 10.5120/ijca2024923947, author = { Purnima Chandrasekar and Shailendra Pratap Shastri }, title = { Step-by-step Approach to Automatic Speech Emotion Recognition }, journal = { International Journal of Computer Applications }, year = { 2024 }, volume = { 186 }, number = { 37 }, pages = { 37-43 }, doi = { 10.5120/ijca2024923947 }, publisher = { Foundation of Computer Science (FCS), NY, USA } }
%0 Journal Article %D 2024 %A Purnima Chandrasekar %A Shailendra Pratap Shastri %T Step-by-step Approach to Automatic Speech Emotion Recognition %J International Journal of Computer Applications %V 186 %N 37 %P 37-43 %R 10.5120/ijca2024923947 %I Foundation of Computer Science (FCS), NY, USA
Humans naturally express emotions through facial expressions and through speech. Emotions play an important role in human decision-making, since the human mind is influenced by personal experience as well as by physiological, communicative and behavioral reactions to external stimuli. When considering emotions displayed through speech, one needs to understand that a speech signal conveys not only the emotional state of the speaker, which is evident from the intent of the message, but also the gender of the speaker and the language spoken. While effective spoken communication between humans ensures the exchange of ideas, messages and perceptions, interaction between human and machine with the same intent is challenging because the machine is expected to mimic the mechanism of human perception. Automatic Speech Emotion Recognition (ASER) systems have proved useful in several applications, such as healthcare, counseling and call center communication. Such a system has three basic components: creation of an emotional speech corpus, extraction of features relevant to emotion detection, and classification of the emotion in test speech using appropriate classifiers. This paper surveys the prominent features extracted, several dimension reduction techniques and the classifiers commonly used in recent work. It also sheds light on the recent use of autoencoders in ASER.
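The three components named in the abstract (corpus, feature extraction, classification) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "corpus" is a handful of synthetic noisy sine tones, the labels `low_arousal` and `high_arousal` are hypothetical, the features are mean short-time energy and mean zero-crossing rate, and the classifier is a nearest-centroid rule. Dimension reduction is omitted because only two features are used.

```python
import math
import random


def synth_utterance(pitch_hz, loudness, n=1600, sr=8000, seed=0):
    """Hypothetical stand-in for a corpus recording: a noisy sine tone."""
    rng = random.Random(seed)
    return [
        loudness * math.sin(2 * math.pi * pitch_hz * t / sr) + rng.gauss(0, 0.01)
        for t in range(n)
    ]


def extract_features(signal, frame_len=200):
    """Return [mean short-time energy, mean zero-crossing rate] over fixed frames."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcrs.append(crossings / (frame_len - 1))
    return [sum(energies) / len(energies), sum(zcrs) / len(zcrs)]


def fit_centroids(features, labels):
    """Standardize each feature, then store one mean vector (centroid) per label."""
    dims = len(features[0])
    count = len(features)
    means = [sum(f[d] for f in features) / count for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / count) or 1.0
        for d in range(dims)
    ]

    def scale(f):
        return [(f[d] - means[d]) / stds[d] for d in range(dims)]

    centroids = {}
    for label in set(labels):
        rows = [scale(f) for f, lab in zip(features, labels) if lab == label]
        centroids[label] = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    return scale, centroids


def classify(feature, scale, centroids):
    """Assign the label whose centroid is closest in standardized feature space."""
    z = scale(feature)
    return min(centroids, key=lambda label: math.dist(z, centroids[label]))


# Step 1: build a tiny synthetic "corpus" (assumed data, not real speech).
corpus = []
for i in range(5):
    corpus.append((synth_utterance(120 + 5 * i, 0.2, seed=i), "low_arousal"))
    corpus.append((synth_utterance(400 + 5 * i, 0.8, seed=100 + i), "high_arousal"))

# Step 2: extract features for every training utterance.
train_features = [extract_features(signal) for signal, _ in corpus]
train_labels = [label for _, label in corpus]

# Step 3: fit the classifier and label an unseen test utterance.
scale, centroids = fit_centroids(train_features, train_labels)
test_signal = synth_utterance(390, 0.7, seed=999)
prediction = classify(extract_features(test_signal), scale, centroids)
print(prediction)
```

A real ASER system would replace each stage with the alternatives the paper surveys, for example a recorded emotional speech corpus, a richer feature set followed by dimension reduction, and a trained classifier.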