International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 63
Published: December 2025
Authors: Ruwini Madhushika Herath
10.5120/ijca2025926040
Ruwini Madhushika Herath. Managing Distribution Shift in Speech Emotion Recognition: An Empirical Study with Confidence-Based Filtering. International Journal of Computer Applications. 187, 63 (December 2025), 26-33. DOI=10.5120/ijca2025926040
@article{ 10.5120/ijca2025926040,
author = { Ruwini Madhushika Herath },
title = { Managing Distribution Shift in Speech Emotion Recognition: An Empirical Study with Confidence-Based Filtering },
journal = { International Journal of Computer Applications },
year = { 2025 },
volume = { 187 },
number = { 63 },
pages = { 26-33 },
doi = { 10.5120/ijca2025926040 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2025
%A Ruwini Madhushika Herath
%T Managing Distribution Shift in Speech Emotion Recognition: An Empirical Study with Confidence-Based Filtering
%J International Journal of Computer Applications
%V 187
%N 63
%P 26-33
%R 10.5120/ijca2025926040
%I Foundation of Computer Science (FCS), NY, USA
Speech emotion recognition (SER) plays an important role in human–computer interaction, healthcare, and customer service. Yet SER models often degrade when applied across genders or to external corpora, which limits their reliability in real-world deployments. This study investigates the robustness of three classical classifiers (Logistic Regression, Random Forests, and XGBoost) under gender and domain shifts, with a focus on confidence-based routing as a mitigation strategy. In-domain experiments showed strong performance for the tree-based ensembles, with Random Forests achieving up to 0.879 accuracy and XGBoost 0.914 on gender-specific training, while Logistic Regression performed poorly (0.478). Cross-domain evaluation on the RAVDESS corpus revealed sharp declines: Random Forest accuracy dropped to 0.466, and XGBoost models failed in cross-gender transfer (0.266–0.311). High-arousal emotions generalized more reliably than low-arousal categories, which were widely misclassified. A confidence-filtering mechanism was introduced to improve reliability. With a threshold of ≥0.60, Random Forest accuracy recovered to 0.811 (macro-F1 = 0.602), but only on the 7% of predictions that met the threshold. Although coverage is limited, this result serves as a proof of concept that selective prediction can recover trustworthy outputs under distribution shift. These findings highlight the limitations of current SER models under distribution shift but also suggest a practical path forward. For both emotion recognition and future stress detection, incorporating confidence-aware routing may be as important as improving raw accuracy, because it enables selective and trustworthy predictions in sensitive applications.
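The confidence-filtering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how confidence is computed, so this example assumes it is the largest predicted class probability for a sample. The function names `route_by_confidence` and `coverage_and_accuracy` and the toy probabilities are hypothetical and are not taken from the study. Only the 0.60 threshold comes from the text.

```python
from typing import List, Optional, Sequence, Tuple


def route_by_confidence(
    probabilities: Sequence[Sequence[float]],
    threshold: float = 0.60,
) -> List[Optional[int]]:
    """Return the predicted class index per sample, or None when abstaining.

    Assumption: "confidence" is the largest class probability for a sample.
    """
    decisions: List[Optional[int]] = []
    for row in probabilities:
        # Index of the most probable class for this sample.
        best_class = max(range(len(row)), key=lambda i: row[i])
        # Keep the prediction only if its probability meets the threshold.
        decisions.append(best_class if row[best_class] >= threshold else None)
    return decisions


def coverage_and_accuracy(
    decisions: Sequence[Optional[int]],
    labels: Sequence[int],
) -> Tuple[float, Optional[float]]:
    """Coverage is the share of samples kept; accuracy uses kept samples only."""
    kept = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(kept) / len(decisions) if decisions else 0.0
    if not kept:
        # No prediction met the threshold, so accuracy is undefined.
        return coverage, None
    accuracy = sum(d == y for d, y in kept) / len(kept)
    return coverage, accuracy


# Toy, made-up probabilities for illustration only (not data from the study).
toy_probabilities = [
    [0.70, 0.20, 0.10],  # confident, kept
    [0.40, 0.35, 0.25],  # below threshold, abstain
    [0.10, 0.65, 0.25],  # confident, kept
    [0.34, 0.33, 0.33],  # below threshold, abstain
]
toy_labels = [0, 1, 2, 0]

decisions = route_by_confidence(toy_probabilities, threshold=0.60)
coverage, accuracy = coverage_and_accuracy(decisions, toy_labels)
print(decisions, coverage, accuracy)
```

Reporting coverage next to accuracy matters here: the abstract's 0.811 accuracy applies to only 7% of predictions, so the remaining samples would need a fallback such as human review or a default response.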