Multilevel Feature Representation for Hybrid Transformers-based Emotion Recognition

Monorama Swain, Bubai Maji, Mustaqeem Khan, Abdulmotaleb El Saddik, Wail Gueaieb

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

Emotion plays a central role in both automated Speech Emotion Recognition (SER) systems and human-computer interaction systems. The effectiveness of an SER module depends on the global and temporal representation of utterances. Research conducted by the authors demonstrates that the temporal information captured by a transformer can significantly improve the overall recognition rate of an SER system. Although hybrid models outperform conventional classifiers, all existing hybrid models have limitations. Moreover, the relationship between different speech cues and the learning of high-level global and temporal cues using a transformer has not been studied thoroughly. This research therefore develops an efficient transformer-based hybrid technique for emotion recognition via multilevel feature representation of speech signals. To learn deeper information from global and temporal representations, the proposed method comprises a parallel convolutional encoder, a spatial encoder, and a sequential encoder. The learned cues then pass through the proposed transformer, which captures the salient information for a specific emotion in the input sequence. We evaluated the proposed approach and achieved state-of-the-art (SOTA) results: weighted accuracies of 75.29% and 88.18% and unweighted accuracies of 76.34% and 88.49% on the IEMOCAP and SITB-OSED corpora, respectively.
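The abstract reports both weighted and unweighted accuracy but does not define them. The following is a minimal sketch, assuming the convention common in SER work: weighted accuracy is the overall fraction of correctly classified utterances, and unweighted accuracy is the mean of the per-class recalls. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all utterances.

    Assumed definition; frequent classes dominate this score.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally.

    Assumed definition; classes are taken from the reference labels.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    total = defaultdict(int)  # utterances per reference class
    hits = defaultdict(int)   # correctly recognized utterances per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example (hypothetical labels, not data from the paper).
y_true = ["neutral"] * 6 + ["angry"] * 2
y_pred = ["neutral"] * 6 + ["angry", "neutral"]

wa = weighted_accuracy(y_true, y_pred)    # 7 of 8 utterances correct
ua = unweighted_accuracy(y_true, y_pred)  # mean of recalls 6/6 and 1/2
print(f"WA = {wa:.3f}, UA = {ua:.3f}")
```

On imbalanced corpora the two scores can diverge, which is one reason SER results are often reported with both.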

Original language: English
Title of host publication: BioSMART 2023 - Proceedings
Subtitle of host publication: 5th International Conference on Bio-Engineering for Smart Technologies
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350338492
DOIs
Publication status: Published - 2023
Externally published: Yes
Event: 5th International Conference on Bio-engineering for Smart Technologies, BioSMART 2023 - Paris, France
Duration: Jun 7 2023 to Jun 9 2023

Publication series

Name: BioSMART 2023 - Proceedings: 5th International Conference on Bio-Engineering for Smart Technologies

Conference

Conference: 5th International Conference on Bio-engineering for Smart Technologies, BioSMART 2023
Country/Territory: France
City: Paris
Period: 6/7/23 to 6/9/23

Keywords

  • Emotion Recognition
  • Human-Computer Interaction
  • Hybrid Transformer
  • Multilevel Feature Representation
  • Speech Signal

ASJC Scopus subject areas

  • Infectious Diseases
  • Psychiatry and Mental health
  • Artificial Intelligence
  • Computer Science Applications
  • Biomedical Engineering
  • Health Informatics
