Speech Emotion Recognition from Spectrograms with Deep Convolutional Neural Network

Abdul Malik Badshah, Jamil Ahmad, Nasir Rahim, Sung Wook Baik

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

354 Citations (Scopus)

Abstract

This paper presents a method for speech emotion recognition using spectrograms and a deep convolutional neural network (CNN). Spectrograms generated from the speech signals are input to the deep CNN. The proposed model, consisting of three convolutional layers and three fully connected layers, extracts discriminative features from spectrogram images and outputs predictions for the seven emotions. In this study, we trained the proposed model on spectrograms obtained from the Berlin emotions dataset. Furthermore, we investigated the effectiveness of transfer learning for emotion recognition using a pre-trained AlexNet model. Preliminary results indicate that the proposed approach, based on a freshly trained model, performs better than the fine-tuned model and is capable of predicting emotions accurately and efficiently.
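As an illustration of the first step the abstract describes (turning a speech signal into a spectrogram that a CNN can take as input), here is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract does not give the frame length, hop size, window, or sample rate, so `frame_len`, `hop`, the Hann window, and the synthetic 1 kHz tone standing in for speech are all assumptions made for this example.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (an assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram computed with a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes
    (the non-negative frequency bins). frame_len and hop are illustrative
    defaults, not values taken from the paper.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to one overlapping frame of the signal.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Direct DFT sum for bin k; O(N^2), fine for a small demo.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic stand-in for a speech recording: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]

spec = spectrogram(tone)
# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

The resulting time-by-frequency grid of magnitudes is what would be rendered as an image and passed to the network. In practice an FFT-based routine from a numerical library would replace the direct DFT sum used here for clarity.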

Original language: English
Title of host publication: 2017 International Conference on Platform Technology and Service, PlatCon 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509051403
DOIs
Publication status: Published - Mar 20 2017
Externally published: Yes
Event: 4th International Conference on Platform Technology and Service, PlatCon 2017 - Busan, Korea, Republic of
Duration: Feb 13 2017 to Feb 15 2017

Publication series

Name: 2017 International Conference on Platform Technology and Service, PlatCon 2017 - Proceedings

Conference

Conference: 4th International Conference on Platform Technology and Service, PlatCon 2017
Country/Territory: Korea, Republic of
City: Busan
Period: 2/13/17 to 2/15/17

Keywords

  • convolutional neural network
  • emotions
  • speech

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Education
