Temporal based Emotion Recognition inspired by Activity Recognition models

Balaganesh Mohan*, Mirela Popa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceeding; Academic; peer-review

Abstract

Affective computing is a subset of the larger field of human-computer interaction, having important connections with cognitive processes and influencing the learning process, decision-making and perception. Among the multiple means of communication, facial expressions are one of the most widely accepted channels for emotion modulation and have received increased attention in recent years. An important factor in recognizing them successfully is modeling the temporal dimension. This paper therefore investigates the applicability of current state-of-the-art action recognition techniques to the human emotion recognition task. Two architectures were investigated: a CNN-based model, the Temporal Shift Module (TSM), which can learn spatiotemporal features in 3D data with the computational complexity of a 2D CNN, and a video-based vision transformer that employs spatio-temporal self-attention. The models were trained and tested on the CREMA-D dataset, demonstrating state-of-the-art performance with mean class accuracies of 82% and 77%, respectively, and outperforming the best previous approaches by at least 3.5%.
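To illustrate how a temporal shift can expose a 2D CNN to neighbouring frames without adding multiply-add operations, the following is a minimal NumPy sketch of the shift operation as it is commonly described for TSM: one group of channels is moved one frame forward in time, another group one frame backward, and clip boundaries are zero-padded. The function name `temporal_shift`, the `fold_div` parameter and the `(T, C, H, W)` layout are assumptions made for illustration; this record does not state the configuration the authors used.

```python
import numpy as np


def temporal_shift(x: np.ndarray, fold_div: int = 8) -> np.ndarray:
    """Shift a fraction of the channels by one frame along the time axis.

    x has shape (T, C, H, W): frames, channels, height, width.
    fold_div is a hypothetical knob: C // fold_div channels are shifted
    in each temporal direction, the rest are left untouched.
    """
    channels = x.shape[1]
    fold = channels // fold_div
    out = np.zeros_like(x)  # zero padding at the clip boundaries

    # First group: frame t receives these channels from frame t + 1.
    out[:-1, :fold] = x[1:, :fold]
    # Second group: frame t receives these channels from frame t - 1.
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]
    # Remaining channels keep their original temporal position.
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out


# Toy clip: 4 frames, 8 channels, 1x1 spatial size; value = frame * 8 + channel.
clip = np.arange(32, dtype=float).reshape(4, 8, 1, 1)
shifted = temporal_shift(clip, fold_div=4)  # 2 channels shifted each way
```

After the shift, an ordinary 2D convolution applied frame by frame mixes information from the previous, current and next frame, because each frame's channel stack now holds features from all three.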
Original language: English
Title of host publication: 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)
Publisher: IEEE
Pages: 1-8
Number of pages: 8
Publication status: Published - 1 Sept 2021
Event: 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos - Nara, Japan
Duration: 28 Sept 2021 to 1 Oct 2021
Conference number: 29

Conference

Conference: 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos
Abbreviated title: ACIIW 2021
Country/Territory: Japan
City: Nara
Period: 28/09/21 to 1/10/21

Keywords

  • Temporal shift module (TSM)
  • Vision transformers
  • Emotion recognition
  • Action recognition
