Abstract
In this paper we present a novel approach to multi-modal emotion recognition on the challenging AFEW'16 dataset, which is composed of video clips labeled with the six basic emotions plus the neutral state. After a preprocessing stage, we employ different feature extraction techniques (CNN, DSIFT on face and facial ROI, geometric, and audio-based) and encode the frame-based features using Fisher vector representations. Next, we leverage the properties of each modality using different fusion schemes. Apart from the early-level fusion and decision-level fusion approaches, we propose a hierarchical decision-level method based on information gain principles, and we optimize its parameters using genetic algorithms. The experimental results support the suitability of our method: we obtain 53.06% validation accuracy, surpassing the baseline of 38.81% by more than 14 percentage points on a challenging dataset suited to emotion recognition in the wild.
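As a rough illustration of the decision-level fusion idea mentioned in the abstract, the sketch below combines per-modality class scores with a weighted sum and tunes the weights with a very small genetic algorithm. It is not the authors' hierarchical, information-gain-based method, whose details the abstract does not give. The function names (`fuse`, `accuracy`, `evolve_weights`), the GA settings, and the toy data are all assumptions made for illustration only.

```python
import random

def fuse(prob_sets, weights):
    """Weighted sum of per-modality class scores for one sample; returns the argmax class."""
    n_classes = len(prob_sets[0])
    fused = [sum(w * p[c] for w, p in zip(weights, prob_sets)) for c in range(n_classes)]
    return max(range(n_classes), key=fused.__getitem__)

def accuracy(samples, labels, weights):
    """Fraction of samples whose fused prediction matches the label."""
    hits = sum(fuse(s, weights) == y for s, y in zip(samples, labels))
    return hits / len(labels)

def evolve_weights(samples, labels, n_modalities, pop_size=20, generations=30, seed=0):
    """Hypothetical minimal GA: truncation selection, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_modalities)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by validation accuracy and keep the better half unchanged.
        pop.sort(key=lambda w: accuracy(samples, labels, w), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n_modalities)
            # Perturb one weight and clamp it to [0, 1].
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: accuracy(samples, labels, w))

# Toy data (invented): modality 0 is informative, modality 1 always leans to class 1.
toy_samples = [
    [[0.9, 0.1], [0.4, 0.6]],  # true class 0
    [[0.1, 0.9], [0.4, 0.6]],  # true class 1
] * 5
toy_labels = [0, 1] * 5

best_weights = evolve_weights(toy_samples, toy_labels, n_modalities=2)
best_accuracy = accuracy(toy_samples, toy_labels, best_weights)
```

In this toy setting the search only has to learn to favor the informative modality. A real system would evaluate fitness on held-out validation clips and would typically fuse more than two modalities.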
Original language | English |
---|---|
Title of host publication | PROCEEDINGS OF THE 2017 INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) |
Publisher | IEEE |
Pages | 814-823 |
Number of pages | 10 |
ISBN (Print) | 9781509064359 |
Publication status | Published - 2017 |
Event | Intelligent Systems Conference (IntelliSys) - London, United Kingdom Duration: 7 Sept 2017 → 8 Sept 2017 https://saiconference.com/Conferences/IntelliSys2017#:~:text=The%202017%20edition%20of%20IntelliSys,intelligent%20systems%2C%20technologies%20and%20applications. |
Conference
Conference | Intelligent Systems Conference (IntelliSys) |
---|---|
Abbreviated title | IntelliSys 2017 |
Country/Territory | United Kingdom |
City | London |
Period | 7/09/17 → 8/09/17 |
Keywords
- Emotion recognition
- multimodal fusion
- information gain
- genetic algorithm
- EXPRESSION