Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition

R. Brinzea*, B. Khaertdinov, S. Asteriadis

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Human Activity Recognition is a field of research where input data can take many forms. Each of the possible input modalities describes human behaviour in a different way, and each has its own strengths and weaknesses. We explore the hypothesis that leveraging multiple modalities can lead to better recognition. Since manual annotation of input data is expensive and time-consuming, we focus on self-supervised methods, which can learn useful feature representations without any ground truth labels. We extend a number of recent contrastive self-supervised approaches to the task of Human Activity Recognition, leveraging inertial and skeleton data. Furthermore, we propose a flexible, general-purpose framework for multimodal self-supervised learning, named Contrastive Multiview Coding with Cross-Modal Knowledge Mining (CMC-CMKM). This framework exploits modality-specific knowledge to mitigate the limitations of typical self-supervised frameworks. Extensive experiments on two widely used datasets demonstrate that the proposed framework significantly outperforms contrastive unimodal and multimodal baselines in different scenarios, including fully-supervised fine-tuning, activity retrieval and semi-supervised learning. Furthermore, its performance is competitive even with supervised methods.
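
To make the contrastive idea in the abstract concrete, the following is a minimal sketch of a symmetric cross-modal contrastive (InfoNCE-style) loss between inertial and skeleton embeddings of the same samples. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value of 0.1 and the use of cosine similarity are all assumptions, and the cross-modal knowledge mining component of CMC-CMKM is not shown.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of selecting candidates[i] as the match for anchors[i].

    Every other candidate in the batch acts as a negative for anchor i.
    The temperature of 0.1 is an assumed value, not taken from the paper.
    """
    anchors = [l2_normalize(a) for a in anchors]
    candidates = [l2_normalize(c) for c in candidates]
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Cosine similarities between this anchor and all candidates.
        logits = [
            sum(x * y for x, y in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def cross_modal_contrastive_loss(inertial_emb, skeleton_emb, temperature=0.1):
    """Symmetric loss: inertial-to-skeleton plus skeleton-to-inertial."""
    forward = info_nce(inertial_emb, skeleton_emb, temperature)
    backward = info_nce(skeleton_emb, inertial_emb, temperature)
    return 0.5 * (forward + backward)


# Hypothetical toy embeddings: row i of each list describes the same sample.
inertial = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
skeleton_aligned = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 0.8]]
skeleton_shuffled = skeleton_aligned[1:] + skeleton_aligned[:1]

aligned_loss = cross_modal_contrastive_loss(inertial, skeleton_aligned)
shuffled_loss = cross_modal_contrastive_loss(inertial, skeleton_shuffled)
```

In this toy example the loss is lower when the two modalities agree on which rows belong together (`aligned_loss`) than when the skeleton rows are shuffled (`shuffled_loss`), which is the training signal a contrastive multiview objective relies on.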
Original language: English
Title of host publication: 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
Publisher: IEEE
Number of pages: 8
ISBN (Print): 9781728186719
Publication status: Published - 2022
Event: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) / IEEE World Congress on Computational Intelligence (IEEE WCCI) / International Joint Conference on Neural Networks (IJCNN) / IEEE Congress on Evolutionary Computation (IEEE CEC) - Padova, Italy
Duration: 18 Jul 2022 to 23 Jul 2022

Publication series

Series: IEEE International Joint Conference on Neural Networks Proceedings
ISSN: 2161-4393

Conference

Conference: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) / IEEE World Congress on Computational Intelligence (IEEE WCCI) / International Joint Conference on Neural Networks (IJCNN) / IEEE Congress on Evolutionary Computation (IEEE CEC)
Country/Territory: Italy
City: Padova
Period: 18/07/22 to 23/07/22

Keywords

  • Human Activity Recognition
  • self-supervised learning
  • multimodal fusion
