Semantic Event Fusion of Different Visual Modality Concepts for Activity Recognition

Carlos F. Crispim-Junior; Vincent Buso; Konstantinos Avgerinakis; Georgios Meditskos; Alexia Briassouli; Jenny Benois-Pineau; Ioannis Kompatsiaris; Francois Bremond

doi:10.1109/TPAMI.2016.2537323

Carlos F. Crispim-Junior*, Vincent Buso, Konstantinos Avgerinakis, Georgios Meditskos, Alexia Briassouli, Jenny Benois-Pineau, Ioannis Kompatsiaris, Francois Bremond

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Combining multimodal concept streams from heterogeneous sensors is a problem superficially explored for activity recognition. Most studies explore simple sensors in nearly perfect conditions, where temporal synchronization is guaranteed. Sophisticated fusion schemes adopt problem-specific graphical representations of events that are generally deeply linked with their training data and focused on a single sensor. This paper proposes a hybrid framework between knowledge-driven and probabilistic-driven methods for event representation and recognition. It separates semantic modeling from raw sensor data by using an intermediate semantic representation, namely concepts. It introduces an algorithm for sensor alignment that uses concept similarity as a surrogate for the inaccurate temporal information of real life scenarios. Finally, it proposes the combined use of an ontology language, to overcome the rigidity of previous approaches at model definition, and a probabilistic interpretation for ontological models, which equips the framework with a mechanism to handle noisy and ambiguous concept observations, an ability that most knowledge-driven methods lack. We evaluate our contributions in multimodal recordings of elderly people carrying out IADLs. Results demonstrated that the proposed framework outperforms baseline methods both in event recognition performance and in delimiting the temporal boundaries of event instances.

Original language	English
Pages (from-to)	1598-1611
Number of pages	14
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	38
Issue number	8
DOIs	https://doi.org/10.1109/TPAMI.2016.2537323
Publication status	Published - Aug 2016
Externally published	Yes

Keywords

FEATURES
Knowledge representation formalism and methods
TIME
activity recognition
concept synchronization
multimedia perceptual system
uncertainty and probabilistic reasoning
vision and scene understanding

Cite this

@article{5203b0aff95d4f86a9e7bdc19b632d1a,

title = "Semantic Event Fusion of Different Visual Modality Concepts for Activity Recognition",

abstract = "Combining multimodal concept streams from heterogeneous sensors is a problem superficially explored for activity recognition. Most studies explore simple sensors in nearly perfect conditions, where temporal synchronization is guaranteed. Sophisticated fusion schemes adopt problem-specific graphical representations of events that are generally deeply linked with their training data and focused on a single sensor. This paper proposes a hybrid framework between knowledge-driven and probabilistic-driven methods for event representation and recognition. It separates semantic modeling from raw sensor data by using an intermediate semantic representation, namely concepts. It introduces an algorithm for sensor alignment that uses concept similarity as a surrogate for the inaccurate temporal information of real life scenarios. Finally, it proposes the combined use of an ontology language, to overcome the rigidity of previous approaches at model definition, and a probabilistic interpretation for ontological models, which equips the framework with a mechanism to handle noisy and ambiguous concept observations, an ability that most knowledge-driven methods lack. We evaluate our contributions in multimodal recordings of elderly people carrying out IADLs. Results demonstrated that the proposed framework outperforms baseline methods both in event recognition performance and in delimiting the temporal boundaries of event instances.",

keywords = "FEATURES, Knowledge representation formalism and methods, TIME, activity recognition, concept synchronization, multimedia perceptual system, uncertainty and probabilistic reasoning, vision and scene understanding",

author = "Crispim-Junior, {Carlos F.} and Vincent Buso and Konstantinos Avgerinakis and Georgios Meditskos and Alexia Briassouli and Jenny Benois-Pineau and Ioannis Kompatsiaris and Francois Bremond",

year = "2016",

month = aug,

doi = "10.1109/TPAMI.2016.2537323",

language = "English",

volume = "38",

pages = "1598--1611",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE Computer Society",

number = "8",

}

TY - JOUR

T1 - Semantic Event Fusion of Different Visual Modality Concepts for Activity Recognition

AU - Crispim-Junior, Carlos F.

AU - Buso, Vincent

AU - Avgerinakis, Konstantinos

AU - Meditskos, Georgios

AU - Briassouli, Alexia

AU - Benois-Pineau, Jenny

AU - Kompatsiaris, Ioannis

AU - Bremond, Francois

PY - 2016/8

Y1 - 2016/8

N2 - Combining multimodal concept streams from heterogeneous sensors is a problem superficially explored for activity recognition. Most studies explore simple sensors in nearly perfect conditions, where temporal synchronization is guaranteed. Sophisticated fusion schemes adopt problem-specific graphical representations of events that are generally deeply linked with their training data and focused on a single sensor. This paper proposes a hybrid framework between knowledge-driven and probabilistic-driven methods for event representation and recognition. It separates semantic modeling from raw sensor data by using an intermediate semantic representation, namely concepts. It introduces an algorithm for sensor alignment that uses concept similarity as a surrogate for the inaccurate temporal information of real life scenarios. Finally, it proposes the combined use of an ontology language, to overcome the rigidity of previous approaches at model definition, and a probabilistic interpretation for ontological models, which equips the framework with a mechanism to handle noisy and ambiguous concept observations, an ability that most knowledge-driven methods lack. We evaluate our contributions in multimodal recordings of elderly people carrying out IADLs. Results demonstrated that the proposed framework outperforms baseline methods both in event recognition performance and in delimiting the temporal boundaries of event instances.

AB - Combining multimodal concept streams from heterogeneous sensors is a problem superficially explored for activity recognition. Most studies explore simple sensors in nearly perfect conditions, where temporal synchronization is guaranteed. Sophisticated fusion schemes adopt problem-specific graphical representations of events that are generally deeply linked with their training data and focused on a single sensor. This paper proposes a hybrid framework between knowledge-driven and probabilistic-driven methods for event representation and recognition. It separates semantic modeling from raw sensor data by using an intermediate semantic representation, namely concepts. It introduces an algorithm for sensor alignment that uses concept similarity as a surrogate for the inaccurate temporal information of real life scenarios. Finally, it proposes the combined use of an ontology language, to overcome the rigidity of previous approaches at model definition, and a probabilistic interpretation for ontological models, which equips the framework with a mechanism to handle noisy and ambiguous concept observations, an ability that most knowledge-driven methods lack. We evaluate our contributions in multimodal recordings of elderly people carrying out IADLs. Results demonstrated that the proposed framework outperforms baseline methods both in event recognition performance and in delimiting the temporal boundaries of event instances.

KW - FEATURES

KW - Knowledge representation formalism and methods

KW - TIME

KW - activity recognition

KW - concept synchronization

KW - multimedia perceptual system

KW - uncertainty and probabilistic reasoning

KW - vision and scene understanding

U2 - 10.1109/TPAMI.2016.2537323

DO - 10.1109/TPAMI.2016.2537323

M3 - Article

C2 - 26955015

SN - 0162-8828

VL - 38

SP - 1598

EP - 1611

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 8

ER -