Semantic Event Fusion of Different Visual Modality Concepts for Activity Recognition

Cited by: 25

Authors:

Crispim-Junior, Carlos F. [1]

Buso, Vincent [2]

Avgerinakis, Konstantinos [3]

Meditskos, Georgios [3]

Briassouli, Alexia [3]

Benois-Pineau, Jenny [2]

Kompatsiaris, Ioannis [3]

Bremond, Francois [1]

Affiliations:

[1] INRIA Sophia Antipolis Mediterranee, STARS Team, Valbonne, France

[2] Univ Bordeaux, LABRI, Talence, France

[3] CERTH ITI, Thessaloniki, Greece

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2016 / Vol. 38 / No. 08

Keywords:

Knowledge representation formalism and methods; uncertainty and probabilistic reasoning; concept synchronization; activity recognition; vision and scene understanding; multimedia perceptual system; TIME

DOI:

10.1109/TPAMI.2016.2537323

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Combining multimodal concept streams from heterogeneous sensors is a problem that has been only superficially explored for activity recognition. Most studies explore simple sensors in nearly perfect conditions, where temporal synchronization is guaranteed. Sophisticated fusion schemes adopt problem-specific graphical representations of events that are generally deeply linked with their training data and focused on a single sensor. This paper proposes a hybrid framework between knowledge-driven and probabilistic-driven methods for event representation and recognition. It separates semantic modeling from raw sensor data by using an intermediate semantic representation, namely concepts. It introduces an algorithm for sensor alignment that uses concept similarity as a surrogate for the inaccurate temporal information of real-life scenarios. Finally, it proposes the combined use of an ontology language, to overcome the rigidity of previous approaches at model definition, and a probabilistic interpretation for ontological models, which equips the framework with a mechanism to handle noisy and ambiguous concept observations, an ability that most knowledge-driven methods lack. We evaluate our contributions on multimodal recordings of elderly people carrying out instrumental activities of daily living (IADLs). Results demonstrate that the proposed framework outperforms baseline methods both in event recognition performance and in delimiting the temporal boundaries of event instances.

Pages: 1598-1611

Page count: 14

