Learning Relational Event Models from Video

Cited by: 29
Authors
Dubba, Krishna S. R. [1]
Cohn, Anthony G. [1]
Hogg, David C. [1]
Bhatt, Mehul [2]
Dylla, Frank [2]
Affiliations
[1] Univ Leeds, Sch Comp, Leeds LS2 9JT, W Yorkshire, England
[2] Univ Bremen, Cognit Syst, SFB TR Spatial Cognit 8, D-28334 Bremen, Germany
Keywords
REPRESENTATION; DEFINITIONS; RECOGNITION;
DOI
10.1613/jair.4395
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Event models obtained automatically from video can be used in applications ranging from abnormal event detection to content-based video retrieval. When multiple agents are involved in the events, characterizing events naturally suggests encoding interactions as relations. Learning event models from this kind of relational spatio-temporal data using relational learning techniques such as Inductive Logic Programming (ILP) holds promise, but such techniques have not yet been successfully applied to the very large datasets that result from video data. In this paper, we present REMIND (Relational Event Model INDuction), a novel framework for supervised relational learning of event models from large video datasets using ILP. Efficiency is achieved through the learning-from-interpretations setting and a typing system that exploits the type hierarchy of objects in a domain. The use of types also helps prevent over-generalization. Furthermore, we present a type-refining operator and prove that it is optimal. The learned models can be used for recognizing events in previously unseen videos. We also present an extension to the framework that integrates an abduction step, which improves learning performance when there is noise in the input data. Experimental results on several hours of video data from two challenging real-world domains (an airport domain and a physical action verbs domain) suggest that the techniques are suitable for real-world scenarios.
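To make the learning-from-interpretations setting mentioned in the abstract concrete, the sketch below shows one way a labelled video clip could be encoded as an interpretation (a set of ground spatio-temporal facts) and tested for coverage by a typed event clause. This is an illustrative Python sketch only, not the REMIND implementation; the predicate names (surrounds, touches), the type labels, and the toy clip are hypothetical.

```python
# Illustrative sketch only (not the REMIND implementation): in the
# "learning from interpretations" setting, each labelled video clip is a
# separate interpretation, i.e. a set of ground facts, and a candidate
# event clause covers the clip if its typed body can be matched against
# those facts. All predicate names, type labels and the toy clip below
# are hypothetical.

from itertools import product

# One airport-style clip as an interpretation: type declarations for the
# tracked objects plus interval-stamped spatial relations between them.
clip = {
    ("type", "trolley1", "trolley"),
    ("type", "plane1", "aircraft"),
    ("surrounds", "plane1", "trolley1", "t1", "t5"),
    ("touches", "trolley1", "plane1", "t2", "t4"),
}

# Candidate event clause body: typed variables plus literals over them.
# Typed variables may only bind to objects declared with a matching type,
# which is how typing curbs over-generalization during clause search.
clause_vars = {"A": "aircraft", "T": "trolley"}
clause_body = [
    ("surrounds", "A", "T", "S1", "E1"),
    ("touches", "T", "A", "S2", "E2"),
]

def objects_of_type(interp, typ):
    """Constants declared with the given type in this interpretation."""
    return [f[1] for f in interp if f[0] == "type" and f[2] == typ]

def covers(interp, variables, body):
    """Naive coverage test: does some type-respecting substitution ground
    every body literal to a fact of the interpretation?"""
    constants = {c for fact in interp for c in fact[1:]}
    untyped = {a for lit in body for a in lit[1:] if a not in variables}
    names = list(variables) + list(untyped)
    domains = [objects_of_type(interp, variables[v]) if v in variables
               else list(constants) for v in names]
    for combo in product(*domains):
        theta = dict(zip(names, combo))
        grounded = {(lit[0],) + tuple(theta.get(a, a) for a in lit[1:])
                    for lit in body}
        if grounded <= interp:
            return True
    return False

print(covers(clip, clause_vars, clause_body))  # True for this toy clip
```

In this setting, a learner would search a space of such typed clauses and retain those covering the positive interpretations but none of the negatives; the abduction step described in the abstract additionally hypothesizes missing facts when detection or tracking noise leaves gaps in an interpretation.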
Pages: 41-90
Number of pages: 50
Related papers (50 in total)
  • [31] Fern, A., Givan, R., Siskind, J. M. Specific-to-general learning for temporal events with application to learning event definitions from video. Journal of Artificial Intelligence Research, 2002, 17: 379-449.
  • [32] Bertini, M., Del Bimbo, A., Serra, G. Learning Rules for Semantic Video Event Annotation. Visual Information Systems: Web-Based Visual Information Search and Management (VISUAL 2008), 2008, 5188: 192-203.
  • [33] Gencaslan, S., Utku, A., Akcayol, M. A. Deep Learning Based Video Event Classification. Journal of Polytechnic-Politeknik Dergisi, 2023, 26(3): 1155-1165.
  • [34] Walker, T., Torrey, L., Shavlik, J., Maclin, R. Building relational world models for reinforcement learning. Inductive Logic Programming, 2008, 4894: 280+.
  • [35] De la Torre, F., Casoliva, J., Cohn, J. F. Learning 3D appearance models from video. Sixth IEEE International Conference on Automatic Face and Gesture Recognition, Proceedings, 2004: 645-650.
  • [36] Bar Hillel, A., Hertz, T., Weinshall, D. Efficient learning of relational object class models. Tenth IEEE International Conference on Computer Vision (Vols 1 and 2), Proceedings, 2005: 1762-1769.
  • [37] Castro, E., Cuadra, D., Velasco, M. From XML to Relational Models. Informatica, 2010, 21(4): 505-519.
  • [38] Rodrigues, C., Soldano, H., Bourgne, G., Rouveirol, C. Multi Agent Learning of Relational Action Models. 21st European Conference on Artificial Intelligence (ECAI 2014), 2014, 263: 1087+.
  • [39] Di Mauro, N., Basile, T. M. A., Ferilli, S., Esposito, F. Optimizing Probabilistic Models for Relational Sequence Learning. Foundations of Intelligent Systems, 2011, 6804: 240-249.
  • [40] Schulte, O., Khosravi, H., Man, T. Learning directed relational models with recursive dependencies. Machine Learning, 2012, 89(3): 299-316.