Video scene analysis: an overview and challenges on deep learning algorithms

Cited by: 0

Authors:

Qaisar Abbas

Mostafa E. A. Ibrahim

M. Arfan Jaffar

Affiliations:

[1] Al Imam Muhamad Ibn Saud Islamic University, Department of Computer Science

[2] Benha University, Benha Faculty of Engineering

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Deep learning; Computer vision; Video processing; Activity classification; Scene interpretation; Video description; Video captioning

DOI:

Not available

Abstract:

Video scene analysis is a recent research topic of vital importance to many applications, such as real-time vehicle activity tracking, pedestrian detection, surveillance, and robotics. Despite its popularity, video scene analysis remains an open and challenging task that requires more accurate algorithms. In the last few years, however, advances in deep learning algorithms for video scene analysis have emerged to address the problem of real-time processing. This paper presents a review of recent developments in deep learning and in video scene analysis problems. It also briefly describes the most recently used datasets along with their limitations. Moreover, the review provides a detailed overview of the particular challenges in real-time video scene analysis, covering work that has contributed to activity recognition, scene interpretation, and video description/captioning. Finally, the paper summarizes future trends and challenges in video scene analysis tasks and offers the authors' insights to inspire further research efforts.

Pages: 20415-20453

Page count: 38
