Exploiting detected visual objects for frame-level video filtering

Cited by: 1
Authors
Du, Xingzhong [1]
Yin, Hongzhi [1]
Huang, Zi [1]
Yang, Yi [2]
Zhou, Xiaofang [1,3]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, St Lucia, Qld 4072, Australia
[2] Univ Technol Sydney, Ctr Artificial Intelligence, Ultimo, NSW 2007, Australia
[3] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Jiangsu, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2018, Vol. 21, No. 5
Funding
National Natural Science Foundation of China; Australian Research Council
Keywords
Frame-level video filtering; Visual object; Accuracy and efficiency evaluation; QUERY; SYSTEM; IMAGE
DOI
10.1007/s11280-017-0505-6
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Videos are generated at an unprecedented speed on the web. To improve the efficiency of access, developing new ways to filter the videos becomes a popular research topic. One on-going direction is using visual objects to perform frame-level video filtering. Under this direction, existing works create the unique object table and the occurrence table to maintain the connections between videos and objects. However, the creation process is not scalable and dynamic because it heavily depends on human labeling. To improve this, we propose to use detected visual objects to create these two tables for frame-level video filtering. Our study begins with investigating the existing object detection techniques. After that, we find object detection lacks the identification and connection abilities to accomplish the creation process alone. To supply these abilities, we further investigate three candidates, namely, recognizing-based, matching-based and tracking-based methods, to work with the object detection. Through analyzing the mechanism and evaluating the accuracy, we find that they are imperfect for identifying or connecting the visual objects. Accordingly, we propose a novel hybrid method that combines the matching-based and tracking-based methods to overcome the limitations. Our experiments show that the proposed method achieves higher accuracy and efficiency than the candidate methods. The subsequent analysis shows that the proposed method can efficiently support the frame-level video filtering using visual objects.
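The two tables named in the abstract can be pictured with a minimal sketch. The abstract does not give their schema, so everything below is an assumption made for illustration: the names `unique_objects`, `occurrences`, `add_detection`, and `filter_frames` are hypothetical, and the detections are entered by hand rather than produced by the paper's hybrid matching and tracking method.

```python
from collections import defaultdict

# Hypothetical layout; the abstract names the tables but not their fields.
unique_objects = {}             # unique object table: object_id -> label
occurrences = defaultdict(set)  # occurrence table: object_id -> {(video_id, frame_no)}


def add_detection(object_id, label, video_id, frame_no):
    """Record that an identified object appears in one frame of one video."""
    unique_objects[object_id] = label
    occurrences[object_id].add((video_id, frame_no))


def filter_frames(object_ids):
    """Return sorted (video_id, frame_no) pairs in which every requested object occurs."""
    ids = list(object_ids)
    if not ids:
        return []
    frames = set(occurrences.get(ids[0], set()))
    for oid in ids[1:]:
        frames &= occurrences.get(oid, set())
    return sorted(frames)
```

Under these assumptions, frame-level filtering reduces to intersecting the occurrence sets of the queried objects, which is why the tables must identify each object consistently across frames and videos.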
Pages: 1259-1284
Page count: 26