Deep Learning Based Semantic Video Indexing and Retrieval

Cited by: 3
Authors
Podlesnaya, Anna [1 ]
Podlesnyy, Sergey [1 ]
Affiliation
[1] Cinema & Photo Res Inst NIKFI, Creat Prod Assoc Gorky Film Studio, Moscow, Russia
Source
PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 2 | 2018 / Vol. 16
Keywords
Video indexing; Video retrieval; Shot boundary detection; Graph database; Semantic features; Convolutional neural networks; Deep learning; MPEG-7;
DOI
10.1007/978-3-319-56991-8_27
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The vast amount of video stored in web archives makes retrieval based on manual text annotations impractical. This study presents a video retrieval system that capitalizes on image recognition techniques. The article discloses the implementation details and empirical evaluation results for a system based entirely on features extracted by convolutional neural networks. It is shown that these features can serve as universal signatures of the semantic content of video and are useful for implementing several types of multimedia retrieval queries defined in the MPEG-7 standard. Further, a graph-based structure for the video index storage is proposed in order to efficiently implement complex spatial and temporal search queries. Thus, the technical approaches proposed in this work may help to build a cost-efficient and user-friendly multimedia retrieval system.
Pages: 359-372
Page count: 14
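
The abstract describes shot-level CNN features serving as semantic signatures and a graph-structured index supporting temporal search queries. Below is a minimal, hypothetical sketch of that idea; the pretrained torchvision ResNet-50 backbone, the networkx graph, and the file names are assumptions chosen for illustration, not the authors' actual system.

```python
# Hypothetical sketch: per-shot CNN signatures stored as nodes of a graph index,
# with "follows" edges between consecutive shots for temporal queries.
import networkx as nx
import torch
from torchvision import models
from PIL import Image

# Pretrained backbone used only as a generic semantic feature extractor (assumed choice).
weights = models.ResNet50_Weights.DEFAULT
cnn = models.resnet50(weights=weights)
cnn.fc = torch.nn.Identity()          # keep the 2048-d penultimate features
cnn.eval()
preprocess = weights.transforms()

def shot_signature(keyframe_path: str) -> torch.Tensor:
    """Semantic signature of a shot = CNN features of its key frame."""
    image = Image.open(keyframe_path).convert("RGB")
    with torch.no_grad():
        return cnn(preprocess(image).unsqueeze(0)).squeeze(0)

# Graph-based index: one node per shot, temporal edges between consecutive shots.
index = nx.DiGraph()
keyframes = ["shot_001.jpg", "shot_002.jpg", "shot_003.jpg"]  # hypothetical key frames
for i, path in enumerate(keyframes):
    index.add_node(i, signature=shot_signature(path), source=path)
    if i > 0:
        index.add_edge(i - 1, i, relation="follows")

# Retrieval by semantic similarity: rank shots by cosine similarity to a query frame.
def query(query_frame: str, top_k: int = 2):
    q = shot_signature(query_frame)
    scores = {
        n: torch.nn.functional.cosine_similarity(q, d["signature"], dim=0).item()
        for n, d in index.nodes(data=True)
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Temporal queries of the kind mentioned in the abstract could then be expressed as graph traversals over the "follows" edges, combining similarity scores of adjacent shot nodes.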