Learning a Video-Text Joint Embedding using Korean Tagged Movie Clips

Times cited: 0
Authors
Hahm, Gyeong-June [1]
Kwak, Chang-Uk [1]
Kim, Sun-Joong [1]
Affiliations
[1] ETRI, Media Res Div, Daejeon, South Korea
Source
11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020) | 2020
Keywords
joint embedding; text-to-video; video retrieval; neural IR;
DOI
10.1109/ictc49870.2020.9289342
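Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract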
Understanding video content is a major challenge for intelligent multimedia services. Existing video retrieval approaches require manually written descriptive sentences in order to retrieve videos that match a user's search intent. To overcome this limitation, it is necessary to model the visual concepts contained in videos and sentences and to learn a mapping of video and text into a common vector space in which relevant videos and texts lie close to each other. In this study, we construct a new dataset containing 250 Korean movies with manual text descriptions in Korean. We also introduce a video-text joint embedding model and present its quantitative and qualitative search results. With the proposed model, manual video tagging is no longer necessary for video retrieval services.
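The retrieval idea in the abstract, ranking videos by their closeness to a text query in a common vector space, can be illustrated with a minimal sketch. It is not the authors' model: the embeddings below are hand-written toy vectors standing in for the outputs of learned video and text encoders, cosine similarity is assumed as the closeness measure, and the names `l2_normalize` and `rank_videos` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def rank_videos(text_emb, video_embs):
    """Rank videos by cosine similarity to a text query (assumed metric).

    text_emb:   shape (d,)   query embedding in the joint space
    video_embs: shape (n, d) video embeddings in the same space
    Returns (indices sorted from most to least similar, similarity scores).
    """
    sims = l2_normalize(video_embs) @ l2_normalize(text_emb)
    return np.argsort(-sims), sims


# Toy embeddings (assumption): in practice these would come from trained encoders.
video_embs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
text_emb = np.array([0.9, 0.1, 0.0])

order, sims = rank_videos(text_emb, video_embs)
print(order)  # video indices, best match first
```

Because both modalities live in the same space, a text query needs no manual tags on the videos at search time; only the encoders that produce the embeddings have to be trained on paired data.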
Pages: 1158-1160
Number of pages: 3