GENERATING COHERENT NATURAL LANGUAGE ANNOTATIONS FOR VIDEO STREAMS

Authors
Khan, Muhammad Usman Ghani [1]
Zhang, Lei [2]
Gotoh, Yoshihiko [1]
Affiliations
[1] Univ Sheffield, Sheffield, S Yorkshire, England
[2] Harbin Engn Univ, Harbin, Heilongjiang, Peoples R China
Source
2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012) | 2012
Keywords
Video processing; Video annotation; Natural language description; Video feature units
DOI
Not available
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
This contribution addresses the generation of natural language annotations for human actions, behaviour and interactions with other objects observed in video streams. The work starts by applying conventional image processing techniques to extract high-level features from individual frames. A natural language description of the frame contents is then produced from these high-level features. Although the feature extraction processes are error-prone at various levels, we explore approaches for combining their outputs into a coherent description. To extend the approach to the description of video streams, units of features are introduced that incorporate spatial and temporal information, so that descriptions are coherent, smooth and well-phrased. Evaluation is made by calculating ROUGE scores between human-annotated and machine-generated descriptions.
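The abstract does not say which ROUGE variant or toolkit was used, so the following is only a minimal sketch of the general idea, assuming ROUGE-N computed as clipped n-gram overlap between one human reference and one machine description. The function names and the two example sentences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Return (recall, precision, F1) for ROUGE-N with clipped overlap counts.

    Assumption: whitespace tokenisation and lowercasing only; real ROUGE
    toolkits may also apply stemming or stop-word removal.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return recall, precision, f1


# Hypothetical human and machine descriptions of the same video segment.
human = "a man is sitting on a chair"
machine = "a man sits on a chair"
recall, precision, f1 = rouge_n(human, machine, n=1)
print(f"ROUGE-1 recall={recall:.3f} precision={precision:.3f} F1={f1:.3f}")
```

Recall is the fraction of the human annotation's n-grams that the machine description recovers, which is the quantity ROUGE was originally built around; precision and F1 are included here only for completeness.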
Pages: 2893-2896
Number of pages: 4