GENERATING COHERENT NATURAL LANGUAGE ANNOTATIONS FOR VIDEO STREAMS

Authors
Khan, Muhammad Usman Ghani [1]
Zhang, Lei [2]
Gotoh, Yoshihiko [1]
Affiliations
[1] Univ Sheffield, Sheffield, S Yorkshire, England
[2] Harbin Engn Univ, Harbin, Heilongjiang, Peoples R China
Source
2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012) | 2012
Keywords
Video processing; Video annotation; Natural language description; video feature units;
DOI
Not available
Chinese Library Classification number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
This contribution addresses the generation of natural language annotations for human actions, behaviour, and interactions with other objects observed in video streams. The work starts by applying conventional image processing techniques to extract high-level features from individual frames. A natural language description of the frame contents is then produced from these high-level features. Although the feature extraction processes are error-prone at various levels, we explore approaches for combining their outputs into a coherent description. To extend the approach to the description of video streams, units of features are introduced that incorporate spatial and temporal information and so yield coherent, smooth, and well-phrased descriptions. Evaluation is performed by calculating ROUGE scores between human-annotated and machine-generated descriptions.
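The abstract does not say which ROUGE variant was used. As a rough illustration of the kind of comparison involved, the sketch below computes ROUGE-N recall, the fraction of reference n-grams that also occur in the generated text. The function names, the whitespace tokenisation, and the two example sentences are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (empty Counter if too short)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count.

    Assumption: lowercased whitespace tokenisation, with no stemming or
    stop-word removal, which real evaluations often apply.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match by the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical human annotation and machine-generated description.
human = "a man is walking in the room"
machine = "a man walks in the room"

rouge1 = rouge_n_recall(human, machine, n=1)
rouge2 = rouge_n_recall(human, machine, n=2)
```

Here five of the seven reference unigrams and three of the six reference bigrams are matched, so `rouge1` is 5/7 and `rouge2` is 0.5. A score over a whole test set would typically average such values across all annotated clips.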
Pages: 2893-2896
Number of pages: 4