REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning

Cited by: 0
Authors
Jiang, Ming [1]
Hu, Junjie [2]
Huang, Qiuyuan [3]
Zhang, Lei [3]
Diesner, Jana [1]
Gao, Jianfeng [3]
Affiliations
[1] Univ Illinois Urbana Champaign, Champaign, IL 61820 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Microsoft Res, Redmond, WA USA
Keywords
GENERATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Popular metrics used for evaluating image captioning systems, such as BLEU and CIDEr, provide a single score to gauge the system's overall effectiveness. This score is often not informative enough to indicate what specific errors a given system makes. In this study, we present REO, a fine-grained evaluation method for automatically measuring the performance of image captioning systems. REO assesses the quality of captions from three perspectives: 1) Relevance to the ground truth, 2) Extraness of the content that is irrelevant to the ground truth, and 3) Omission of the elements in the images and human references. Experiments on three benchmark datasets demonstrate that our method achieves a higher consistency with human judgments and provides more intuitive evaluation results than alternative metrics.
Pages: 1475-1480
Number of pages: 6
Related papers
50 in total
  • [11] Context-Aware Visual Policy Network for Fine-Grained Image Captioning
    Zha, Zheng-Jun
    Liu, Daqing
    Zhang, Hanwang
    Zhang, Yongdong
    Wu, Feng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 710 - 722
  • [12] Image Difference Captioning With Instance-Level Fine-Grained Feature Representation
    Huang, Qingbao
    Liang, Yu
    Wei, Jielong
    Yi, Cai
    Liang, Hanyu
    Leung, Ho-fung
    Li, Qing
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2004 - 2017
  • [13] Fine-grained person-based image captioning via advanced spectrum parsing
    Wu, Jianhui
    Ni, Fan
    Wang, Zijie
    Ju, Haoyu
    Zhang, Yue
    Hu, Fangqiang
    Li, Yifeng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (11) : 34015 - 34030
  • [14] Integration of textual cues for fine-grained image captioning using deep CNN and LSTM
    Gupta, Neeraj
    Jalal, Anand Singh
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (24) : 17899 - 17908
  • [15] Integration of textual cues for fine-grained image captioning using deep CNN and LSTM
    Neeraj Gupta
    Anand Singh Jalal
    Neural Computing and Applications, 2020, 32 : 17899 - 17908
  • [16] Fine-grained person-based image captioning via advanced spectrum parsing
    Jianhui Wu
    Fan Ni
    Zijie Wang
    Haoyu Ju
    Yue Zhang
    Fangqiang Hu
    Yifeng Li
    Multimedia Tools and Applications, 2024, 83 : 34015 - 34030
  • [17] Evaluation of Output Embeddings for Fine-Grained Image Classification
    Akata, Zeynep
    Reed, Scott
    Walter, Daniel
    Lee, Honglak
    Schiele, Bernt
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 2927 - 2936
  • [18] Fine-Grained Image Search
    Xie, Lingxi
    Wang, Jingdong
    Zhang, Bo
    Tian, Qi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (05) : 636 - 647
  • [19] High-Quality Image Captioning With Fine-Grained and Semantic-Guided Visual Attention
    Zhang, Zongjian
    Wu, Qiang
    Wang, Yang
    Chen, Fang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (07) : 1681 - 1693
  • [20] CASCADE ATTENTION FUSION FOR FINE-GRAINED IMAGE CAPTIONING BASED ON MULTI-LAYER LSTM
    Wang, Shuang
    Meng, Yun
    Gu, Yu
    Zhang, Lei
    Ye, Xiutiao
    Tian, Jingxian
    Jiao, Licheng
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2245 - 2249