BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment

Cited by: 0
Authors
Lebron, Luis [1 ]
Graham, Yvette [2 ]
McGuinness, Kevin [1 ]
Kouramas, Konstantinos [3 ]
O'Connor, Noel E. [1 ]
Affiliations
[1] Dublin City University, Insight SFI Research Centre for Data Analytics, Dublin, Ireland
[2] Trinity College Dublin, School of Computer Science & Statistics, Dublin, Ireland
[3] Collins Aerospace, Charlotte, NC, USA
Source
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022
Funding
Science Foundation Ireland
Keywords
Video captioning; NLP; deep learning; learned metric
DOI
Not available
Chinese Library Classification
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Evaluating video captioning systems is a challenging task that involves several distinct considerations: the fluency of the caption, the multiple actions taking place within a single scene, and an estimate of what a human viewer would consider important in the video. Most metrics measure how similar a system-generated caption is to a single human-generated caption or to a set of them. This paper presents a new method, based on a deep learning model, to evaluate these systems. The model builds on the BERT language model, which has been shown to work well across a range of NLP tasks. The aim is for the model to learn to perform an evaluation similar to that of a human. To do so, we use a dataset that contains human evaluations of system-generated captions. The dataset consists of human judgments of the quality of captions produced by the systems participating in past TRECVid video-to-text tasks (Smeaton et al., 2006). These annotations will be made publicly available. The new video captioning evaluation metric, BERTHA, obtains favourable results, outperforming commonly applied metrics in some setups.
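To make the approach concrete, the sketch below shows one plausible shape for such a transfer-learned metric: a BERT encoder with a small regression head, fine-tuned to predict human quality judgments for (reference, candidate) caption pairs. This is an illustrative assumption based on the abstract, not the authors' released BERTHA implementation; the model name, the pairing scheme, and the MSE objective are choices made for the example.

# Minimal sketch of a BERT-based learned caption-quality metric
# (illustrative only; not the authors' BERTHA code).
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class LearnedCaptionMetric(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        # Map the [CLS] representation to a scalar quality score.
        self.regressor = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        return self.regressor(out.last_hidden_state[:, 0]).squeeze(-1)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = LearnedCaptionMetric()

# Encode a human reference and a system caption as one sequence pair.
batch = tokenizer("a man plays a guitar on stage",       # reference
                  "a person is playing an instrument",   # system caption
                  return_tensors="pt")
score = model(**batch)  # predicted human-style quality judgment

# Training loop (not shown): minimise nn.MSELoss() between predicted
# scores and human judgments such as those collected for TRECVid.

Encoding the pair as a single BERT sequence lets self-attention compare reference and candidate tokens directly, which is one common design for learned text-similarity metrics.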
Pages: 1566-1575 (10 pages)
Related Papers
27 in total; the first 10 are listed below
• [1] Anderson, P., Fernando, B., Johnson, M., Gould, S. (2016). SPICE: Semantic Propositional Image Caption Evaluation. In Computer Vision - ECCV 2016, Part V, LNCS 9909, pp. 382-398.
• [2] Smeaton, A. F., Over, P., Kraaij, W. (2006). Evaluation campaigns and TRECVid. In Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval. DOI: 10.1145/1178677.1178722.
• [3] Awad, G. (2021). In Proceedings of TRECVID 2020.
• [4] Awad, G. (2017). In Proceedings of the TRECVID Workshop.
• [5] Awad, G. (2020). arXiv:2009.09984.
• [6] Awad, G. (2018). In Proceedings of TRECVID 2018, Gaithersburg, MD.
• [7] Banerjee, S. (2005). In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, p. 65.
• [8] Brown, T. B. (2020). In Proceedings of the 34th International Conference on Neural Information Processing Systems.
• [9] Heilbron, F. C. (2015). In Proceedings of CVPR (IEEE), p. 961. DOI: 10.1109/CVPR.2015.7298698.
• [10] Chen, X. L. (2015). CoRR, arXiv:1504.00325.