BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment

Cited by: 0
Authors
Lebron, Luis [1 ]
Graham, Yvette [2 ]
McGuinness, Kevin [1 ]
Kouramas, Konstantinos [3 ]
O'Connor, Noel E. [1 ]
Affiliations
[1] Dublin City University, Insight SFI Research Centre for Data Analytics, Dublin, Ireland
[2] Trinity College Dublin, School of Computer Science & Statistics, Dublin, Ireland
[3] Collins Aerospace, Charlotte, NC, USA
Source
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022
Funding
Science Foundation Ireland
Keywords
Video captioning; NLP; deep learning; learned metric
DOI
Not available
Chinese Library Classification
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Evaluating video captioning systems is a challenging task that involves several distinct considerations: the fluency of the caption, the multiple actions taking place within a single scene, and an estimate of what a human viewer would consider important in the video. Most metrics measure how similar a system-generated caption is to a single human-generated caption or to a set of them. This paper presents a new method, based on a deep learning model, to evaluate these systems. The model builds on the BERT language model, which has been shown to work well across a range of NLP tasks. The aim is for the model to learn to perform an evaluation similar to that of a human. To do so, we use a dataset that contains human evaluations of system-generated captions. The dataset consists of human judgments of the quality of captions produced by the systems participating in past TRECVid video-to-text tasks (Smeaton et al., 2006). These annotations will be made publicly available. The new video captioning evaluation metric, BERTHA, obtains favourable results, outperforming commonly applied metrics in some setups.
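To make the approach concrete, the sketch below shows one plausible shape for such a transfer-learned metric: a BERT encoder with a small regression head, fine-tuned to predict human quality judgments for (reference, candidate) caption pairs. This is an illustrative assumption based on the abstract, not the authors' released BERTHA implementation; the model name, the pairing scheme, and the MSE objective are choices made for the example.

# Minimal sketch of a BERT-based learned caption-quality metric
# (illustrative only; not the authors' BERTHA code).
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class LearnedCaptionMetric(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        # Map the [CLS] representation to a scalar quality score.
        self.regressor = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        return self.regressor(out.last_hidden_state[:, 0]).squeeze(-1)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = LearnedCaptionMetric()

# Encode a human reference and a system caption as one sequence pair.
batch = tokenizer("a man plays a guitar on stage",       # reference
                  "a person is playing an instrument",   # system caption
                  return_tensors="pt")
score = model(**batch)  # predicted human-style quality judgment

# Training loop (not shown): minimise nn.MSELoss() between predicted
# scores and human judgments such as those collected for TRECVid.

Encoding the pair as a single BERT sequence lets self-attention compare reference and candidate tokens directly, which is one common design for learned text-similarity metrics.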
Pages: 1566-1575 (10 pages)
Related Papers
27 in total; the first 10 are listed below
• [1] Anderson, P., Fernando, B., Johnson, M., Gould, S. (2016). SPICE: Semantic Propositional Image Caption Evaluation. In Computer Vision - ECCV 2016, Part V, LNCS 9909, pp. 382-398.
• [2] Smeaton, A. F., Over, P., Kraaij, W. (2006). Evaluation campaigns and TRECVid. In Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval. DOI: 10.1145/1178677.1178722.
• [3] Awad, G. (2021). In Proceedings of TRECVID 2020.
• [4] Awad, G. (2017). In Proceedings of the TRECVID Workshop.
• [5] Awad, G. (2020). arXiv:2009.09984.
• [6] Awad, G. (2018). In Proceedings of TRECVID 2018, Gaithersburg, MD.
• [7] Banerjee, S. (2005). In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, p. 65.
• [8] Brown, T. B. (2020). In Proceedings of the 34th International Conference on Neural Information Processing Systems.
• [9] Heilbron, F. C. (2015). In Proceedings of CVPR (IEEE), p. 961. DOI: 10.1109/CVPR.2015.7298698.
• [10] Chen, X. L. (2015). CoRR, arXiv:1504.00325.