Unsupervised Evaluation of Interactive Dialog with DialoGPT

Cited by: 0

Authors:

Mehri, Shikib [1]

Eskenazi, Maxine [1]

Affiliations:

[1] Carnegie Mellon Univ, Dialog Res Ctr, Language Technol Inst, Pittsburgh, PA 15213 USA

Source:

SIGDIAL 2020: 21ST ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2020) | 2020

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research. Standard language generation metrics have been shown to be ineffective for dialog. This paper introduces the FED metric (fine-grained evaluation of dialog), an automatic evaluation metric which uses DialoGPT, without any fine-tuning or supervision. It also introduces the FED dataset which is constructed by annotating a set of human-system and human-human conversations with eighteen fine-grained dialog qualities. The FED metric (1) does not rely on a ground-truth response, (2) does not require training data and (3) measures fine-grained dialog qualities at both the turn and whole dialog levels. FED attains moderate to strong correlation with human judgement at both levels.
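As an illustration of what "correlation with human judgement" means in practice, the following minimal sketch computes a rank correlation between a metric's scores and human ratings. It assumes Spearman rank correlation, which the abstract does not name, and the scores and ratings are made-up values for illustration only, not data from the FED dataset. The function names are hypothetical.

```python
def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical values: one automatic score and one human rating per dialog.
metric_scores = [0.1, 0.4, 0.35, 0.8, 0.7]
human_ratings = [1, 3, 2, 4, 5]

print(round(spearman(metric_scores, human_ratings), 3))
```

A value near 1 means the metric orders the dialogs much as the human annotators do; a value near 0 means the two orderings are unrelated.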


Pages: 225-235

Page count: 11

