Unsupervised Evaluation of Interactive Dialog with DialoGPT

Cited by: 0
Authors
Mehri, Shikib [1]
Eskenazi, Maxine [1]
Affiliations
[1] Carnegie Mellon Univ, Dialog Res Ctr, Language Technol Inst, Pittsburgh, PA 15213 USA
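Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract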
It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research. Standard language generation metrics have been shown to be ineffective for dialog. This paper introduces the FED metric (fine-grained evaluation of dialog), an automatic evaluation metric which uses DialoGPT, without any fine-tuning or supervision. It also introduces the FED dataset which is constructed by annotating a set of human-system and human-human conversations with eighteen fine-grained dialog qualities. The FED metric (1) does not rely on a ground-truth response, (2) does not require training data and (3) measures fine-grained dialog qualities at both the turn and whole dialog levels. FED attains moderate to strong correlation with human judgement at both levels.
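The abstract does not spell out how a score is computed without a ground-truth response. A minimal sketch of one way such a reference-free score can be formed, assuming the dialog model is asked how likely it finds positive versus negative follow-up utterances after a given context. The names `follow_up_score`, `LogLikelihood` and `toy_log_likelihood` are hypothetical, the mean-difference aggregation is an assumption, and the toy scorer stands in for a real model such as DialoGPT so the example runs without any download.

```python
from typing import Callable, Sequence

# Hypothetical interface: returns log P(utterance | context) under some dialog model.
LogLikelihood = Callable[[str, str], float]


def follow_up_score(
    context: str,
    positive: Sequence[str],
    negative: Sequence[str],
    log_likelihood: LogLikelihood,
) -> float:
    """Mean log-likelihood of positive follow-ups minus that of negative ones.

    A higher value means the model finds approving follow-ups more plausible
    than disapproving ones after this context. No reference response is used.
    """
    if not positive or not negative:
        raise ValueError("need at least one positive and one negative follow-up")
    pos = sum(log_likelihood(context, u) for u in positive) / len(positive)
    neg = sum(log_likelihood(context, u) for u in negative) / len(negative)
    return pos - neg


def toy_log_likelihood(context: str, utterance: str) -> float:
    """Stand-in scorer (assumption, not a language model).

    Penalizes each utterance word that does not appear in the context.
    """
    seen = set(context.lower().split())
    return -float(sum(1 for w in utterance.lower().split() if w not in seen))


if __name__ == "__main__":
    score = follow_up_score(
        "that movie was great and interesting",
        positive=["great"],
        negative=["boring"],
        log_likelihood=toy_log_likelihood,
    )
    print(score)
```

In practice `toy_log_likelihood` would be replaced by a function that queries a pretrained dialog model. One follow-up set per dialog quality would give the fine-grained scores the abstract describes.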
Pages: 225-235
Number of pages: 11
Related papers
  • [1] Interactive Evaluation of Dialog Track at DSTC9
    Mehri, Shikib
    Feng, Yulan
    Gordon, Carla
    Alavi, Seyed Hossein
    Traum, David
    Eskenazi, Maxine
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 5731 - 5738
  • [2] USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
    Mehri, Shikib
    Eskenazi, Maxine
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 681 - 707
  • [3] Evaluation of distance interactive learning in obstetrics and gynaecology (DIALOG)
    Jha, V
    Duffy, S
    McAleer, S
    BJOG-AN INTERNATIONAL JOURNAL OF OBSTETRICS AND GYNAECOLOGY, 2002, 109 (04) : 456 - 461
  • [4] Unsupervised Dialog Structure Learning
    Shi, Weiyan
    Zhao, Tiancheng
    Yu, Zhou
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 1797 - 1807
  • [5] Interactive visual dialog
    Arbel, T
    Ferrie, FP
    IMAGE AND VISION COMPUTING, 2002, 20 (9-10) : 639 - 646
  • [6] RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems
    Tao, Chongyang
    Mou, Lili
    Zhao, Dongyan
    Yan, Rui
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 722 - 729
  • [7] Unsupervised Learning of Interpretable Dialog Models
    Madan, Dhiraj
    Raghu, Dinesh
    Pandey, Gaurav
    Joshi, Sachindra
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2101 - 2107
  • [8] STRUCTURE AND CONTENT IN INTERACTIVE DIALOG
    LONG, JB
    HAMMOND, NV
    BARNARD, P
    MORTON, J
    CLARK, IA
    ERGONOMICS, 1981, 24 (03) : 230 - 230
  • [9] Interactive Video Retrieval with Dialog
    Maeoki, Sho
    Uehara, Kohei
    Harada, Tatsuya
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 4091 - 4099
  • [10] A dialog based interactive robot
    Sagerer, G
    BUILDING THE INFORMATION SOCIETY, 2004, 156 : 749 - 750