How to Keep an Online Learning Chatbot From Being Corrupted

Cited by: 4

Authors:

Chai, Yixuan [1]

Liu, Guohua [1]

Jin, Ziwei [2]

Sun, Donghong [3]

Affiliations:

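[1] Donghua University, School of Computer Science and Technology, Shanghai, People's Republic of China

[2] Ohio State University, Department of Computer Science and Engineering, Columbus, OH, USA

[3] Tsinghua University, Institute for Network Sciences and Cyberspace, Beijing, People's Republic of China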

Source:

2020 International Joint Conference on Neural Networks (IJCNN) | 2020

Funding:

National Key Research and Development Program of China

Keywords:

offensive response; online learning chatbot; reinforcement learning

DOI:

10.1109/ijcnn48605.2020.9206897

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Online learning can improve chatbots' conversational abilities. Although online learning increases the diversity of a chatbot's statements, it also creates opportunities for corruption: a chatbot may be corrupted into generating offensive responses, such as racist remarks and hate speech. The key to keeping chatbots from being corrupted is offensive-response detection. Until now, training datasets for offensive-language detection have covered only individual response sentences and disregarded the user input that prompted them. In this paper, we introduce a dialogue-based offensive-response dataset consisting of 110K input-response chat records, which fills this gap in response detection for chatbots. We then build two challenging tasks on the dataset: an offensive-response detection task and a corrupted-chatbot purification task. In addition, we propose a strong benchmark method for these tasks: an encoder-classifier model that classifies input-response pairs, and a one-shot reinforcement learning (RL) method that rapidly reduces the probability of generating offensive responses.

Pages: 8