How to Keep an Online Learning Chatbot From Being Corrupted

Cited by: 4
Authors
Chai, Yixuan [1]
Liu, Guohua [1]
Jin, Ziwei [2]
Sun, Donghong [3]
Affiliations
[1] Donghua Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[2] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH USA
[3] Tsinghua Univ, Inst Network Sci & Cyberspace, Beijing, Peoples R China
Source
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020
Funding
National Key Research and Development Program of China;
Keywords
offensive response; online learning chatbot; reinforcement learning;
DOI
10.1109/ijcnn48605.2020.9206897
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Online learning can improve chatbots' conversational abilities. Although online learning enhances the diversity of chatbots' statements, it also creates opportunities for corruption: a chatbot may be corrupted into generating offensive responses such as racist remarks and hate speech. The key component in keeping chatbots from being corrupted is offensive-response detection. Until now, training datasets for offensive-language detection have focused only on individual response sentences, disregarding the user input sentences. In this paper, we introduce a dialogue-based offensive-response dataset consisting of 110K input-response chat records. The dataset fills this gap in response detection for chatbots. We then build two challenging tasks on the dataset: an offensive-response detection task and a corrupted-chatbot purification task. In addition, we propose a strong benchmark method for the tasks: an encoder-classifier model to detect offensive input-response pairs and a one-shot reinforcement learning (RL) method to rapidly reduce the probability of generating offensive responses.
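As a rough illustration of the purification idea in the abstract, the sketch below applies a single REINFORCE-style policy-gradient step that penalizes one offensive response. It is not the paper's one-shot RL method, whose details are not given in this record. The softmax policy over three candidate responses, the logit values, the reward of -1, the learning rate, and the function names are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def penalize_response(logits, action, reward, lr=1.0):
    """One REINFORCE-style step on the logits of a softmax policy.

    For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i,
    so each logit moves by lr * reward * (1[i == action] - pi_i).
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


# Hypothetical scores for three candidate responses; index 0 is assumed
# to have been flagged as offensive by some detector.
logits = [2.0, 0.5, 0.1]
offensive = 0

before = softmax(logits)[offensive]
updated = penalize_response(logits, offensive, reward=-1.0, lr=2.0)
after = softmax(updated)[offensive]

print(f"P(offensive) before: {before:.3f}, after: {after:.3f}")
```

A negative reward lowers the flagged response's logit and raises the others, so its probability under the updated policy drops after one step.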
Pages: 8