Customizable text generation via conditional text generative adversarial network

Cited by: 24
Authors
Chen, Jinyin [1]
Wu, Yangyang [1]
Jia, Chengyu [1]
Zheng, Haibin [1]
Huang, Guohan [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Informat Engn, Hangzhou 310023, Peoples R China
Keywords
Text generation; Variable length; Emotion label; Conditional text generative adversarial network; Systems
DOI
10.1016/j.neucom.2018.12.092
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatically generating meaningful and coherent text has many applications, such as machine translation, dialogue systems, and bot applications. Text generation technology has attracted increasing attention over the past decades. Many excellent methods have been proposed; however, generating text that rivals human-written text remains challenging: most systems output fixed-length text, or can only generate text that closely resembles the training text. In this paper, we put forward a novel text generation system, called customizable conditional text generative adversarial network, which is capable of generating diverse text content of variable length with a customizable emotion label. This makes it more convenient to generate original text with a specific sensitive orientation. We propose a conditional text generative adversarial network (CTGAN), in which the emotion label is adopted as an input channel to specify the output text, and a variable-length text generation strategy is put forward. After CTGAN generates the initial texts, to make the generated text match real scenarios, we design an automated word-level replacement strategy, which extracts keywords (e.g., nouns) from the training texts and replaces specific keywords in the generated texts. Finally, we design a comprehensive evaluation metric based on various text evaluations, called the mixed evaluation metric. Comprehensive experiments on real-world datasets show that the proposed CTGAN performs better than other text generation methods, i.e., its generated text is closer to real text than that of other generation methods, achieving state-of-the-art generation performance. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 125-135
Number of pages: 11