Evolving Character-Level DenseNet Architectures Using Genetic Programming

Cited by: 4
Authors
Londt, Trevor [1 ]
Gao, Xiaoying [1 ]
Andreae, Peter [1 ]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand
Source
APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2021 | 2021, Vol. 12694
Keywords
Character-level DenseNet; Evolutionary deep learning; Genetic programming; Text classification;
D O I
10.1007/978-3-030-72699-7_42
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Densely Connected Convolutional Networks (DenseNet) have demonstrated impressive performance on image classification tasks, but limited research has been conducted on using character-level DenseNet (char-DenseNet) architectures for text classification, and it is not clear which DenseNet architectures are optimal for such tasks. The iterative process of designing, training and testing char-DenseNets is time-consuming and requires expert domain knowledge. Evolutionary deep learning (EDL) has been used to automatically design CNN architectures for the image classification domain, thereby mitigating the need for expert domain knowledge. This study presents the first work on using EDL to evolve char-DenseNet architectures for text classification tasks. A novel genetic programming-based algorithm (GP-Dense), coupled with an indirect-encoding scheme, facilitates the evolution of performant char-DenseNet architectures. The algorithm is evaluated on two popular text datasets, and the best-evolved models are benchmarked against four current state-of-the-art character-level CNN and DenseNet models. Results indicate that the algorithm evolves performant models for both datasets, outperforming two of the state-of-the-art models in terms of model accuracy and three of the state-of-the-art models in terms of parameter size.
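The core idea the abstract describes, an indirect encoding decoded into a DenseNet-style architecture and refined by an evolutionary loop, can be illustrated with a minimal sketch. This is not the paper's GP-Dense algorithm: the gene ranges, the decoding rule, and the surrogate fitness function below are all assumptions made purely for illustration (real fitness would come from training and validating the decoded network).

```python
import random

# Hypothetical indirect encoding: a genotype of integers is decoded into a
# DenseNet-style specification (growth rate + layers per dense block).
# All ranges here are illustrative assumptions, not the paper's scheme.
GROWTH_RATES = [12, 24, 32, 48]

def decode(genotype):
    """Map a genotype (list of ints) to an architecture specification."""
    growth = GROWTH_RATES[genotype[0] % len(GROWTH_RATES)]
    blocks = [2 + g % 6 for g in genotype[1:]]  # 2-7 layers per dense block
    return {"growth_rate": growth, "blocks": blocks}

def fitness(genotype):
    # Stand-in for train-then-validate accuracy: here we simply reward
    # architectures whose total depth is close to an arbitrary target of 12.
    depth = sum(decode(genotype)["blocks"])
    return -abs(depth - 12)

def evolve(pop_size=20, genome_len=4, generations=30, seed=0):
    """Truncation selection + point mutation over the indirect encoding."""
    rng = random.Random(seed)
    pop = [[rng.randrange(100) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the fitter half
        children = []
        for parent in parents:
            child = parent[:]
            child[rng.randrange(genome_len)] = rng.randrange(100)  # mutate one gene
            children.append(child)
        pop = parents + children
    return decode(max(pop, key=fitness))

best = evolve()
print(best)  # e.g. {'growth_rate': ..., 'blocks': [...]}
```

The decoding step is what makes the encoding "indirect": variation operators act on a compact genotype, while fitness is measured on the decoded architecture, so small genotype changes can explore structurally different networks.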
Pages: 665-680 (16 pages)