A study on word vector dimensions for sentence classifications using convolutional neural networks

Cited by: 0
Authors
Takuya S. [1]
Satoshi Y. [2]
Affiliations
[1] Graduate School of Information and Computer Science, Chiba Institute of Technology, 2-17-1, Tsudanuma, Narashino, Chiba
[2] Dept. of Computer Science, Chiba Institute of Technology, 2-17-1, Tsudanuma, Narashino, Chiba
Keywords
Convolutional neural networks; Distributed representation of words; Natural language processing; Sentence classification; Word2vec
DOI
10.1541/ieejeiss.139.1066
Abstract
Recently, convolutional neural networks (CNNs) have achieved remarkable results on sentence classification problems. In these approaches, each word in a sentence is transformed into a real-valued vector (called a word vector), and each sentence given as input to the CNN is represented by a sequence of word vectors. A dataset for training and testing the CNN includes a large number of words, so the word vectors are embedded in a high-dimensional space. As a result, the dimension of the CNN's input data space becomes very high. When the input data are high-dimensional, a large amount of training data is required to train the CNN sufficiently. It is not always possible, however, to obtain enough training data. If enough data cannot be prepared for learning, it is desirable to decrease the dimension of the input data. This paper reports the results of applying lower-dimensional word vectors to sentence classification by CNNs. The results show that some dimensionality reduction does not greatly affect the accuracy of sentence classification by CNNs. © 2019 The Institute of Electrical Engineers of Japan.
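The input representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random vectors stand in for trained word2vec embeddings, and the function names, the padding length of 8, and the dimensions 300, 100, and 50 are assumptions chosen only to show how the size of the CNN input scales with the word vector dimension.

```python
import random


def build_embeddings(vocab, dim, seed=0):
    """Assign each word a random real-valued vector of length `dim`.

    Hypothetical stand-in for trained word2vec vectors.
    """
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}


def sentence_to_matrix(sentence, embeddings, max_len):
    """Represent a sentence as a max_len x dim matrix of word vectors.

    Unknown words and padding positions are filled with zero vectors.
    """
    dim = len(next(iter(embeddings.values())))
    tokens = sentence.lower().split()[:max_len]
    rows = [embeddings.get(t, [0.0] * dim) for t in tokens]
    rows += [[0.0] * dim for _ in range(max_len - len(rows))]
    return rows


sentence = "the movie was surprisingly good"
vocab = sentence.split()

# Illustrative dimensions only; the abstract does not state which were tested.
for dim in (300, 100, 50):
    emb = build_embeddings(vocab, dim)
    mat = sentence_to_matrix(sentence, emb, max_len=8)
    # Number of input values per sentence = max_len * dim
    print(dim, len(mat) * len(mat[0]))
```

With a fixed sentence length, the number of input values per sentence grows linearly with the word vector dimension, which is why lower-dimensional word vectors shrink the input space the CNN has to learn from.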
Pages: 1066-1079
Page count: 13