Low-Dimensional Vector Representation Learning for Text Visualization Using Task-Oriented Dialogue Dataset

Cited by: 0
Authors
Hwang T. [1 ]
Jung S. [1 ]
Roh Y.-H. [2 ]
Affiliations
[1] Computer Science & Engineering, Chungnam National University, Daejeon
[2] Language Intelligence Research Lab., Electronics and Telecommunications Research Institute, Daejeon
Funding
National Research Foundation, Singapore
Keywords
Natural language processing; Natural language understanding; Task-oriented dialogue dataset; Text visualization; Vector representation learning;
DOI
10.5626/JCSE.2022.16.3.165
Abstract
Text visualization is a technique that aids data understanding and insight, but it may lead to loss of information. Through the proposed low-dimensional vector representation learning method, deep learning and visualization are performed simultaneously by constructing a low-dimensional vector space. The method transforms a task-oriented dialogue dataset into low-dimensional coordinates, from which a deep learning vector space can be constructed. The low-dimensional vector representation deep learning model identified the intent of sentences and predicted sentence components well in 3 out of 5 datasets. In addition, inspecting the prediction results in the low-dimensional vector space improved understanding of the data, for example by revealing structure in, or errors within, the data. © 2022. The Korean Institute of Information Scientists and Engineers
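The abstract's core idea, jointly training a model whose learned representation is itself low-dimensional enough to plot, can be sketched with a toy example. The sketch below is an illustrative assumption, not the paper's model or datasets: invented sentences and intents, a bag-of-words encoder, and a 2-D linear bottleneck trained against intent labels, so each sentence's 2-D coordinates are shaped by the prediction task and can be visualized directly.

```python
import numpy as np

# Hypothetical toy data (not from the paper's task-oriented dialogue datasets).
sentences = [
    ("book a flight to seoul", 0),      # intent 0: book_flight
    ("reserve a plane ticket", 0),
    ("what is the weather today", 1),   # intent 1: get_weather
    ("will it rain tomorrow", 1),
    ("play some jazz music", 2),        # intent 2: play_music
    ("put on my favorite song", 2),
]

# Bag-of-words encoding of each sentence.
vocab = sorted({w for s, _ in sentences for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(sentences), len(vocab)))
y = np.array([label for _, label in sentences])
for r, (s, _) in enumerate(sentences):
    for w in s.split():
        X[r, idx[w]] += 1.0

rng = np.random.default_rng(0)
dim, n_intents = 2, 3
W1 = rng.normal(scale=0.1, size=(len(vocab), dim))  # encoder: text -> 2-D coords
W2 = rng.normal(scale=0.1, size=(dim, n_intents))   # intent classifier head

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train encoder and classifier jointly with gradient descent on cross-entropy,
# so the 2-D layer is shaped by the intent-prediction task.
lr = 0.5
for _ in range(500):
    H = X @ W1                               # low-dimensional coordinates
    P = softmax(H @ W2)                      # predicted intent distribution
    G = (P - np.eye(n_intents)[y]) / len(y)  # cross-entropy gradient wrt logits
    W2 -= lr * (H.T @ G)
    W1 -= lr * (X.T @ (G @ W2.T))

coords = X @ W1   # each sentence is now a plottable 2-D point
preds = softmax(coords @ W2).argmax(axis=1)
```

After training, `coords` can be scattered directly (e.g., with matplotlib) without a separate projection step such as t-SNE or UMAP, which mirrors the paper's motivation of avoiding a lossy post-hoc reduction; mispredicted points in the plot would hint at structure or labeling errors in the data.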
Pages: 165-177
Page count: 12