Low-Dimensional Vector Representation Learning for Text Visualization Using Task-Oriented Dialogue Dataset

Cited by: 0
Authors
Hwang T. [1 ]
Jung S. [1 ]
Roh Y.-H. [2 ]
Affiliations
[1] Computer Science & Engineering, Chungnam National University, Daejeon
[2] Language Intelligence Research Lab., Electronics and Telecommunications Research Institute, Daejeon
Funding
National Research Foundation, Singapore
Keywords
Natural language processing; Natural language understanding; Task-oriented dialogue dataset; Text visualization; Vector representation learning;
DOI
10.5626/JCSE.2022.16.3.165
Abstract
Text visualization is a technique that aids data understanding and insight, but it may lead to loss of information. Through the proposed low-dimensional vector representation learning method, deep learning and visualization are performed simultaneously by constructing a low-dimensional vector space. The method transforms a task-oriented dialogue dataset into low-dimensional coordinates, from which a deep learning vector space can be constructed. The low-dimensional vector representation deep learning model identified the intent of sentences and predicted sentence components well in 3 out of 5 datasets. In addition, inspecting the prediction results in the low-dimensional vector space improved understanding of the data, for example by revealing structure in, or errors within, the data. © 2022. The Korean Institute of Information Scientists and Engineers
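The abstract's core idea, jointly training a model whose learned representation is itself low-dimensional enough to plot, can be sketched with a toy example. The sketch below is an illustrative assumption, not the paper's model or datasets: invented sentences and intents, a bag-of-words encoder, and a 2-D linear bottleneck trained against intent labels, so each sentence's 2-D coordinates are shaped by the prediction task and can be visualized directly.

```python
import numpy as np

# Hypothetical toy data (not from the paper's task-oriented dialogue datasets).
sentences = [
    ("book a flight to seoul", 0),      # intent 0: book_flight
    ("reserve a plane ticket", 0),
    ("what is the weather today", 1),   # intent 1: get_weather
    ("will it rain tomorrow", 1),
    ("play some jazz music", 2),        # intent 2: play_music
    ("put on my favorite song", 2),
]

# Bag-of-words encoding of each sentence.
vocab = sorted({w for s, _ in sentences for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(sentences), len(vocab)))
y = np.array([label for _, label in sentences])
for r, (s, _) in enumerate(sentences):
    for w in s.split():
        X[r, idx[w]] += 1.0

rng = np.random.default_rng(0)
dim, n_intents = 2, 3
W1 = rng.normal(scale=0.1, size=(len(vocab), dim))  # encoder: text -> 2-D coords
W2 = rng.normal(scale=0.1, size=(dim, n_intents))   # intent classifier head

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train encoder and classifier jointly with gradient descent on cross-entropy,
# so the 2-D layer is shaped by the intent-prediction task.
lr = 0.5
for _ in range(500):
    H = X @ W1                               # low-dimensional coordinates
    P = softmax(H @ W2)                      # predicted intent distribution
    G = (P - np.eye(n_intents)[y]) / len(y)  # cross-entropy gradient wrt logits
    W2 -= lr * (H.T @ G)
    W1 -= lr * (X.T @ (G @ W2.T))

coords = X @ W1   # each sentence is now a plottable 2-D point
preds = softmax(coords @ W2).argmax(axis=1)
```

After training, `coords` can be scattered directly (e.g., with matplotlib) without a separate projection step such as t-SNE or UMAP, which mirrors the paper's motivation of avoiding a lossy post-hoc reduction; mispredicted points in the plot would hint at structure or labeling errors in the data.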
Pages: 165-177
Page count: 12