Role of Context in Unsupervised Sentence Representation Learning: the Case of Dialog Act Tagging

Cited by: 0
Authors
Hronsky, Rastislav [1 ]
Keuleers, Emmanuel [2 ]
Affiliations
[1] Jheronimus Acad Data Sci, St Janssingel 92, NL-5211 DA 's-Hertogenbosch, Netherlands
[2] Tilburg Univ, Warandelaan 2, NL-5037 AB Tilburg, Netherlands
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unsupervised learning of word representations involves capturing the contextual information surrounding word occurrences, which can be grounded in the observation that word form is largely disconnected from word meaning. While there are fewer reasons to believe that the same holds for sentences, learning through context has been carried over to learning representations of word sequences. However, this line of work pays little to no attention to the role of context in inferring sentence representations. In this article, we present a dialog act tag probing task designed to explicitly compare content- and context-oriented sentence representations inferred from utterances of telephone conversations (SwDA). Our results suggest that context-based sentence representations offer no clear benefit over content-based ones, whereas increasing the dimensionality of the sentence vectors yields a very clear benefit in nearly all approaches.
Pages: 8784-8792
Page count: 9
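As a rough illustration of the probing setup described in the abstract, the sketch below trains a linear probe that predicts dialog act tags from fixed sentence vectors. This is a minimal sketch, not the authors' exact pipeline: the arrays X and y are random placeholders standing in for precomputed SwDA utterance embeddings (content- or context-based) and their dialog act labels, and the 42-class tag count is an assumption based on the commonly used SwDA tag set.

# Minimal sketch of a dialog-act probing task (hypothetical setup).
# Placeholder data stands in for precomputed sentence vectors and SwDA tags.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_utterances, dim, n_tags = 1000, 300, 42   # 42: assumed SwDA tag set size

X = rng.normal(size=(n_utterances, dim))        # placeholder sentence vectors
y = rng.integers(0, n_tags, size=n_utterances)  # placeholder dialog act tags

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Linear probe: measures how much dialog-act information is linearly
# decodable from the fixed sentence representations.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_tr, y_tr)
print("probe accuracy:", accuracy_score(y_te, probe.predict(X_te)))

In the comparison the paper describes, the same probe would be fit separately on content-based and context-based sentence vectors of varying dimensionality, and probing accuracy on held-out utterances would serve as the point of comparison.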