Language Models as Context-sensitive Word Search Engines

Cited by: 0
Authors
Wiegmann, Matti [1]
Voelske, Michael [1]
Stein, Benno [1]
Potthast, Martin [2]
Affiliations
[1] Bauhaus Univ Weimar, Weimar, Germany
[2] Univ Leipzig, Leipzig, Germany
Source
PROCEEDINGS OF THE FIRST WORKSHOP ON INTELLIGENT AND INTERACTIVE WRITING ASSISTANTS (IN2WRITING 2022) | 2022
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Context-sensitive word search engines are writing assistants that support word choice, phrasing, and idiomatic language use by indexing large-scale n-gram collections and implementing a wildcard search. However, search results become unreliable with increasing context size (e.g., n >= 5), when observations become sparse. This paper proposes two strategies for word search with larger n, based on masked and conditional language modeling. We build such search engines using BERT and BART and compare their capabilities in answering English context queries with those of the n-gram-based word search engine Netspeak. Our proposed strategies score within 5 percentage points MRR of n-gram collections while answering up to 5 times as many queries.
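
The wildcard search and the MRR measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or Netspeak's API: the toy count table `NGRAM_COUNTS`, the `?` wildcard token, and the function names `wildcard_search` and `mean_reciprocal_rank` are assumptions made for illustration, and scoring an unanswered query as 0 is one possible convention, not necessarily the one used in the paper.

```python
from collections import Counter

# Hypothetical toy n-gram counts; a real engine indexes a web-scale collection.
NGRAM_COUNTS = Counter({
    ("a", "great", "deal", "of"): 120,
    ("a", "good", "deal", "of"): 80,
    ("a", "fair", "deal", "of"): 5,
    ("a", "great", "deal", "with"): 2,
})


def wildcard_search(query, counts=NGRAM_COUNTS):
    """Rank the words that fill the single '?' slot in `query` by n-gram frequency."""
    tokens = query.split()
    slot = tokens.index("?")
    hits = Counter()
    for ngram, freq in counts.items():
        if len(ngram) != len(tokens):
            continue
        # Every non-wildcard token must match the n-gram at the same position.
        if all(q == "?" or q == w for q, w in zip(tokens, ngram)):
            hits[ngram[slot]] += freq
    return [word for word, _ in hits.most_common()]


def mean_reciprocal_rank(rankings, targets):
    """Average of 1/rank of the target word; a missing target contributes 0 (assumption)."""
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


if __name__ == "__main__":
    print(wildcard_search("a ? deal of"))
    print(wildcard_search("the ? deal of"))  # unseen context: no results
```

The second query shows the sparsity problem in miniature: once the context does not occur in the collection, the count-based search returns nothing, which is the gap the language-model strategies are meant to close.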
Pages: 39-45
Page count: 7