A BERT-based Idiom Detection Model

Cited by: 0
Authors
Gamage, Gihan [1]
De Silva, Daswin [1]
Adikari, Achini [1]
Alahakoon, Damminda [1]
Affiliation
[1] La Trobe Univ, Ctr Data Analyt & Cognit CDAC, Melbourne, Vic, Australia
Source
2022 15TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI) | 2022
Keywords
Idioms; Natural Language Processing; BERT
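DOI
Not available
Chinese Library Classification number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract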
Idioms are figures of speech that contradict the principle of compositionality. This property can mislead Natural Language Processing (NLP) techniques, which mostly focus on the literal meaning of terms. In this paper, we propose a novel idiom detection model that distinguishes between literal and idiomatic expressions. It uses a token classification approach to fine-tune BERT (Bidirectional Encoder Representations from Transformers). It is empirically evaluated on four idiom datasets, yielding an accuracy of more than 0.94. This model adds to the robustness and diversity of NLP techniques available to process and understand growing volumes of free-form text and speech. Furthermore, the social value of this model lies in enabling non-native speakers to comprehend the nuances of a foreign language.
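The token classification framing mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the label scheme, so the BIO-style tags (`O`, `B-IDIOM`, `I-IDIOM`), the helper names `span_to_labels` and `token_accuracy`, and the example sentences are all assumptions made for illustration. The sketch covers only how per-token labels and a token-level accuracy could be formed; it does not load or fine-tune BERT.

```python
from typing import List, Optional, Tuple

# Hypothetical label scheme (an assumption, not taken from the paper):
# "O" marks a literal token, "B-IDIOM" the first token of an idiomatic
# span, and "I-IDIOM" each following token of that span.


def span_to_labels(tokens: List[str],
                   span: Optional[Tuple[int, int]]) -> List[str]:
    """Turn an annotated idiom span (start, end), end exclusive, into
    one label per token. A span of None means the sentence is literal."""
    labels = ["O"] * len(tokens)
    if span is None:
        return labels
    start, end = span
    if not (0 <= start < end <= len(tokens)):
        raise ValueError("span is outside the sentence")
    labels[start] = "B-IDIOM"
    for i in range(start + 1, end):
        labels[i] = "I-IDIOM"
    return labels


def token_accuracy(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Fraction of tokens whose predicted label equals the gold label."""
    correct = 0
    total = 0
    for g_seq, p_seq in zip(gold, pred):
        if len(g_seq) != len(p_seq):
            raise ValueError("gold and predicted sequences differ in length")
        correct += sum(g == p for g, p in zip(g_seq, p_seq))
        total += len(g_seq)
    return correct / total if total else 0.0


# The same words can be idiomatic in one context and literal in another,
# so the labels come from annotation, not from string matching.
idiomatic = ["He", "kicked", "the", "bucket", "last", "year"]
literal = ["He", "kicked", "the", "bucket", "across", "the", "yard"]

gold = [span_to_labels(idiomatic, (1, 4)), span_to_labels(literal, None)]
print(gold[0])
print(gold[1])
```

In an actual fine-tuning setup, word-level labels like these would still need to be aligned with BERT's subword tokens before training. That alignment step is omitted here.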
Pages: 5