Transformers for Tabular Data Representation: A Survey of Models and Applications

Cited by: 27
Authors
Badaro, Gilbert [1]
Saeed, Mohammed [1]
Papotti, Paolo [1]
Affiliations
[1] EURECOM, Biot, France
Keywords
Computational linguistics - Learning algorithms - Natural language processing systems
DOI
10.1162/tacl_a_00544
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this article, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions.
Pages: 227-249
Page count: 23