HYTREL: Hypergraph-enhanced Tabular Data Representation Learning

Cited by: 0
Authors
Chen, Pei [1,2]
Sarkar, Soumajyoti [2 ]
Lausen, Leonard [2 ]
Srinivasan, Balasubramaniam [2 ]
Zha, Sheng [2 ]
Huang, Ruihong [1 ]
Karypis, George [2 ]
Affiliations
[1] Texas A&M Univ, College Stn, TX 77843 USA
[2] Amazon Web Serv, Seattle, WA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Language models pretrained on large collections of tabular data have demonstrated their effectiveness in several downstream tasks. However, many of these models do not account for the row/column permutation invariances, hierarchical structure, and other structural properties of tabular data. To alleviate these limitations, we propose HYTREL, a tabular language model that captures the permutation invariances and three more structural properties of tabular data by using hypergraphs, where the table cells make up the nodes, and the cells occurring jointly in each row, each column, and the entire table form three different types of hyperedges. We show that HYTREL is maximally invariant under certain conditions for tabular data, i.e., two tables obtain the same representations via HYTREL if and only if the two tables are identical up to permutation. Our empirical results demonstrate that HYTREL consistently outperforms competitive baselines on four downstream tasks with minimal pretraining, illustrating the advantages of incorporating the inductive biases associated with tabular data into the representations. Finally, our qualitative analyses showcase that HYTREL can assimilate the table structures to generate robust representations for the cells, rows, columns, and the entire table.
Pages: 21
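
The hypergraph construction the abstract describes maps directly onto a simple data structure: every cell becomes a node, and three types of hyperedges group the cells of each row, of each column, and of the entire table. The following Python sketch is illustrative only; the function name table_to_hypergraph and the dictionary-based representation are assumptions, not code from the paper.

# Hypothetical sketch of the hypergraph construction described in the
# abstract; names and representation are illustrative, not the paper's.
def table_to_hypergraph(table):
    """table: a list of rows, each row a list of cell values.

    Returns (nodes, hyperedges), where nodes maps a node id to its cell
    value and hyperedges maps (edge_type, index) to the set of node ids
    that the hyperedge connects.
    """
    nodes = {}       # node id -> cell value
    hyperedges = {}  # (edge_type, index) -> set of node ids

    n_rows = len(table)
    n_cols = len(table[0]) if table else 0

    for i in range(n_rows):
        for j in range(n_cols):
            node_id = i * n_cols + j
            nodes[node_id] = table[i][j]
            # Row hyperedge: all cells that share row i.
            hyperedges.setdefault(("row", i), set()).add(node_id)
            # Column hyperedge: all cells that share column j.
            hyperedges.setdefault(("col", j), set()).add(node_id)
            # Table hyperedge: every cell in the table.
            hyperedges.setdefault(("table", 0), set()).add(node_id)

    return nodes, hyperedges

# A 2x2 table yields 4 nodes and 2 row + 2 column + 1 table = 5 hyperedges.
nodes, edges = table_to_hypergraph([["Alice", "30"], ["Bob", "25"]])
assert len(nodes) == 4 and len(edges) == 5

Because each hyperedge is a set of cells, permuting the rows or columns of the input table relabels node ids but leaves the set structure unchanged, which is the permutation invariance the abstract refers to.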