Cross-Lingual Transfer Learning for Statistical Type Inference

Cited by: 0
Authors
Li, Zhiming [1 ]
Xie, Xiaofei [2 ]
Li, Haoliang [3 ]
Xu, Zhengzi [1 ]
Li, Yi [1 ]
Liu, Yang [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Singapore Management Univ, Singapore, Singapore
[3] City Univ Hong Kong, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2022 | 2022
Funding
National Research Foundation, Singapore
Keywords
Deep Learning; Transfer Learning; Type Inference;
DOI
10.1145/3533767.3534411
Chinese Library Classification
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Hitherto, statistical type inference systems have relied entirely on supervised learning approaches, which require laborious manual effort to collect and label large amounts of data. Most Turing-complete imperative languages share similar control- and data-flow structures, which makes it possible to transfer knowledge learned from one language to another. In this paper, we propose Plato, a cross-lingual transfer learning framework for statistical type inference, which leverages prior knowledge learned from the labeled dataset of one language and transfers it to others, e.g., Python to JavaScript, Java to JavaScript, etc. Plato is powered by a novel kernelized attention mechanism that constrains the attention scope of the backbone Transformer model so that the model is forced to base its predictions on features commonly shared among languages. In addition, we propose a syntax enhancement that augments learning on the feature overlap among language domains. Furthermore, Plato can also improve conventional supervised type inference by introducing cross-lingual augmentation, which enables the model to learn more general features across multiple languages. We evaluated Plato under two settings: 1) in the cross-domain scenario, where the target-language data is unlabeled or only partially labeled, Plato outperforms state-of-the-art domain transfer techniques by a large margin, e.g., it improves the Python-to-TypeScript baseline by +14.6%@EM and +18.6%@weighted-F1; and 2) in the conventional monolingual supervised scenario, Plato improves the Python baseline by +4.10%@EM and +1.90%@weighted-F1 with the introduction of cross-lingual augmentation.
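The central mechanism named in the abstract, a kernelized attention that narrows the Transformer's attention scope toward features shared across languages, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical instantiation assuming an RBF (Gaussian) kernel over query/key distances in place of the usual scaled dot-product; the class name KernelizedSelfAttention, the bandwidth parameter, and the kernel choice are illustrative assumptions, not the paper's actual formulation (see the DOI above for that).

```python
# Minimal sketch of a kernelized self-attention layer (illustrative only).
# ASSUMPTION: an RBF kernel over query/key distances is one plausible way to
# constrain attention scope; Plato's exact kernel is defined in the paper.
import torch
import torch.nn as nn

class KernelizedSelfAttention(nn.Module):
    def __init__(self, d_model: int, bandwidth: float = 1.0):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.bandwidth = bandwidth  # controls how sharply attention is localized

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Pairwise squared distances between queries and keys: (batch, seq, seq).
        dist = torch.cdist(q, k, p=2).pow(2)
        # RBF kernel: similar tokens receive high weight, dissimilar ones decay
        # smoothly toward zero, restricting the effective attention scope.
        scores = torch.exp(-dist / (2 * self.bandwidth ** 2))
        weights = scores / scores.sum(dim=-1, keepdim=True)
        return weights @ v

# Usage sketch:
# attn = KernelizedSelfAttention(d_model=256)
# out = attn(torch.randn(2, 64, 256))  # (batch=2, seq_len=64, d_model=256)
```

Intuitively, shrinking the bandwidth narrows the attention scope, biasing the model toward local structure such as the shared control- and data-flow patterns the abstract describes, rather than language-specific surface tokens.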
Pages: 239-250
Page count: 12