Cross-Lingual Transfer Learning for Statistical Type Inference

Cited: 0
Authors
Li, Zhiming [1 ]
Xie, Xiaofei [2 ]
Li, Haoliang [3 ]
Xu, Zhengzi [1 ]
Li, Yi [1 ]
Liu, Yang [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Singapore Management Univ, Singapore, Singapore
[3] City Univ Hong Kong, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2022 | 2022
Funding
National Research Foundation of Singapore;
Keywords
Deep Learning; Transfer Learning; Type Inference;
DOI
10.1145/3533767.3534411
CLC classification
TP31 [Computer Software];
Discipline codes
081202; 0835;
Abstract
Hitherto, statistical type inference systems have relied entirely on supervised learning approaches, which require laborious manual effort to collect and label large amounts of data. Most Turing-complete imperative languages share similar control- and data-flow structures, which makes it possible to transfer knowledge learned from one language to another. In this paper, we propose a cross-lingual transfer learning framework, Plato, for statistical type inference, which allows us to leverage prior knowledge learned from the labeled dataset of one language and transfer it to others, e.g., Python to JavaScript, Java to JavaScript, etc. Plato is powered by a novel kernelized attention mechanism that constrains the attention scope of the backbone Transformer model, so that the model is forced to base its predictions on features commonly shared among languages. In addition, we propose a syntax enhancement that augments learning on the feature overlap among language domains. Furthermore, Plato can also improve the performance of conventional supervised type inference by introducing cross-language augmentation, which enables the model to learn more general features across multiple languages. We evaluated Plato in two settings: 1) in the cross-domain scenario, where the target language data is unlabeled or only partially labeled, the results show that Plato outperforms state-of-the-art domain transfer techniques by a large margin, e.g., it improves the Python-to-TypeScript baseline by +14.6%@EM, +18.6%@weighted-F1; and 2) in the conventional monolingual supervised scenario, Plato improves the Python baseline by +4.10%@EM, +1.90%@weighted-F1 with the introduction of cross-lingual augmentation.
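The general idea behind the kernelized attention described in the abstract can be illustrated with a minimal masked-attention sketch. This is an assumption about the general form (restricting the softmax to an allowed scope), not Plato's exact kernel; the function name `kernelized_attention` and the toy "shared-feature" mask are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; exp(-inf) contributes exactly 0."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kernelized_attention(Q, K, V, mask):
    """Scaled dot-product attention with a binary scope mask.

    Scores at positions where mask == 0 are set to -inf before the
    softmax, so each query can only attend to tokens flagged as
    language-shared features (e.g. common control-/data-flow tokens).
    Returns the attended values and the attention weights.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask.astype(bool), scores, -np.inf)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy example: 4 tokens with 8-dim embeddings; only tokens 0 and 2
# are treated as shared between the source and target language, so
# every query is restricted to attending to those two positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
mask = np.tile(np.array([1, 0, 1, 0]), (4, 1))
out, w = kernelized_attention(Q, K, V, mask)
```

Masked-out positions receive exactly zero attention weight, so the output is a convex combination of the shared tokens' value vectors only.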
Pages: 239 - 250
Page count: 12
Related papers
50 records
  • [21] Combining Cross-lingual and Cross-task Supervision for Zero-Shot Learning
    Pikuliak, Matus
    Simko, Marian
    TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 162 - 170
  • [22] End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning
    Chen, Yuan-Jui
    Tu, Tao
    Yeh, Cheng-chieh
    Lee, Hung-yi
    INTERSPEECH 2019, 2019, : 2075 - 2079
  • [23] Detecting Cyber Threats in Non-English Dark Net Markets: A Cross-Lingual Transfer Learning Approach
    Ebrahimi, Mohammadreza
    Surdeanu, Mihai
    Samtani, Sagar
    Chen, Hsinchun
    2018 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2018, : 85 - 90
  • [24] An analysis on language transfer of pre-trained language model with cross-lingual post-training
    Son, Suhyune
    Park, Chanjun
    Lee, Jungseob
    Shim, Midan
    Lee, Chanhee
    Jang, Yoonna
    Seo, Jaehyung
    Lim, Jungwoo
    Lim, Heuiseok
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 267
  • [25] A Cross-Modal and Cross-lingual Study of Iconicity in Language: Insights From Deep Learning
    de Varda, Andrea Gregor
    Strapparava, Carlo
    COGNITIVE SCIENCE, 2022, 46 (06)
  • [26] Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation
    Byambadorj, Zolzaya
    Nishimura, Ryota
    Ayush, Altangerel
    Ohta, Kengo
    Kitaoka, Norihide
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [27] Cross-lingual deep learning model for gender-based emotion detection
    Bhattacharya, Sudipta
    Mishra, Brojo Kishore
    Borah, Samarjeet
    Das, Nabanita
    Dey, Nilanjan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09) : 25969 - 26007
  • [29] Cross-Lingual Few-Shot Hate Speech and Offensive Language Detection Using Meta Learning
    Mozafari, Marzieh
    Farahbakhsh, Reza
    Crespi, Noel
    IEEE ACCESS, 2022, 10 : 14880 - 14896
  • [30] A comparative study of cross-lingual sentiment analysis
    Priban, Pavel
    Smid, Jakub
    Steinberger, Josef
    Mistera, Adam
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 247