Cross-lingual distillation for domain knowledge transfer with sentence transformers

Cited by: 0
Authors
Piperno, Ruben [1 ,2 ]
Bacco, Luca [1 ,3 ]
Dell'Orletta, Felice [1 ]
Merone, Mario [2 ]
Pecchia, Leandro [2 ,4 ]
Affiliations
[1] Inst Computat Linguist Antonio Zampolli, Natl Res Council, ItaliaNLP Lab, Via Giuseppe Moruzzi, 1, I-56124 Pisa, Italy
[2] Univ Campus Biomed Roma, Dept Engn, Res Unit Intelligent Technol Hlth & Wellbeing, Via Alvaro Portillo 21, I-00128 Rome, Italy
[3] Univ Campus Biomed Roma, Dept Engn, Res Unit Comp Syst & Bioinformat, Via Alvaro Portillo 21, I-00128 Rome, Italy
[4] Fdn Policlin Univ Campus Biomed Roma, Via Alvaro del Portillo 200, I-00128 Rome, Italy
Keywords
Cross-lingual learning; Knowledge distillation; Sentence transformers; Biomedical domain; Domain adaptation;
DOI
10.1016/j.knosys.2025.113079
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recent advancements in Natural Language Processing (NLP) have substantially enhanced language understanding. However, non-English languages, especially in specialized and low-resource domains such as biomedicine, remain largely underrepresented. Bridging this gap is essential for promoting inclusivity and expanding the global applicability of NLP technologies. This study presents a cross-lingual knowledge distillation framework that uses sentence transformers to improve domain-specific NLP capabilities in non-English languages, focusing on biomedical text classification tasks. By aligning sentence embeddings between a teacher model trained on English biomedical corpora and a multilingual student model, the proposed method transfers both domain-specific and task-specific knowledge. This alignment allows the student model to process and adapt to biomedical texts in Spanish, French, and German, particularly in low-resource settings with limited tuning data. Extensive experiments with domain-adapted models such as BioBERT and with multilingual BERT, trained on machine-translated text pairs, demonstrate substantial performance improvements on downstream biomedical NLP tasks. The framework proves especially effective when training data are scarce. These results highlight the scalability and effectiveness of the approach, supporting the development of robust multilingual models tailored to the biomedical domain and advancing the global accessibility and impact of biomedical NLP applications.
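The alignment described in the abstract, regressing a multilingual student's sentence embeddings onto an English-domain teacher's embeddings over machine-translated pairs, can be sketched with the sentence-transformers library. The snippet below is a minimal illustration under assumptions, not the authors' exact setup: the checkpoints (dmis-lab/biobert-base-cased-v1.1 as teacher, bert-base-multilingual-cased as student), the toy sentence pairs, and all hyperparameters are placeholders. The recipe follows the standard multilingual-distillation pattern, in which both the English sentence and its translation are mapped to the teacher's embedding of the English sentence with an MSE loss.

# Minimal cross-lingual distillation sketch with sentence-transformers.
# Checkpoints, data, and hyperparameters are illustrative assumptions.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample

# Teacher: an English biomedical encoder (BioBERT checkpoint assumed),
# mean-pooled into a sentence transformer.
teacher_bert = models.Transformer("dmis-lab/biobert-base-cased-v1.1")
teacher = SentenceTransformer(
    modules=[teacher_bert, models.Pooling(teacher_bert.get_word_embedding_dimension())]
)

# Student: multilingual BERT with the same pooling. Both models emit
# 768-dimensional embeddings, so the two spaces can be aligned directly.
student_bert = models.Transformer("bert-base-multilingual-cased")
student = SentenceTransformer(
    modules=[student_bert, models.Pooling(student_bert.get_word_embedding_dimension())]
)

# Toy parallel (English, machine-translated) pairs; the paper works with
# biomedical text pairs in Spanish, French, and German.
pairs = [
    ("The patient shows elevated blood glucose.",
     "El paciente muestra glucosa elevada en sangre."),
    ("The tumor responded to the treatment.",
     "La tumeur a répondu au traitement."),
]

# The teacher's embedding of the English sentence is the regression target
# for BOTH the English sentence and its translation.
train_examples = []
for en, translated in pairs:
    target = teacher.encode(en)
    train_examples.append(InputExample(texts=[en], label=target))
    train_examples.append(InputExample(texts=[translated], label=target))

loader = DataLoader(train_examples, shuffle=True, batch_size=4)
mse = losses.MSELoss(model=student)  # MSE between student embeddings and teacher targets

# Legacy fit API of sentence-transformers; newer releases also offer a Trainer-based API.
student.fit(train_objectives=[(loader, mse)], epochs=1, warmup_steps=10)

Because both sides of each pair share one target vector, the student's English and target-language representations are pulled onto the teacher's domain-specific embedding space, which is what allows domain and task knowledge to transfer even when little in-language tuning data is available.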
Pages: 10
Related Papers
50 records in total
• [41] Cross-lingual Search for e-Commerce based on Query Translatability and Mixed-Domain Fine-Tuning. Perez-Martin, Jesus; Gomez-Robles, Jorge; Gutierrez-Fandino, Asier; Adsul, Pankaj; Rajanala, Sravanthi; Lezcano, Leonardo. COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023: 892-898.
• [42] MCLS: A Large-Scale Multimodal Cross-Lingual Summarization Dataset. Shi, Xiaorui. CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232: 273-288.
• [43] Knowledge Distillation for Semi-supervised Domain Adaptation. Orbes-Arteaga, Mauricio; Cardoso, Jorge; Sorensen, Lauge; Igel, Christian; Ourselin, Sebastien; Modat, Marc; Nielsen, Mads; Pai, Akshay. OR 2.0 CONTEXT-AWARE OPERATING THEATERS AND MACHINE LEARNING IN CLINICAL NEUROIMAGING, 2019, 11796: 68-76.
• [44] Spirit Distillation: A Model Compression Method with Multi-domain Knowledge Transfer. Wu, Zhiyuan; Jiang, Yu; Zhao, Minghao; Cui, Chupeng; Yang, Zongmin; Xue, Xinhui; Qi, Hong. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2021, 12815: 553-565.
• [45] Combining Cross-lingual and Cross-task Supervision for Zero-Shot Learning. Pikuliak, Matus; Simko, Marian. TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284: 162-170.
• [46] Latent domain knowledge distillation for nighttime semantic segmentation. Liu, Yunan; Wang, Simiao; Wang, Chunpeng; Lu, Mingyu; Sang, Yu. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132.
• [47] Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation. Nguyen-Meidine, Le Thanh; Granger, Eric; Kiran, Madhu; Dolz, Jose; Blais-Morin, Louis-Antoine. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020.
• [48] COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval. Li, Xirong; Xu, Chaoxi; Wang, Xiaoxu; Lan, Weiyu; Jia, Zhengxiong; Yang, Gang; Xu, Jieping. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21(09): 2347-2360.
• [49] Cross-lingual Capsule Network for Hate Speech Detection in Social Media. Jiang, Aiqi; Zubiaga, Arkaitz. PROCEEDINGS OF THE 32ND ACM CONFERENCE ON HYPERTEXT AND SOCIAL MEDIA (HT '21), 2021: 217-223.
• [50] A lightweight residual network based on improved knowledge transfer and quantized distillation for cross-domain fault diagnosis of rolling bearings. Guo, Wei; Li, Xiang; Shen, Ziqian. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 245.