Universal embedding for pre-trained models and data bench

Cited by: 0
Authors
Cho, Namkyeong [1]
Cho, Taewon [2]
Shin, Jaesun [2]
Jeon, Eunjoo [2]
Lee, Taehee [2]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Ctr Math Machine Learning & its Applicat CM2LA, Dept Math, Pohang 37673, Gyeongbuk, South Korea
[2] Samsung SDS, 125 Olymp Ro 35 Gil, Seoul 05510, South Korea
Funding
National Research Foundation of Singapore
Keywords
Transfer learning; Pretrained models; Graph neural networks
DOI
10.1016/j.neucom.2024.129107
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The transformer architecture has significantly improved performance on a wide range of natural language processing (NLP) tasks. A major advantage of transformer-based models is that an extra layer can be added to a pre-trained model (PTM) and the result fine-tuned, rather than developing a separate architecture for each task. This approach has delivered promising performance on NLP tasks. Selecting an appropriate PTM from a model zoo, such as Hugging Face, therefore becomes a crucial task. Despite its importance, PTM selection still requires further investigation. The main challenge in PTM selection for NLP tasks is the lack of a publicly available benchmark for evaluating model performance on each task and dataset. To address this challenge, we introduce the first public data benchmark for evaluating the performance of popular transformer-based models on a diverse range of NLP tasks. Furthermore, we propose graph representations of transformer-based models with node features that represent the weight matrix of each layer. Empirical results demonstrate that our proposed graph neural network (GNN) model outperforms existing PTM selection methods.
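The graph representation mentioned in the abstract can be pictured with a small sketch. The abstract does not specify the paper's actual graph construction or node features, so everything here is an assumption made for illustration: the chain-shaped graph, the choice of summary statistics, and the helper names `weight_features` and `model_to_graph` are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def weight_features(w: Matrix) -> List[float]:
    """Summarize one weight matrix as [rows, cols, mean, std, Frobenius norm].

    These statistics are an illustrative assumption, not the paper's features.
    """
    flat = [x for row in w for x in row]
    n = len(flat)
    mean = sum(flat) / n
    var = sum((x - mean) ** 2 for x in flat) / n
    frob = math.sqrt(sum(x * x for x in flat))
    return [float(len(w)), float(len(w[0])), mean, math.sqrt(var), frob]


def model_to_graph(
    layers: List[Tuple[str, Matrix]],
) -> Tuple[Dict[str, List[float]], List[Tuple[str, str]]]:
    """Turn an ordered list of (layer name, weight matrix) into a chain graph.

    Each layer becomes a node whose feature vector summarizes its weights;
    each consecutive pair of layers is joined by a directed edge.
    """
    nodes = {name: weight_features(w) for name, w in layers}
    names = [name for name, _ in layers]
    edges = list(zip(names, names[1:]))
    return nodes, edges


# Toy stand-in for a pre-trained model: three layers with 2x2 weights.
toy_layers = [
    ("embed", [[1.0, 0.0], [0.0, 1.0]]),
    ("attn", [[0.5, -0.5], [0.5, 0.5]]),
    ("ffn", [[2.0, 0.0], [0.0, 2.0]]),
]
nodes, edges = model_to_graph(toy_layers)
```

A graph neural network could then consume `nodes` and `edges` to produce one embedding per model, which is the kind of input a PTM selection method would rank; that downstream step is not sketched here.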
Pages: 21