Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 29
Authors
Kolides, Adam [1 ]
Nawaz, Alyna [1 ]
Rathor, Anshu [1 ]
Beeman, Denzel [1 ]
Hashmi, Muzammil [1 ]
Fatima, Sana [1 ]
Berdik, David [1 ]
Al-Ayyoub, Mahmoud [2 ]
Jararweh, Yaser [1 ]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning;
DOI
10.1016/j.simpat.2023.102754
Chinese Library Classification (CLC)
TP39 [Computer applications];
Subject classification
081203; 0835;
Abstract
With the emergence of foundation models (FMs), which are trained on large amounts of data at scale and are adaptable to a wide range of downstream applications, AI is undergoing a paradigm shift. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP now serve as the foundation for new applications ranging from computer vision to protein sequence analysis, and from speech recognition to code generation. Earlier models typically had to be trained from scratch for each new task. The capacity to experiment with, examine, and understand the capabilities and potential of next-generation FMs is critical to undertaking this research and guiding its path. Nevertheless, these models are largely inaccessible: the resources required to train them are highly concentrated in industry, and even the assets (data, code) needed to replicate their training are frequently withheld because of their commercial value. At the moment, only large technology companies such as OpenAI, Google, Facebook, and Baidu can afford to build FMs. In this research, we analyze and examine the main capabilities, key implementations, technological fundamentals, and potential societal consequences of these models. Despite the widely publicized use of FMs, we still lack a comprehensive understanding of how they operate, why they sometimes underperform, and what they are even capable of, owing to their emergent qualities. To address these problems, we believe that much of the critical research on FMs will require extensive multidisciplinary collaboration, given their fundamentally sociotechnical nature. Throughout the investigation, we will also have to deal with the problem of misrepresentation created by these systems. If FMs live up to their promise, AI could see far wider commercial use. As researchers studying the ramifications for society, we believe FMs will lead the way in massive changes. They are closely managed for the time being, so we should have time to understand their implications before they become a major concern.
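The Transformer models the abstract names (BERT, GPT-3, T5) are built on self-attention, one of the paper's listed keywords. As an illustration only (the shapes, random weights, and function name below are assumptions, not taken from the paper), scaled dot-product self-attention can be sketched in a few lines of NumPy:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence X of shape (n_tokens, d_model)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v       # project tokens to queries, keys, values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # pairwise token affinities, scaled
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys: rows sum to 1
    return weights @ V                          # each output token mixes all value vectors

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))                 # 4 tokens, model width 8 (illustrative sizes)
W_q, W_k, W_v = (rng.standard_normal((8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)          # shape (4, 8)
```

Note that without positional encodings this operation is permutation-equivariant: reordering the input tokens reorders the outputs correspondingly, which is why Transformer-based FMs add positional information separately.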
Pages: 18
Related papers
50 records in total
  • [31] Compressing Pre-trained Models of Code into 3 MB
    Shi, Jieke
    Yang, Zhou
    Xu, Bowen
    Kang, Hong Jin
    Lo, David
    PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022, 2022,
  • [32] Transfer learning with pre-trained conditional generative models
    Yamaguchi, Shin'ya
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Chijiwa, Daiki
    Kashima, Hisashi
    MACHINE LEARNING, 2025, 114 (04)
  • [33] A Survey of Knowledge Enhanced Pre-Trained Language Models
    Hu, Linmei
    Liu, Zeyi
    Zhao, Ziwang
    Hou, Lei
    Nie, Liqiang
    Li, Juanzi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1413 - 1430
  • [34] A Comparative Study on Pre-Trained Models Based on BERT
    Zhang, Minghua
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 326 - 330
  • [35] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [36] Bibimbap : Pre-trained models ensemble for Domain Generalization
    Kang, Jinho
    Kim, Taero
    Kim, Yewon
    Oh, Changdae
    Jung, Jiyoung
    Chang, Rakwoo
    Song, Kyungwoo
    PATTERN RECOGNITION, 2024, 151
  • [37] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    ICT EXPRESS, 2024, 10 (04): : 871 - 890
  • [38] StaResGRU-CNN with CMedLMs: A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence
    Ni, Pin
    Li, Gangmin
    Hung, Patrick C. K.
    Chang, Victor
    APPLIED SOFT COMPUTING, 2021, 113
  • [39] TARGET SPEECH EXTRACTION WITH PRE-TRAINED SELF-SUPERVISED LEARNING MODELS
    Peng, Junyi
    Delcroix, Marc
    Ochiai, Tsubasa
    Plchot, Oldrich
    Araki, Shoko
    Cernocky, Jan
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024), 2024, : 10421 - 10425
  • [40] Discrimination Bias Detection Through Categorical Association in Pre-Trained Language Models
    Dusi, Michele
    Arici, Nicola
    Gerevini, Alfonso Emilio
    Putelli, Luca
    Serina, Ivan
    IEEE ACCESS, 2024, 12 : 162651 - 162667