Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 29
Authors
Kolides, Adam [1]
Nawaz, Alyna [1]
Rathor, Anshu [1]
Beeman, Denzel [1]
Hashmi, Muzammil [1]
Fatima, Sana [1]
Berdik, David [1]
Al-Ayyoub, Mahmoud [2]
Jararweh, Yaser [1]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA, USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning
DOI
10.1016/j.simpat.2023.102754
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
With the emergence of foundation models (FMs) that are trained on large amounts of data at scale and adaptable to a wide range of downstream applications, AI is experiencing a paradigm revolution. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP are now the foundation for new applications ranging from computer vision to protein sequence study and from speech recognition to coding. Earlier models had a reputation of starting from scratch with each new challenge. The capacity to experiment with, examine, and comprehend the capabilities and potentials of next-generation FMs is critical to undertaking this research and guiding its path. Nevertheless, these models are currently inaccessible as the resources required to train these models are highly concentrated in industry, and even the assets (data, code) required to replicate their training are frequently not released due to their demand in the real-time industry. At the moment, only large tech companies such as OpenAI, Google, Facebook, and Baidu can afford to construct FMs. We attempt to analyze and examine the main capabilities, key implementations, technological fundamentals, and socially constructed possible consequences of these models inside this research. Despite the expected widely publicized use of FMs, we still lack a comprehensive knowledge of how they operate, why they underperform, and what they are even capable of because of their emerging global qualities. To deal with these problems, we believe that much critical research on FMs would necessitate extensive multidisciplinary collaboration, given their essentially social and technical structure. Throughout the investigation, we will also have to deal with the problem of misrepresentation created by these systems. If FMs live up to their promise, AI might see far wider commercial use. As researchers studying the ramifications on society, we believe FMs will lead the way in massive changes. 
They are closely managed for the time being, so we should have time to comprehend their implications before they become a major concern.
Pages: 18