Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

Cited by: 29
Authors
Kolides, Adam [1 ]
Nawaz, Alyna [1 ]
Rathor, Anshu [1 ]
Beeman, Denzel [1 ]
Hashmi, Muzammil [1 ]
Fatima, Sana [1 ]
Berdik, David [1 ]
Al-Ayyoub, Mahmoud [2 ]
Jararweh, Yaser [1 ]
Affiliations
[1] Duquesne Univ, Pittsburgh, PA USA
[2] Jordan Univ Sci & Technol, Irbid, Jordan
Keywords
Pre-trained models; Self-supervised learning; Natural Language Processing; Computer vision; Image processing; Transformers; Machine learning models; Foundation models in robotics; Transfer learning; In-context learning; Self-attention; Fine-tuning;
DOI
10.1016/j.simpat.2023.102754
Chinese Library Classification (CLC)
TP39 [Applications of computers]
Discipline classification codes
081203; 0835
Abstract
With the emergence of foundation models (FMs), which are trained on large amounts of data at scale and adaptable to a wide range of downstream applications, AI is undergoing a paradigm shift. BERT, T5, ChatGPT, GPT-3, Codex, DALL-E, Whisper, and CLIP now serve as the foundation for new applications ranging from computer vision to protein sequence analysis and from speech recognition to coding. Earlier models typically had to be built from scratch for each new task. The capacity to experiment with, examine, and understand the capabilities and potential of next-generation FMs is critical to undertaking this research and guiding its path. Nevertheless, these models are currently inaccessible: the resources required to train them are highly concentrated in industry, and even the assets (data, code) required to replicate their training are frequently withheld because of their commercial demand. At the moment, only large tech companies such as OpenAI, Google, Facebook, and Baidu can afford to construct FMs. In this research, we analyze and examine the main capabilities, key implementations, technological fundamentals, and possible social consequences of these models. Despite the widely publicized use expected of FMs, we still lack a comprehensive understanding of how they operate, why they underperform, and what they are even capable of, owing to their emergent qualities. To deal with these problems, we believe that much critical research on FMs will necessitate extensive multidisciplinary collaboration, given their fundamentally sociotechnical structure. Throughout the investigation, we also have to deal with the problem of misrepresentation created by these systems. If FMs live up to their promise, AI might see far wider commercial use. As researchers studying the ramifications for society, we believe FMs will lead the way in massive changes. They are closely managed for the time being, so we should have time to understand their implications before they become a major concern.
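To make the pre-train-then-adapt workflow the abstract describes more concrete, the following minimal sketch fine-tunes a pre-trained checkpoint on a downstream classification task. It assumes the Hugging Face transformers and PyTorch packages, the bert-base-uncased checkpoint, and a two-example toy sentiment dataset, all chosen purely for illustration rather than taken from the paper.

```python
# Minimal sketch (not from the paper): adapting a pre-trained foundation
# model to a downstream task by fine-tuning. Assumes `transformers`,
# `torch`, and the `bert-base-uncased` checkpoint; the two-example
# sentiment dataset is a toy placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task head on top of the pre-trained encoder
)

texts = ["a delightful, well-paced film", "dull and far too long"]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps stand in for a real training loop
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())  # task predictions from the adapted model
```

The same checkpoint can be re-adapted to many other tasks by swapping the classification head and the downstream data, which is the adaptability the abstract attributes to foundation models.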
Pages: 18
Related papers (50 records in total)
  • [1] Wang, Haifeng; Li, Jiwei; Wu, Hua; Hovy, Eduard; Sun, Yu. Pre-Trained Language Models and Their Applications. Engineering, 2023, 25: 51-65.
  • [2] Han, Xu; Zhang, Zhengyan; Ding, Ning; Gu, Yuxian; Liu, Xiao; Huo, Yuqi; Qiu, Jiezhong; Yao, Yuan; Zhang, Ao; Zhang, Liang; Han, Wentao; Huang, Minlie; Jin, Qin; Lan, Yanyan; Liu, Yang; Liu, Zhiyuan; Lu, Zhiwu; Qiu, Xipeng; Song, Ruihua; Tang, Jie; Wen, Ji-Rong; Yuan, Jinhui; Zhao, Wayne Xin; Zhu, Jun. Pre-trained models: Past, present and future. AI Open, 2021, 2: 225-250.
  • [3] Myers, Devon; Mohawesh, Rami; Chellaboina, Venkata Ishwarya; Sathvik, Anantha Lakshmi; Venkatesh, Praveen; Ho, Yi-Hui; Henshaw, Hanna; Alhawawreh, Muna; Berdik, David; Jararweh, Yaser. Foundation and large language models: fundamentals, challenges, opportunities, and social impacts. Cluster Computing, 2024, 27(1): 1-26.
  • [4] Ma, Qianli; Liu, Zhen; Zheng, Zhenjing; Huang, Ziyang; Zhu, Siying; Yu, Zhongzhong; Kwok, James T. A Survey on Time-Series Pre-Trained Models. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(12): 7536-7555.
  • [5] Qiu, XiPeng; Sun, TianXiang; Xu, YiGe; Shao, YunFan; Dai, Ning; Huang, XuanJing. Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 2020, 63(10): 1872-1897.
  • [6] Cho, Namkyeong; Cho, Taewon; Shin, Jaesun; Jeon, Eunjoo; Lee, Taehee. Universal embedding for pre-trained models and data bench. Neurocomputing, 2025, 619.
  • [7] Zangari, Lorenzo; Greco, Candida Maria; Picca, Davide; Tagarelli, Andrea. A survey on moral foundation theory and pre-trained language models: current advances and challenges. AI & Society, 2025.
  • [8] Karkera, Nikitha; Acharya, Sathwik; Palaniappan, Sucheendra K. Leveraging pre-trained language models for mining microbiome-disease relationships. BMC Bioinformatics, 24.