Survey of Different Large Language Model Architectures: Trends, Benchmarks, and Challenges

Cited by: 1
Authors
Shao, Minghao [1]
Basit, Abdul [2]
Karri, Ramesh [1]
Shafique, Muhammad [2]
Affiliations
[1] NYU, Tandon Sch Engn, New York, NY 10012 USA
[2] New York Univ Abu Dhabi, Abu Dhabi Engn Div, Abu Dhabi, U Arab Emirates
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Surveys; Transformers; Benchmark testing; Encoding; Large language models; Adaptation models; Market research; Decoding; Training; Computational modeling; Large language models (LLMs); Transformer architecture; generative models; survey; multimodal learning; deep learning; natural language processing (NLP); GENERATIVE ADVERSARIAL NETWORKS;
DOI
10.1109/ACCESS.2024.3482107
CLC (Chinese Library Classification) number
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Large Language Models (LLMs) represent a class of deep learning models adept at understanding natural language and generating coherent responses to various prompts or queries. These models far exceed the complexity of conventional neural networks, often encompassing dozens of neural network layers and containing billions to trillions of parameters. They are typically trained on vast datasets, utilizing architectures based on transformer blocks. Present-day LLMs are multi-functional, capable of performing a range of tasks from text generation and language translation to question answering, as well as code generation and analysis. An advanced subset of these models, known as Multimodal Large Language Models (MLLMs), extends LLM capabilities to process and interpret multiple data modalities, including images, audio, and video. This enhancement empowers MLLMs with capabilities like video editing, image comprehension, and captioning for visual content. This survey provides a comprehensive overview of the recent advancements in LLMs. We begin by tracing the evolution of LLMs and subsequently delve into the advent and nuances of MLLMs. We analyze emerging state-of-the-art MLLMs, exploring their technical features, strengths, and limitations. Additionally, we present a comparative analysis of these models and discuss their challenges, potential limitations, and prospects for future development.
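As a purely illustrative sketch (not taken from the survey itself), the following PyTorch snippet shows the kind of transformer block the abstract refers to when it describes LLMs as "utilizing architectures based on transformer blocks." The class name, dimensions, and hyperparameters (DecoderBlock, d_model, n_heads, d_ff) are assumed for demonstration only; a decoder-only LLM stacks dozens of such blocks between a token-embedding layer and an output projection over the vocabulary.

    # Illustrative sketch only: a minimal pre-norm, decoder-style Transformer
    # block. All names and sizes below are assumptions for demonstration.
    import torch
    import torch.nn as nn

    class DecoderBlock(nn.Module):
        def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads,
                                              dropout=dropout, batch_first=True)
            self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                    nn.Linear(d_ff, d_model))
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)
            self.drop = nn.Dropout(dropout)

        def forward(self, x):
            # Causal mask: each position attends only to itself and earlier tokens.
            seq_len = x.size(1)
            causal = torch.triu(torch.ones(seq_len, seq_len, device=x.device),
                                diagonal=1).bool()
            h = self.norm1(x)
            attn_out, _ = self.attn(h, h, h, attn_mask=causal, need_weights=False)
            x = x + self.drop(attn_out)                 # residual connection
            x = x + self.drop(self.ff(self.norm2(x)))   # position-wise feed-forward
            return x

    # Usage: a toy batch of 2 sequences, 16 positions, 512-dimensional states.
    tokens = torch.randn(2, 16, 512)
    out = DecoderBlock()(tokens)   # output has the same shape as the input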
Pages: 188664-188706
Page count: 43