A Concise Review of Long Context in Large Language Models

Cited by: 0
Authors
Huang, Haitao [1 ]
Liang, Zijing [1 ]
Fang, Zirui [1 ]
Wang, Zhiyuan [1 ]
Chen, Mingxiu [1 ]
Hong, Yifan [1 ]
Liu, Ke [1 ]
Shang, Penghui [1 ]
Affiliations
[1] Zhiyuan Res Inst, Hangzhou 310000, Zhejiang, Peoples R China
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ALGORITHMS, SOFTWARE ENGINEERING, AND NETWORK SECURITY, ASENS 2024 | 2024
Keywords
Large language model; Long context; Self-attention; Retrieval augment
DOI
10.1145/3677182.3677282
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Thanks in part to the rise of high-performance computing systems and transformer models, natural language processing has advanced considerably, and a multitude of applications built on large language models continue to enhance people's cognitive abilities. Nevertheless, large language models still struggle with long context input. Many studies have proposed specific strategies to address the challenge of extended context, but as of yet no thorough summary of these studies exists. In this paper, we discuss the issues raised and the developments made in long context applications of large language models, and we attempt to suggest future directions for research and development.
Pages: 563-566
Number of pages: 4