A Concise Review of Long Context in Large Language Models

Cited by: 0
Authors
Huang, Haitao [1 ]
Liang, Zijing [1 ]
Fang, Zirui [1 ]
Wang, Zhiyuan [1 ]
Chen, Mingxiu [1 ]
Hong, Yifan [1 ]
Liu, Ke [1 ]
Shang, Penghui [1 ]
Affiliations
[1] Zhiyuan Res Inst, Hangzhou 310000, Zhejiang, Peoples R China
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ALGORITHMS, SOFTWARE ENGINEERING, AND NETWORK SECURITY, ASENS 2024 | 2024
Keywords
Large language model; Long context; Self-attention; Retrieval augment
DOI
10.1145/3677182.3677282
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Thanks in part to the rise of high-performance computing systems and transformer models, natural language processing has advanced considerably, and a multitude of applications built on large language models continue to enhance people's cognitive abilities. Nevertheless, large language models still struggle with long context input. Many studies have proposed specific strategies to address the challenge of extended context, but as of yet no thorough summary of these studies exists. In this paper, we discuss the issues raised and the developments made in long context applications of large language models, and we attempt to suggest future directions for research and development.
Pages: 563-566
Number of pages: 4