A Quantitative Analysis of State Space Model-Based Large Language Model: Study of Hungry Hungry Hippos

Cited by: 0

Authors:

Yoon, Dongho [1]

Kim, Taehun [1]

Lee, Jae W. [2]

Rhu, Minsoo [1]

Affiliations:

[1] Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, South Korea

[2] Seoul National University, Seoul 08826, South Korea

Source:

IEEE Computer Architecture Letters | 2024 / Vol. 23 / No. 2

Keywords:

Computational modeling; Convolution; Mathematical models; Graphics processing units; Memory management; Computational complexity; Vectors; GPU; h3; large language models; state space model

DOI:

10.1109/LCA.2024.3422492

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

As the need to process long contexts in large language models (LLMs) grows, attention-based LLMs face significant challenges due to their high computation and memory requirements. To overcome this challenge, several recent works seek to alleviate attention's system-level bottlenecks. An approach that has been gaining a lot of traction lately is state space models (SSMs), thanks to their ability to substantially reduce computational complexity and memory footprint. Despite the excitement around SSMs, an in-depth characterization and analysis of this important model architecture is lacking. In this paper, we delve into a representative SSM named Hungry Hungry Hippos (H3), examining its advantages as well as its current limitations. We also discuss future research directions on improving the efficiency of SSMs via hardware architectural support.


Pages: 154-157

Page count: 4
