22 entries in total
- [1] Ainslie J, 2023, Arxiv, DOI arXiv:2305.13245, 10.48550/arXiv.2305.13245
- [2] Bai YS, 2024, Arxiv, DOI arXiv:2308.14508
- [3] Beltagy I, 2020, Arxiv, DOI arXiv:2004.05150
- [4] bloc, 2023, Add NTK-Aware interpolation "by parts" correction
- [5] Chen SY, 2023, Arxiv, DOI arXiv:2306.15595
- [6] Chen YK, 2023, Arxiv, DOI arXiv:2309.12307, 10.48550/arXiv.2309.12307
- [7] Child R, 2019, Arxiv, DOI arXiv:1904.10509
- [8] Dai ZH, 2019, Arxiv, DOI arXiv:1901.02860
- [9] emozilla, 2023, Dynamically Scaled RoPE further increases performance of long context LLaMA with zero fine-tuning
- [10] Krishna K, 2023, Arxiv