Investigating Web Corpus Filtering Methods for Language Model Development in Japanese

Cited by: 0
Authors
Enomoto, Rintaro [1]
Tolmachev, Arseny [2]
Niitsuma, Takuto [3]
Kurita, Shuhei [4]
Kawahara, Daisuke [1, 4, 5]
Affiliations
[1] Waseda University, Japan
[2] Works Applications
[3] The Asahi Shimbun Company, Japan
[4] National Institute of Informatics, Japan
[5] LLMC, National Institute of Informatics, Japan
Source
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 | 2024, Vol. 4
Keywords
Compendex;
DOI
Not available
Abstract
Benchmarking
Pages: 154-160