Total: 152 references
[1] He K., Zhang X., Ren S., Sun J., Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification, Proceedings of the IEEE International Conference on Computer Vision, pp. 1026-1034, (2015)
[2] Shlegeris B., Roger F., Chan L., Language Models Seem to Be Much Better than Humans at Next-Token Prediction, (2022)
[3] Villalobos P., Sevilla J., Heim L., Besiroglu T., Hobbhahn M., Ho A., Will we run out of data? An analysis of the limits of scaling datasets in machine learning, (2022)
[4] Hendrycks D., Mazeika M., Woodside T., An Overview of Catastrophic AI Risks, (2023)
[5] Han X., Zhang Z., Ding N., Gu Y., Liu X., Huo Y., Qiu J., Yao Y., Zhang A., Zhang L., Et al., Pre-trained models: past, present and future, AI Open, 2, pp. 225-250, (2021)
[6] Papadimitriou I., Jurafsky D., Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models, (2020)
[7] Hochreiter S., Schmidhuber J., Long short-term memory, Neural Computation, 9, 8, pp. 1735-1780, (1997)
[8] Lu K., Grover A., Abbeel P., Mordatch I., Pretrained Transformers as Universal Computation Engines, (2021)
[9] Radford A., Wu J., Child R., Luan D., Amodei D., Sutskever I., Et al., Language models are unsupervised multitask learners, OpenAI Blog, 1, 8, (2019)
[10] Sinha K., Jia R., Hupkes D., Pineau J., Williams A., Kiela D., Masked language modeling and the distributional hypothesis: Order word matters pre-training for little, (2021)