Evaluation on ChatGPT for Chinese Language Understanding

Cited by: 7
Authors
Li, Linhan [1]
Zhang, Huaping [1]
Li, Chunjin [1]
You, Haowen [1]
Cui, Wenyao [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing, Peoples R China
Keywords
Language Model; ChatGPT; ChatBIT; Chinese Language Understanding; Artificial intelligence;
DOI
10.1162/dint_a_00232
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
ChatGPT has attracted extensive attention from academia and industry. This paper evaluates ChatGPT's Chinese language understanding capability on 6 tasks using 11 datasets. Experiments indicate that ChatGPT achieves competitive results in Chinese sentiment analysis, summarization, and reading comprehension, while it is prone to factual errors in closed-book QA. Further, on two more difficult Chinese understanding tasks, namely idiom fill-in-the-blank and cant understanding, we found that a simple chain-of-thought prompt can improve the accuracy of ChatGPT in complex reasoning. Based on these results, this paper further analyses the possible risks of using ChatGPT. Finally, we briefly describe the research and development progress of our ChatBIT.
Pages: 885-903
Number of pages: 19