Limitations and Benefits of the ChatGPT for Python Programmers and Its Tools for Evaluation

Cited by: 1
Authors
Arias, Ricardo [1 ]
Martinez, Grecia [1 ]
Caceres, Didier [1 ]
Garces, Eduardo [1 ]
Affiliation
[1] Univ Tecnolog Peru, Natalio Sanchez St 125, Lima, Peru
Source
CYBERNETICS AND CONTROL THEORY IN SYSTEMS, VOL 2, CSOC 2024 | 2024 / Vol. 1119
Keywords
Artificial intelligence; ChatGPT; software performance; quality assurance
DOI
10.1007/978-3-031-70300-3_12
CLC Number
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
The artificial intelligence system ChatGPT exhibits an outstanding ability to generate code automatically; here the scope is programming in Python. This paper evaluates the quality and the limitations of code generated by ChatGPT compared with code written by an advanced human programmer. To achieve this, a systematic investigation is carried out, combining an exhaustive analysis of the artificial intelligence with a new evaluation tool and a survey, in order to clarify its capabilities and constraints in the field of artificial intelligence. Through a rigorous systematic review process following the PRISMA guidelines, a total of 6879 relevant publications were initially identified. After applying the inclusion and exclusion criteria, the sample was reduced to 165 publications, and after eliminating irrelevant publications, a set of 15 quality articles that fit the study objectives was finally selected. The selected articles provide an in-depth evaluation of ChatGPT, covering the integration of artificial intelligence in education, the motivations behind the use of generative chatbots, the challenges and paradigms that arise from using AI in programming, the perception of AI with human attributes, metrics for software quality, the detection of irregularities in code, the categorization of clones, and the exploration of software quality and coding errors. The PICOC methodology included the use of quality-testing software, which achieved 95.66% accuracy in evaluating the code generated both by the programmer and by ChatGPT. This resulted in a 30.28% decrease in human errors and 96.2% effectiveness in evaluating the quality of the generated code; the evaluation included three surveys of 1000 advanced Python programmers conducted from October to December 2023. In summary, this paper provides a complete view of the success factors in the comparison of both codes, which in turn leads to greater efficiency in software development, significantly reducing the time required by programmers while acknowledging the limitations of ChatGPT.
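The record itself contains no code, but to make the kind of side-by-side evaluation the abstract describes more concrete, the following minimal Python sketch compares a hypothetical human-written snippet and a hypothetical ChatGPT-generated snippet using simple AST-based static metrics from the standard library. The snippets, the metrics, and the comparison are illustrative assumptions only; they are not the evaluation tool, quality-testing software, or survey instrument used in the paper.

# Illustrative sketch only (not the authors' tool): a toy side-by-side
# comparison of two Python snippets using simple static metrics.
import ast

def static_metrics(source: str) -> dict:
    """Count a few structural features of a Python snippet via its AST."""
    tree = ast.parse(source)
    return {
        "functions": sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree)),
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in ast.walk(tree)),
        "lines": len(source.strip().splitlines()),
    }

# Hypothetical human-written snippet (guards against empty input).
human_code = """
def mean(values):
    if not values:
        raise ValueError("empty input")
    return sum(values) / len(values)
"""

# Hypothetical ChatGPT-generated snippet (no guard for empty input).
chatgpt_code = """
def mean(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)
"""

for label, src in [("human", human_code), ("ChatGPT", chatgpt_code)]:
    print(label, static_metrics(src))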
Pages: 171-194
Page count: 24