Harnessing the Power of ChatGPT for Automating Systematic Review Process: Methodology, Case Study, Limitations, and Future Directions

Cited by: 77

Authors:

Alshami, Ahmad [1]

Elsayed, Moustafa [2]

Ali, Eslam [3,4]

Eltoukhy, Abdelrahman E. E. [5]

Zayed, Tarek [3]

Affiliations:

[1] Florida State Univ, FAMU FSU Coll Engn, Dept Civil & Environm Engn, Tallahassee, FL 32013 USA

[2] Florida A&M Univ, FAMU FSU Coll Engn, Dept Civil & Environm Engn, Tallahassee, FL 32013 USA

[3] Hong Kong Polytech Univ, Fac Construct & Environm, Dept Bldg & Real Estate, Kowloon TU428, Hong Kong, Peoples R China

[4] Cairo Univ, Fac Engn, Publ Works Dept, Geomat Lab, Giza 12613, Egypt

[5] Hong Kong Polytech Univ, Dept Ind & Syst Engn, Hung Hom TU428, Hong Kong, Peoples R China

Source:

SYSTEMS | 2023 / Volume 11 / Issue 07

Keywords:

ChatGPT; systematic review; automation; Internet of Things (IoT); article filtration; article categorization; information extraction; content analysis; LARGE LANGUAGE MODELS; HEALTH-CARE;

DOI:

10.3390/systems11070351

Chinese Library Classification (CLC) number:

C [Social Sciences: General];

Discipline classification code:

03 ; 0303 ;

Abstract:

Systematic reviews (SR) are crucial in synthesizing and analyzing existing scientific literature to inform evidence-based decision-making. However, traditional SR methods often have limitations, including a lack of automation and decision support, resulting in time-consuming and error-prone reviews. To address these limitations and drive the field forward, we harness the power of the revolutionary language model, ChatGPT, which has demonstrated remarkable capabilities in various scientific writing tasks. By utilizing ChatGPT's natural language processing abilities, our objective is to automate and streamline the steps involved in traditional SR, explicitly focusing on literature search, screening, data extraction, and content analysis. Therefore, our methodology comprises four modules: (1) Preparation of Boolean research terms and article collection, (2) Abstract screening and article categorization, (3) Full-text filtering and information extraction, and (4) Content analysis to identify trends, challenges, gaps, and proposed solutions. Throughout each step, our focus has been on providing quantitative analyses to strengthen the robustness of the review process. To illustrate the practical application of our method, we have chosen the topic of IoT applications in water and wastewater management and quality monitoring due to its critical importance and the dearth of comprehensive reviews in this field. The findings demonstrate the potential of ChatGPT in bridging the gap between traditional SR methods and AI language models, resulting in enhanced efficiency and reliability of SR processes. Notably, ChatGPT exhibits exceptional performance in filtering and categorizing relevant articles, leading to significant time and effort savings. Our quantitative assessment reveals the following: (1) the overall accuracy of ChatGPT for article discarding and classification is 88%, and (2) the F-1 scores of ChatGPT for article discarding and classification are 91% and 88%, respectively, compared to expert assessments. However, we identify limitations in its suitability for article extraction. Overall, this research contributes valuable insights to the field of SR, empowering researchers to conduct more comprehensive and reliable reviews while advancing knowledge and decision-making across various domains.
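
The accuracy and F-1 figures in the abstract compare ChatGPT's decisions with expert assessments. The following is a minimal sketch of how such agreement metrics are commonly computed, assuming each article carries a binary "discard"/"keep" label from both sources. The function names and the toy label lists are hypothetical and do not reproduce the paper's data or its exact evaluation procedure.

```python
from typing import Sequence


def accuracy(expert: Sequence[str], model: Sequence[str]) -> float:
    """Fraction of articles where the model label equals the expert label."""
    if len(expert) != len(model) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(e == m for e, m in zip(expert, model)) / len(expert)


def f1_score(expert: Sequence[str], model: Sequence[str], positive: str) -> float:
    """F-1 score for one class, treating the expert labels as ground truth."""
    if len(expert) != len(model) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(expert, model))
    tp = sum(e == positive and m == positive for e, m in pairs)
    fp = sum(e != positive and m == positive for e, m in pairs)
    fn = sum(e == positive and m != positive for e, m in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for eight abstracts (illustration only).
expert_labels = ["discard", "keep", "discard", "keep",
                 "discard", "keep", "discard", "discard"]
chatgpt_labels = ["discard", "keep", "discard", "discard",
                  "discard", "keep", "keep", "discard"]

print("accuracy:", accuracy(expert_labels, chatgpt_labels))
print("F-1 (discard):", f1_score(expert_labels, chatgpt_labels, "discard"))
```

Reporting F-1 alongside accuracy is useful when the classes are imbalanced, for example when most retrieved articles end up discarded, because accuracy alone can look high even if the minority class is handled poorly.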

Pages: 37
