Are current clinical studies on artificial intelligence-based medical devices comprehensive enough to support a full health technology assessment? A systematic review

Cited by: 20
Authors
Farah, Line [1, 2]
Davaze-Schneider, Julie [3]
Martin, Tess [1, 3]
Nguyen, Pierre [3]
Borget, Isabelle [1, 4, 5]
Martelli, Nicolas [1, 3]
Affiliations
[1] Univ Paris Saclay, Grp Rech & Accueil Droit & Econ Sante GRADES Dept, Orsay, France
[2] Foch Hosp, Innovat Ctr Med Devices, 40 Rue Worth, F-92150 Suresnes, France
[3] Georges Pompidou European Hosp, AP HP, Pharm Dept, 20 Rue Leblanc, F-75015 Paris, France
[4] Univ Paris Saclay, Dept Biostat & Epidemiol, Gustave Roussy, F-94805 Villejuif, France
[5] Univ Paris Saclay, Equipe Labellisee Ligue Canc, Oncostat U1018, Inserm, Villejuif, France
Keywords
Artificial intelligence; Machine learning; Artificial intelligence-based medical device; Health technology assessment; Clinical trial; Economic evaluation; DIABETIC-RETINOPATHY; COST-EFFECTIVENESS; VALIDATION; ALGORITHM; SEGMENTATION; DIAGNOSIS; SOFTWARE; DESIGN; CANCER; ORGANS;
DOI
10.1016/j.artmed.2023.102547
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Introduction: Artificial Intelligence-based Medical Devices (AI-based MDs) are experiencing exponential growth in healthcare. This study aimed to investigate whether current studies assessing AI contain the information required for health technology assessment (HTA) by HTA bodies.
Methods: We conducted a systematic literature review based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses methodology to extract articles published between 2016 and 2021 related to the assessment of AI-based MDs. Data extraction focused on study characteristics, technology, algorithms, comparators, and results. AI quality assessment and HTA scores were calculated to evaluate whether the items present in the included studies were concordant with the HTA requirements. We performed a linear regression for the HTA and AI scores with the explanatory variables of the impact factor, publication date, and medical specialty. We conducted a univariate analysis of the HTA score and a multivariate analysis of the AI score with an alpha risk of 5 %.
Results: Of 5578 retrieved records, 56 were included. The mean AI quality assessment score was 67 %; 32 % of articles had an AI quality score >= 70 %, 50 % had a score between 50 % and 70 %, and 18 % had a score under 50 %. The highest quality scores were observed for the study design (82 %) and optimisation (69 %) categories, whereas the scores were lowest in the clinical practice category (23 %). The mean HTA score was 52 % for all seven domains. 100 % of the studies assessed clinical effectiveness, whereas only 9 % evaluated safety, and 20 % evaluated economic issues. There was a statistically significant relationship between the impact factor and the HTA and AI scores (both p = 0.046).
Discussion: Clinical studies on AI-based MDs have limitations and often lack adapted, robust, and complete evidence. High-quality datasets are also required because the output data can only be trusted if the inputs are reliable. The existing assessment frameworks are not specifically designed to assess AI-based MDs. From the perspective of regulatory authorities, we suggest that these frameworks should be adapted to assess the interpretability, explainability, cybersecurity, and safety of ongoing updates. From the perspective of HTA agencies, we highlight that transparency, professional and patient acceptance, ethical issues, and organizational changes are required for the implementation of these devices. Economic assessments of AI should rely on a robust methodology (business impact or health economic models) to provide decision-makers with more reliable evidence.
Conclusion: Currently, AI studies are insufficient to cover HTA prerequisites. HTA processes also need to be adapted because they do not consider the important specificities of AI-based MDs. Specific HTA workflows and accurate assessment tools should be designed to standardise evaluations, generate reliable evidence, and create confidence.
Pages: 13