Bias and Inaccuracy in AI Chatbot Ophthalmologist Recommendations

Cited by: 16
Authors
Oca, Michael C. [1]
Meller, Leo [1]
Wilson, Katherine [1]
Parikh, Alomi O. [2]
McCoy, Allison [3]
Chang, Jessica [2]
Sudharshan, Rasika [2]
Gupta, Shreya [2]
Zhang-Nunes, Sandy [2]
Affiliations
[1] Univ Calif UC San Diego Hlth, Shiley Eye Inst, Orthoped Surg, La Jolla, CA 92093 USA
[2] Univ Southern Calif, Keck Sch Med, Univ Southern Calif USC Roski Eye Inst, Ophthalmol, Los Angeles, CA USA
[3] Del Mar Plast Surg, Plast Surg, San Diego, CA USA
Keywords
AI chatbot; artificial intelligence (AI) in medicine; artificial intelligence in health care; gender bias; patient education; artificial intelligence in medicine
DOI
10.7759/cureus.45911
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Purpose and design: To evaluate the accuracy and bias of ophthalmologist recommendations made by three AI chatbots: ChatGPT 3.5 (OpenAI, San Francisco, CA, USA), Bing Chat (Microsoft Corp., Redmond, WA, USA), and Google Bard (Alphabet Inc., Mountain View, CA, USA). The study analyzed chatbot recommendations for the 20 most populous U.S. cities.

Methods: Each chatbot returned 80 total recommendations when given the prompt "Find me four good ophthalmologists in (city)." Characteristics of the recommended physicians, including specialty, location, gender, practice type, and fellowship, were collected. A one-proportion z-test was performed to compare the proportion of female ophthalmologists recommended by each chatbot with the national average (27.2% per the Association of American Medical Colleges (AAMC)). Pearson's chi-squared test was performed to assess differences among the three chatbots in male versus female recommendations and in recommendation accuracy.

Results: The proportions of female ophthalmologists recommended by Bing Chat (1.61%) and Bard (8.0%) were significantly lower than the national proportion of 27.2% practicing female ophthalmologists (p<0.001 and p<0.01, respectively). ChatGPT also recommended fewer female (29.5%) than male ophthalmologists, but this proportion did not differ significantly from the national figure (reported as p<0.722). ChatGPT (73.8%), Bing Chat (67.5%), and Bard (62.5%) all gave high rates of inaccurate recommendations. Compared with the national average of academic ophthalmologists (17%), the proportion of recommended ophthalmologists in academic medicine or in combined academic and private practice was significantly greater for all three chatbots.

Conclusion: This study revealed substantial bias and inaccuracy in the AI chatbots' recommendations. The chatbots struggled to recommend ophthalmologists reliably and accurately: most recommendations were for physicians in specialties other than ophthalmology or for physicians not located in or near the requested city. Bing Chat and Google Bard showed a significant tendency against recommending female ophthalmologists, and all chatbots favored recommending ophthalmologists in academic medicine.
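
The two tests named in the Methods can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' analysis code. The function names and all counts below are hypothetical: the abstract reports percentages only, so the counts are chosen purely to demonstrate the calculations. The chi-squared helper assumes a 3x2 table (three chatbots, two outcomes), which has 2 degrees of freedom, so its p-value has the closed form exp(-statistic / 2).

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test against a reference proportion p0."""
    p_hat = successes / n
    # Standard error computed under the null hypothesis (true proportion = p0).
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def chi_squared_3x2(table):
    """Pearson's chi-squared test of independence for a 3x2 table (df = 2)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts: 1 female recommendation out of 62, tested against
    # the 27.2% national proportion cited in the abstract.
    z, p = one_proportion_z_test(successes=1, n=62, p0=0.272)
    print(f"z = {z:.2f}, p = {p:.2g}")

    # Hypothetical 3x2 table: rows are chatbots, columns are (female, male).
    stat, p = chi_squared_3x2([[20, 20], [5, 35], [5, 35]])
    print(f"chi2 = {stat:.2f}, p = {p:.2g}")
```

For tables of other shapes, or for continuity corrections, an established statistics library would be a better choice than this closed-form shortcut.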
Pages: 9