The Impact of Bias on Drift Detection in AI Health Software

Cited by: 1
Authors
Azar, Asal Khoshravan [1]
Draghi, Barbara [2]
Rotalinti, Ylenia [2]
Myles, Puja [2]
Tucker, Allan [1]
Affiliations
[1] Brunel Univ London, Uxbridge UB8 3PH, Middx, England
[2] Med & Healthcare Prod Regulatory Agcy, London E14 4PU, England
Source
ARTIFICIAL INTELLIGENCE IN MEDICINE, AIME 2023 | 2023 / Vol. 13897
Funding
Innovate UK
Keywords
Concept Drift; Data Bias; Healthcare Models
DOI
10.1007/978-3-031-34344-5_37
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite the potential of AI in healthcare decision-making, it also poses risks to the public. Bias is one such risk: any unfairness present in the training data, such as the under-representation of certain minority groups, will be reflected in the model, resulting in inaccurate predictions. Data drift is another concern: models trained on obsolete data will perform poorly on newly available data. Approaches for analysing bias and data drift independently are already available in the literature, allowing researchers to develop models that are inclusive or up-to-date. However, the two issues can interact: for instance, drift within under-represented subgroups may be masked when a model is assessed on the whole population. To ensure the deployment of a trustworthy model, we propose that it is crucial to evaluate its performance both on the overall population and across under-represented cohorts. In this paper, we explore a methodology for investigating drift that may be evident only in sub-populations defined by two protected attributes, ethnicity and gender. We use the BayesBoost technique to identify under-represented individuals and to boost these cases by inferring new cases from a Bayesian network. Lastly, we evaluate the capability of this technique to handle cases of drift detection across different sub-populations.
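The evaluation idea in the abstract, scoring a fixed model over successive data windows both overall and within each protected-attribute cohort so that drift confined to a sub-population is not masked by aggregate metrics, can be sketched as follows. This is a minimal illustration and not the paper's code: the column names (ethnicity, gender, year, label), the choice of AUC as the metric, and the drop threshold are all assumptions made for the example.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def subgroup_drift_report(model, df, features, drop_threshold=0.05):
    """Flag time windows where a cohort's AUC falls more than
    `drop_threshold` below that cohort's AUC in its earliest usable window."""
    rows = []
    windows = sorted(df["year"].unique())
    for attr in ("ethnicity", "gender"):            # protected attributes
        for value, cohort in df.groupby(attr):
            baseline = None
            for w in windows:
                batch = cohort[cohort["year"] == w]
                if batch["label"].nunique() < 2:    # AUC undefined on one class
                    continue
                scores = model.predict_proba(batch[features])[:, 1]
                auc = roc_auc_score(batch["label"], scores)
                if baseline is None:
                    baseline = auc                  # first usable window
                elif baseline - auc > drop_threshold:
                    rows.append((attr, value, w, auc))
    return pd.DataFrame(rows, columns=["attribute", "cohort", "window", "auc"])
```

Running the same check on the dataset as a whole gives the aggregate view; a cohort that drifts while the aggregate stays flat is exactly the masking effect the abstract describes.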
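The boosting step can be pictured in the same spirit. The sketch below, assuming the pgmpy library, fits a Bayesian network to the data and then samples extra synthetic cases conditioned on an under-represented cohort. It illustrates only the general idea behind BayesBoost (Draghi et al., 2021), not the authors' implementation, and the network structure and variable names are placeholders.

```python
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.sampling import BayesianModelSampling
from pgmpy.factors.discrete import State

def boost_cohort(df, edges, attr, value, n_extra):
    """Augment `df` with `n_extra` synthetic cases in which `attr` == `value`,
    drawn from a Bayesian network fitted to the full dataset. `edges` is an
    assumed network structure; every column of `df` must appear as a node."""
    bn = BayesianNetwork(edges)                     # e.g. [("gender", "cvd")]
    bn.fit(df, estimator=MaximumLikelihoodEstimator)
    sampler = BayesianModelSampling(bn)
    synthetic = sampler.rejection_sample(evidence=[State(attr, value)],
                                         size=n_extra)
    return pd.concat([df, synthetic], ignore_index=True)
```

The boosted dataset can then be fed back into a per-cohort drift check such as the one above, so that the under-represented cohort contributes enough cases for its drift signal to be measurable.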
Pages: 313-322
Page count: 10