Machine Learning Methods for the Diagnosis of Chronic Obstructive Pulmonary Disease in Healthy Subjects: Retrospective Observational Cohort Study

Cited by: 13
Authors
Muro, Shigeo [1]
Ishida, Masato [2]
Horie, Yoshiharu [3]
Takeuchi, Wataru [4]
Nakagawa, Shunki [4]
Ban, Hideyuki [4]
Nakagawa, Tohru [5]
Kitamura, Tetsuhisa [6]
Affiliations
[1] Nara Med Univ, Dept Resp Med, Nara, Japan
[2] AstraZeneca KK, Dept Resp & Immunol, Med, Osaka, Japan
[3] AstraZeneca KK, Dept Data Sci, Med, Osaka, Japan
[4] Hitachi Ltd, Ctr Technol Innovat Artificial Intelligence, Res & Dev Grp, Tokyo, Japan
[5] Hitachi Ltd, Hitachi Hlth Care Ctr, Ibaraki, Japan
[6] Osaka Univ, Grad Sch Med, Dept Social & Environm Med, Div Environm Med & Populat Sci, Osaka, Japan
Keywords
chronic obstructive pulmonary disease; airflow limitation; medical check-up; Gradient Boosting Decision Tree; logistic regression; AIR-FLOW OBSTRUCTION; NATURAL-HISTORY; COPD; SMOKING; POPULATION; PREVALENCE; IMPACT; ASSOCIATION; PROGNOSIS; OUTCOMES
DOI
10.2196/24796
Chinese Library Classification number
R-058
Subject classification number
Abstract
Background: Airflow limitation is a critical physiological feature in chronic obstructive pulmonary disease (COPD), for which long-term exposure to noxious substances, including tobacco smoke, is an established risk. However, not all long-term smokers develop COPD, meaning that other risk factors exist.
Objective: This study aimed to predict the risk factors for COPD diagnosis using machine learning in an annual medical check-up database.
Methods: In this retrospective observational cohort study (ARTDECO [Analysis of Risk Factors to Detect COPD]), annual medical check-up records for all Hitachi Ltd employees in Japan collected from April 1998 to March 2019 were analyzed. Employees who provided informed consent via an opt-out model were screened and those aged 30 to 75 years without a prior diagnosis of COPD/asthma or a history of cancer were included. The database included clinical measurements (eg, pulmonary function tests) and questionnaire responses. To predict the risk factors for COPD diagnosis within a 3-year period, the Gradient Boosting Decision Tree machine learning (XGBoost) method was applied as a primary approach, with logistic regression as a secondary method. A diagnosis of COPD was made when the ratio of the prebronchodilator forced expiratory volume in 1 second (FEV1) to prebronchodilator forced vital capacity (FVC) was <0.7 during two consecutive examinations.
Results: Of the 26,101 individuals screened, 1213 met the exclusion criteria, and thus, 24,815 individuals were included in the analysis. The top 10 predictors for COPD diagnosis were FEV1/FVC, smoking status, allergic symptoms, cough, pack years, hemoglobin A1c, serum albumin, mean corpuscular volume, percent predicted vital capacity, and percent predicted value of FEV1. The areas under the receiver operating characteristic curves of the XGBoost model and the logistic regression model were 0.956 and 0.943, respectively.
Conclusions: Using a machine learning model in this longitudinal database, we identified a number of parameters, other than smoking exposure or lung function, as risk factors that may help general practitioners and occupational health physicians predict the development of COPD. Further research to confirm our results is warranted, as our analysis involved a database used only in Japan.
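The diagnostic rule stated in the Methods (prebronchodilator FEV1/FVC < 0.7 at two consecutive examinations) can be illustrated with a short sketch. This is not the authors' code: the function name `first_copd_exam`, the list-based input layout, and the handling of non-positive FVC values are assumptions made purely for illustration.

```python
from typing import Optional, Sequence

# Cutoff for the FEV1/FVC ratio, as stated in the abstract.
THRESHOLD = 0.7


def first_copd_exam(fev1: Sequence[float], fvc: Sequence[float]) -> Optional[int]:
    """Return the index of the exam at which the criterion is first met.

    The criterion is FEV1/FVC < 0.7 at two consecutive examinations; the
    returned index is that of the second exam of the first such pair.
    Returns None if the criterion is never met.
    Hypothetical helper: inputs are per-exam values in chronological order.
    """
    if len(fev1) != len(fvc):
        raise ValueError("fev1 and fvc must have the same length")
    previous_below = False
    for i, (v1, vc) in enumerate(zip(fev1, fvc)):
        # Assumption: an exam with non-positive FVC is treated as not below the cutoff.
        below = vc > 0 and (v1 / vc) < THRESHOLD
        if below and previous_below:
            return i
        previous_below = below
    return None


# Ratios 0.75, 0.667, 0.667: criterion met at the third exam (index 2).
met_at = first_copd_exam([3.0, 2.0, 2.0], [4.0, 3.0, 3.0])
# Ratios 0.667, 0.75, 0.667: low values are not consecutive, so never met.
never_met = first_copd_exam([2.0, 3.0, 2.0], [3.0, 4.0, 3.0])
```

Requiring two consecutive low ratios, rather than a single one, means that an isolated low measurement does not by itself trigger the label in this sketch.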
Pages: 13