Classifying smoking urges via machine learning

Cited by: 25
Authors
Dumortier, Antoine [1]
Beckjord, Ellen [2]
Shiffman, Saul [3]
Sejdic, Ervin [1]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Benedum Hall, Pittsburgh, PA 15260 USA
[2] Univ Pittsburgh, Dept Psychiat, 5115 Ctr Ave,Suite 140, Pittsburgh, PA 15232 USA
[3] Univ Pittsburgh, Dept Psychol, 510 BELPB,130 N Bellefield Ave, Pittsburgh, PA 15260 USA
Funding
National Institutes of Health (USA);
Keywords
Smoking urges; Smoking cessation; Machine learning; Supervised learning; Feature selection; CLASSIFICATION; CESSATION; SELECTION;
DOI
10.1016/j.cmpb.2016.09.016
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Background and objective: Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states.
Methods: To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision.
Results: The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications.
Conclusions: In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions.
(C) 2016 Elsevier Ireland Ltd. All rights reserved.
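The abstract names four evaluation measures: sensitivity, specificity, accuracy and precision. As a minimal sketch of how these are conventionally computed from binary predictions, the Python below uses made-up labels, with 1 standing for a high-urge state and 0 for a low-urge state. The function name `binary_metrics`, the label encoding and the toy data are assumptions for illustration only; none of them comes from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy and precision for 0/1 labels.

    Label 1 = high-urge state, label 0 = low-urge state
    (an illustrative encoding, not taken from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true high-urge states found
        "specificity": ratio(tn, tn + fp),  # share of true low-urge states found
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),    # share of high-urge predictions that are right
    }


# Toy data, invented for this example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting all four measures together matters when classes are imbalanced: a classifier can reach high accuracy by favouring the majority class while its sensitivity to high-urge states stays low.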
Pages: 203-213
Number of pages: 11