Identification of Most Relevant Features for Classification of Francisella tularensis Using Machine Learning

Cited by: 12
Authors
Ahmad, Fareed [1 ,2 ]
Farooq, Amjad [1 ]
Khan, Muhammad Usman Ghani [1 ]
Shabbir, Muhammad Zubair [2 ]
Rabbani, Masood [2 ]
Hussain, Irshad [2 ]
Affiliations
[1] Univ Engn & Technol, Fac Elect Engn, Dept Comp Sci & Engn, Lahore, Pakistan
[2] Univ Vet & Anim Sci, Inst Microbiol, Lahore, Pakistan
Keywords
Francisella tularensis; feature ranking; pathogen classification; multilayer perceptron; persistence of Francisella tularensis; soil-borne pathogen; risk factors; biological weapon; YIELD PREDICTION; SOIL PROPERTIES; RANDOM FORESTS; BACTERIAL; TULAREMIA; SURVIVAL; PH; COMMUNITIES; POPULATIONS; DIVERSITY;
DOI
10.2174/1574893615666200219113900
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Francisella tularensis is a stealth pathogen that can be fatal to animals and humans. Its ease of propagation, coupled with its high capacity to cause illness and death, makes it a potential candidate for a biological weapon. Objective: Work on the pathogen's classification and on the factors affecting its prolonged persistence in soil has been limited to statistical measures. Machine learning, as an alternative to conventional analysis methods, may be applied to improve epidemiological modeling of this soil-borne pathogen. Methods: Three feature-ranking algorithms (Relief, correlation, and OneR) are used to rank soil attributes. Five classification algorithms (SVM, random forest, naive Bayes, logistic regression, and MLP) are then used to classify the soil attribute dataset into Francisella tularensis positive and negative soils. Results: The feature-ranking methods indicate that clay, nitrogen, organic matter, soluble salts, zinc, silt, and nickel are the most significant attributes, while potassium, phosphorus, iron, calcium, copper, chromium, and sand are the least contributing risk factors for the persistence of the pathogen. Overall, clay is the most significant attribute and potassium the least contributing. Feature ranking with Relief yields classification accuracies of 84.35% for the multilayer perceptron (MLP), 82.99% for logistic regression, 80.27% for both SVM and random forest, and 78.23% for naive Bayes, which is better than the other ranking methods. MLP outperforms the other classifiers, with accuracies of 84.35%, 82.99%, and 81.63% for feature ranking using the Relief, correlation, and OneR algorithms, respectively. Conclusion: These models can significantly improve classification accuracy and minimize the risk of incorrect classification. They can further help in controlling epidemics, thereby minimizing the socio-economic impact on society.
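The correlation-based feature ranking named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation or dataset: the data below are synthetic, the attribute names (clay, nitrogen, potassium) are borrowed from the abstract only for readability, and the helper names `pearson`, `rank_by_correlation`, and `make_synthetic_soils` are hypothetical. The sketch covers only correlation ranking, not Relief or OneR, and it assumes that ranking means ordering attributes by the absolute Pearson correlation between each attribute and the binary class label.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(rows, labels, names):
    """Return (attribute, |r|) pairs sorted from most to least correlated."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((name, abs(pearson(column, labels))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def make_synthetic_soils(n=200, seed=0):
    """Build synthetic samples (NOT real soil data) for illustration only.

    By construction, the label depends strongly on the first attribute,
    weakly on the second, and not at all on the third.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        clay = rng.uniform(0, 60)
        nitrogen = rng.uniform(0, 1)
        potassium = rng.uniform(0, 300)
        score = clay / 60 + 0.3 * nitrogen + rng.gauss(0, 0.15)
        labels.append(1 if score > 0.65 else 0)  # 1 = positive, 0 = negative
        rows.append([clay, nitrogen, potassium])
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_synthetic_soils()
    for name, score in rank_by_correlation(rows, labels, ["clay", "nitrogen", "potassium"]):
        print(f"{name}: {score:.3f}")
```

In a workflow like the one the abstract describes, the top-ranked attributes would then be passed to a classifier such as an MLP. Here the ordering only reflects how the synthetic labels were generated and says nothing about the reported results.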
Pages: 1197-1212
Page count: 16