A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

Cited by: 3
Authors
Wang, Jing [1]
Webb-Robertson, Bobbie-Jo M. [1]
Matzke, Melissa M. [1]
Varnum, Susan M. [2]
Brown, Joseph N. [2]
Riensche, Roderick M. [1]
Adkins, Joshua N. [2]
Jacobs, Jon M. [2]
Hoidal, John R. [3]
Scholand, Mary Beth [3]
Pounds, Joel G. [2]
Blackburn, Michael R. [4]
Rodland, Karin D. [2]
McDermott, Jason E. [1]
Affiliations
[1] Pacific NW Natl Lab, Richland, WA 99352 USA
[2] Pacific NW Natl Lab, Div Biol Sci, Richland, WA 99352 USA
[3] Univ Utah, Sch Med, Dept Internal Med, Salt Lake City, UT 84132 USA
[4] Univ Texas Houston, Sch Med, Dept Biochem & Mol Biol, Houston, TX 77030 USA
Funding
National Institutes of Health (USA)
Keywords
SELECTION; TOOL; INFLAMMATION; INTENSITIES; IMPUTATION; DISCOVERY; STRATEGY; LUNG; MICE
DOI
10.1155/2013/613529
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background. The availability of large, complex data sets generated by high-throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how best to incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using a combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework, ISIC, was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification.
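The abstract names two ingredients, a distance-based clustering step and an expert knowledge-driven selection step, without giving implementation details. The sketch below is one possible way such a combination could look, not the ISIC pipeline itself: it groups features by correlation distance and then picks one candidate per group, preferring features flagged by an expert. The function names, the `max_distance` threshold, the scores, and the toy profiles are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def cluster_features(profiles, max_distance=0.3):
    """Single-linkage grouping on correlation distance (1 - r).

    Any two features whose distance is at most max_distance end up in
    the same cluster (union-find over all feature pairs).
    """
    parent = {name: name for name in profiles}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if 1.0 - pearson(profiles[a], profiles[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


def select_candidates(profiles, scores, expert_set, max_distance=0.3):
    """Pick one feature per cluster: expert-flagged first, then best score."""
    picked = []
    for members in cluster_features(profiles, max_distance):
        best = max(members, key=lambda m: (m in expert_set, scores[m]))
        picked.append(best)
    return sorted(picked)


# Toy abundance profiles (hypothetical): P1/P2 co-vary, P3/P4 co-vary.
profiles = {
    "P1": [1, 2, 3, 4],
    "P2": [2, 4, 6, 8.5],
    "P3": [4, 1, 3, 2],
    "P4": [8, 2, 6, 4.5],
}
scores = {"P1": 0.9, "P2": 0.7, "P3": 0.4, "P4": 0.6}  # data-driven ranking (assumed)
expert_set = {"P2"}  # features an expert considers functionally relevant (assumed)

selected = select_candidates(profiles, scores, expert_set)
print(selected)
```

Selecting one representative per cluster keeps the chosen candidates roughly orthogonal, while the expert flag acts as a tie-breaker that can override the purely data-driven score within a cluster.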
Pages: 513-523
Page count: 11