KNIME for reproducible cross-domain analysis of life science data

Cited by: 121
Authors
Fillbrunn, Alexander [1, 5]
Dietz, Christian [1]
Pfeuffer, Julianus [2]
Rahn, Rene [3]
Landrum, Gregory A. [4]
Berthold, Michael R. [1, 4, 5]
Affiliations
[1] Univ Konstanz, Univ Str 10, D-78457 Constance, Germany
[2] Eberhard Karls Univ Tubingen, Geschwister Scholl Pl, D-72074 Tubingen, Germany
[3] Free Univ Berlin, Kaiserswerther Str 16-18, D-14195 Berlin, Germany
[4] KNIME, Technopk Str 1, CH-8005 Zurich, Switzerland
[5] Konstanz Res Sch Chem Biol, Constance, Germany
Keywords
KNIME; Workflow systems; Life science; HIGH-THROUGHPUT; MASS; EXPRESSION; PLATFORM; OPENMS; TANDEM
DOI
10.1016/j.jbiotec.2017.07.028
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Experiments in the life sciences often involve tools from a variety of domains such as mass spectrometry, next generation sequencing, or image processing. Passing the data between those tools often involves complex scripts for controlling data flow, data transformation, and statistical analysis. Such scripts are not only prone to being platform dependent; they also tend to grow as the experiment progresses and are seldom well documented, which hinders the reproducibility of the experiment. Workflow systems such as KNIME Analytics Platform aim to solve these problems by providing a platform for connecting tools graphically and guaranteeing the same results on different operating systems. As open source software, KNIME allows scientists and programmers to provide their own extensions to the scientific community. In this review paper we present selected extensions from the life sciences that simplify data exploration, analysis, and visualization and are interoperable due to KNIME's unified data model. Additionally, we name other workflow systems that are commonly used in the life sciences and highlight their similarities to and differences from KNIME.
Pages: 149-156
Page count: 8