A novel adaptive multiple imputation algorithm

Cited by: 0

Authors:

Computer Systems and Technologies Department, Technical University of Sofia, branch Plovdiv, Tsanko Dyustabanov 25, 4000 Plovdiv, Bulgaria ^{[1]}

Unknown ^{[2]}

Affiliations:

[1] Computer Systems and Technologies Department, Technical University of Sofia, branch Plovdiv, 4000 Plovdiv

[2] Innovation Center - East Flanders, House of Economy, B-9000 Ghent

Source:

Commun. Comput. Info. Sci. | 2008 / pp. 193-206

Keywords:

19;

DOI:

10.1007/978-3-540-70600-7_15

Chinese Library Classification (CLC) number:

Discipline classification code:

Abstract:

The accurate estimation of missing values is important for the efficient use of DNA microarray data, since most analysis and clustering algorithms require a complete data matrix. Several imputation algorithms have already been proposed in the biological literature. Most of these approaches identify, in one way or another, a fixed number of neighbouring genes for the estimation of each missing value. A fixed neighbourhood size increases the chance that the estimation involves gene expression profiles that are rather distant from the profile of the target gene, which may significantly degrade the performance of the imputation algorithm. In this article we propose a novel adaptive multiple imputation algorithm that uses a varying number of neighbouring genes for the estimation of each missing value. For each missing value, the algorithm generates a list of candidate estimates and then selects the most suitable one, according to well-defined criteria, to replace the missing entry. The similarity between expression profiles can be measured with either the Euclidean metric or the Dynamic Time Warping (DTW) distance. The proposed algorithm can therefore be applied to the imputation of missing values in both non-time-series and time-series data. © Springer-Verlag Berlin Heidelberg 2008.
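The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the neighbourhood size is adapted or which criteria select the final estimate, so both rules below are stand-in assumptions. The names `euclidean`, `dtw`, `impute_entry` and the parameter `radius_factor` are hypothetical. Here the neighbourhood keeps every gene whose distance to the target is within `radius_factor` times the nearest distance, one candidate estimate is produced per neighbourhood size, and the candidate closest to the median of all candidates is selected.

```python
import math
import statistics


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook dynamic-programming DTW distance with |x - y| as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def impute_entry(matrix, gene, col, dist=euclidean, radius_factor=1.5):
    """Estimate the missing entry matrix[gene][col] (stored as None).

    Assumed rules (not taken from the paper):
      * neighbourhood: genes within radius_factor * nearest distance;
      * candidates: mean of the k nearest neighbours, for each k;
      * selection: the candidate closest to the median candidate.
    """
    target = matrix[gene]
    observed = [j for j, v in enumerate(target) if v is not None]
    scored = []
    for g, row in enumerate(matrix):
        # Skip the target itself and genes missing any needed value.
        if g == gene or row[col] is None:
            continue
        if any(row[j] is None for j in observed):
            continue
        d = dist([target[j] for j in observed], [row[j] for j in observed])
        scored.append((d, row[col]))
    if not scored:
        return None
    scored.sort(key=lambda pair: pair[0])
    # The neighbourhood size varies with how tightly genes cluster
    # around the target, instead of being a fixed k.
    cutoff = radius_factor * scored[0][0]
    neighbours = [pair for pair in scored if pair[0] <= cutoff]
    # One candidate estimate per neighbourhood size k = 1..len(neighbours).
    candidates = [
        sum(value for _, value in neighbours[:k]) / k
        for k in range(1, len(neighbours) + 1)
    ]
    centre = statistics.median(candidates)
    return min(candidates, key=lambda c: abs(c - centre))


if __name__ == "__main__":
    data = [
        [1.0, 2.0, None],  # target gene with a missing third value
        [1.0, 2.5, 3.0],
        [1.0, 2.6, 4.0],
        [1.0, 2.7, 5.0],
        [9.0, 9.0, 9.0],   # distant profile, excluded by the cutoff
    ]
    print(impute_entry(data, gene=0, col=2))
    print(impute_entry(data, gene=0, col=2, dist=dtw))
```

Passing `dist=dtw` swaps the Euclidean metric for DTW, which mirrors the abstract's distinction between non-time-series and time-series data; the rest of the procedure is unchanged.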

Citation

Pages: 193-206

Page count: 13