A genetic algorithm for simulating correlated binary data from biomedical research

Cited by: 8
Authors
Kruppa, Jochen [1]
Lepenies, Bernd [2, 3]
Jung, Klaus [1, 3]
Affiliations
[1] Univ Vet Med Hannover, Inst Anim Breeding & Genet, Bunteweg 17p, D-30559 Hannover, Germany
[2] Univ Vet Med Hannover, Immunol Unit, Hannover, Germany
[3] Univ Vet Med Hannover, Res Ctr Emerging Infect & Zoonoses RIZ, Hannover, Germany
Keywords
Correlated binary data; Genetic algorithm; High-dimensional data; Random number generation; Computer simulation; DISTRIBUTIONS; ASSOCIATION; VARIABLES; MODELS;
DOI
10.1016/j.compbiomed.2017.10.023
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Correlated binary data arise in a large variety of biomedical research. To evaluate methods for their analysis, computer simulations of such data are often required. Existing methods often cannot cover the full range of possible correlations between the variables, or are not available as implemented software. We propose a genetic algorithm that approaches the desired correlation structure under a given marginal distribution. The procedure generates a large representative matrix from which the probabilities of individual observations can be derived or from which samples can be drawn directly. Our genetic algorithm is evaluated under different specified marginal frequencies and correlation structures, and is compared against two existing approaches. The evaluation checks the speed and precision of the approach as well as its suitability for generating high-dimensional data. In an example of high-throughput glycan array data, we demonstrate the usability of our approach for simulating the power of global test procedures. Implementations of our own method and of two other methods were added to the R package 'RepeatedHighDim'. The presented algorithm is not restricted to certain correlation structures. In contrast to existing methods, it is also evaluated for high-dimensional data.
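The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm and not the interface of the RepeatedHighDim package: the function names (`fitness`, `mutate`, `evolve`), the population size, the mutation rule, and the fitness measure are all assumptions chosen for illustration. The sketch stores the representative matrix as a list of columns, fixes the number of ones in each column so that the marginal frequencies are met exactly, mutates a candidate by swapping two entries within one column (which leaves the marginals unchanged), and keeps the candidates whose empirical Pearson correlations lie closest to the target. It assumes every marginal probability lies strictly between 0 and 1.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equally long 0/1 lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def fitness(cols, target):
    """Largest absolute gap between empirical and target correlations."""
    m = len(cols)
    return max(
        abs(pearson(cols[i], cols[j]) - target[i][j])
        for i in range(m)
        for j in range(i + 1, m)
    )


def initial_columns(n_rows, marginals, rng):
    """One shuffled 0/1 column per variable with round(p * n_rows) ones."""
    cols = []
    for p in marginals:
        ones = round(p * n_rows)
        col = [1] * ones + [0] * (n_rows - ones)
        rng.shuffle(col)
        cols.append(col)
    return cols


def mutate(cols, rng):
    """Copy the candidate and swap two entries inside one random column."""
    child = [c[:] for c in cols]
    col = rng.choice(child)
    i, j = rng.sample(range(len(col)), 2)
    col[i], col[j] = col[j], col[i]  # column sum (marginal) is preserved
    return child


def evolve(marginals, target, n_rows=200, pop_size=10, generations=300, seed=1):
    """Mutation plus elitist selection; returns (best columns, its fitness)."""
    rng = random.Random(seed)
    population = [initial_columns(n_rows, marginals, rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(parent, rng) for parent in population]
        pool = population + children
        pool.sort(key=lambda c: fitness(c, target))  # smaller gap is better
        population = pool[:pop_size]  # parents survive if still among the best
    best = population[0]
    return best, fitness(best, target)


if __name__ == "__main__":
    # Hypothetical target: two variables with marginals 0.3 and 0.5
    # and a desired correlation of 0.4.
    best, gap = evolve([0.3, 0.5], [[1.0, 0.4], [0.4, 1.0]])
    print("remaining gap to target correlation:", gap)
```

Because parents compete with their children for survival, the best fitness in the population can never get worse from one generation to the next. Rows of the resulting matrix could then be sampled with replacement to obtain simulated observations, in line with the abstract's remark that samples can be drawn directly from the representative matrix.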
Pages: 1-8
Number of pages: 8
Related papers
50 in total
  • [21] Extraction of outline in arbitrary shape from binary images using genetic algorithm
    Abe, M
    Ouchi, T
    Kawamata, M
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2005, 88 (02): 32-46
  • [22] Research on the High Redundancy Data Compression and Storage Algorithm based on Parallel Computing and Genetic Algorithm
    Fei, Shirong
    2015 2ND INTERNATIONAL SYMPOSIUM ON ENGINEERING TECHNOLOGY, EDUCATION AND MANAGEMENT (ISETEM 2015), 2015: 18-23
  • [23] Learning Bayesian Networks from Correlated Data
    Bae, Harold
    Monti, Stefano
    Montano, Monty
    Steinberg, Martin H.
    Perls, Thomas T.
    Sebastiani, Paola
    SCIENTIFIC REPORTS, 2016, 6
  • [24] The Application Research of Genetic Algorithm
    Zhang, Jumei
    PROCEEDINGS OF THE 2018 3RD INTERNATIONAL WORKSHOP ON MATERIALS ENGINEERING AND COMPUTER SCIENCES (IWMECS 2018), 2018, 78: 138-141
  • [25] TESTS FOR TREND IN DEVELOPMENTAL TOXICITY EXPERIMENTS WITH CORRELATED BINARY DATA
    FUNG, KY
    KREWSKI, D
    RAO, JNK
    SCOTT, AJ
    RISK ANALYSIS, 1994, 14 (04): 639-648
  • [26] Exact methods of testing the homogeneity of prevalences for correlated binary data
    Liu, Xiaobin
    Yang, Zhengyu
    Liu, Song
    Ma, Chang-Xing
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2017, 87 (15): 3021-3039
  • [27] Research on intelligence analysis technology of financial industry data based on genetic algorithm
    Wang, Xiaojuan
    Gan, Lanshan
    Liu, Songlin
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (05): 3391-3401
  • [28] Research on intelligence analysis technology of financial industry data based on genetic algorithm
    Xiaojuan Wang
    Lanshan Gan
    Songlin Liu
    The Journal of Supercomputing, 2020, 76: 3391-3401
  • [29] Data Mining Research in Wireless Sensor Network Based on Genetic BP Algorithm
    Wang Mengmeng
    Xiu Debin
    Wang Rongxin
    Du Fang
    Shi Yunbo
    PROCEEDINGS OF 2013 2ND INTERNATIONAL CONFERENCE ON MEASUREMENT, INFORMATION AND CONTROL (ICMIC 2013), VOLS 1 & 2, 2013: 243-247
  • [30] Efficient tests for one sample correlated binary data with applications
    Shan, Guogen
    Ma, Changxing
    STATISTICAL METHODS AND APPLICATIONS, 2014, 23 (02): 175-188