A genetic algorithm for simulating correlated binary data from biomedical research

Cited by: 8
Authors
Kruppa, Jochen [1]
Lepenies, Bernd [2, 3]
Jung, Klaus [1, 3]
Affiliations
[1] Univ Vet Med Hannover, Inst Anim Breeding & Genet, Bunteweg 17p, D-30559 Hannover, Germany
[2] Univ Vet Med Hannover, Immunol Unit, Hannover, Germany
[3] Univ Vet Med Hannover, Res Ctr Emerging Infect & Zoonoses RIZ, Hannover, Germany
Keywords
Correlated binary data; Genetic algorithm; High-dimensional data; Random number generation; Computer simulation; DISTRIBUTIONS; ASSOCIATION; VARIABLES; MODELS;
DOI
10.1016/j.compbiomed.2017.10.023
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Correlated binary data arise in many areas of biomedical research. To evaluate methods for their analysis, computer simulations of such data are often required. Existing methods often cannot cover the full range of possible correlations between the variables, or are not available as implemented software. We propose a genetic algorithm that approaches the desired correlation structure under a given marginal distribution. The procedure generates a large representative matrix from which the probabilities of individual observations can be derived or from which samples can be drawn directly. Our genetic algorithm is evaluated under different specified marginal frequencies and correlation structures, and is compared against two existing approaches. The evaluation assesses the speed and precision of the approach as well as its suitability for generating high-dimensional data. In an example of high-throughput glycan array data, we demonstrate the usability of our approach to simulate the power of global test procedures. Implementations of our method and of two other methods were added to the R package 'RepeatedHighDim'. The presented algorithm is not restricted to certain correlation structures. In contrast to existing methods, it is also evaluated for high-dimensional data.
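The idea in the abstract (evolve a large binary matrix until its correlations approach a target while the marginal frequencies stay fixed, then draw rows from it) can be illustrated with a small sketch. This is not the authors' implementation and not the RepeatedHighDim API: it is a simplified mutate-and-select loop without crossover, written in Python with the standard library only. All function names, the swap-based mutation, the fitness measure (largest absolute deviation from the target correlation), and the parameter values are assumptions made for illustration.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length 0/1 lists (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fitness(cols, target):
    """Largest absolute gap between empirical and target correlation (lower is better)."""
    m = len(cols)
    return max(abs(pearson(cols[i], cols[j]) - target[i][j])
               for i in range(m) for j in range(i + 1, m))

def init_columns(n_rows, probs, rng):
    """One shuffled 0/1 column per variable; the number of ones fixes the marginal."""
    cols = []
    for p in probs:  # assumption: 0 < p < 1, so no column is constant
        ones = round(p * n_rows)
        col = [1] * ones + [0] * (n_rows - ones)
        rng.shuffle(col)
        cols.append(col)
    return cols

def mutate(cols, rng, n_swaps=5):
    """Copy the matrix and swap entries within one column; column sums are unchanged."""
    child = [c[:] for c in cols]
    col = rng.choice(child)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(col)), rng.randrange(len(col))
        col[i], col[j] = col[j], col[i]
    return child

def evolve(n_rows, probs, target, generations=200, pop_size=6, seed=1):
    """Keep the pop_size best matrices among parents and mutated children each generation."""
    rng = random.Random(seed)
    pop = [init_columns(n_rows, probs, rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(ind, rng) for ind in pop]
        pop = sorted(pop + children, key=lambda ind: fitness(ind, target))[:pop_size]
    return pop[0]

def draw_rows(cols, k, rng):
    """Draw k observations (rows) with replacement from the evolved matrix."""
    n = len(cols[0])
    return [[c[i] for c in cols] for i in (rng.randrange(n) for _ in range(k))]

# Hypothetical example values: three variables with given marginals and correlations.
probs = [0.5, 0.3, 0.4]
target = [[1.0, 0.3, 0.2],
          [0.3, 1.0, 0.1],
          [0.2, 0.1, 1.0]]
best = evolve(100, probs, target)
print("largest remaining correlation gap:", fitness(best, target))
print(draw_rows(best, 3, random.Random(7)))
```

Because the mutation only swaps entries within a column, the marginal frequencies of the matrix stay exactly at their initial values, and because parents compete with their children, the best fitness never gets worse from one generation to the next. How close the result gets to the target depends on the matrix size, the number of generations, and whether the target correlations are attainable for the chosen marginals.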
Pages: 1-8
Page count: 8
Related papers
50 in total
  • [1] Estimation and Test of Measures of Association for Correlated Binary Data
    El-Sayed, Ahmed M.
    Islam, M. Ataharul
    Alzaid, Abdulhamid A.
    BULLETIN OF THE MALAYSIAN MATHEMATICAL SCIENCES SOCIETY, 2013, 36 (04) : 985 - 1008
  • [2] A simple and effective method for simulating nested exchangeable correlated binary data for longitudinal cluster randomised trials
    Bowden, Rhys A.
    Kasza, Jessica
    Forbes, Andrew B.
    BMC MEDICAL RESEARCH METHODOLOGY, 2024, 24 (01)
  • [3] An efficient MCEM algorithm for fitting generalized linear mixed models for correlated binary data
    Tan, M.
    Tian, G. -L.
    Fang, H. -B.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2007, 77 (11-12) : 929 - 943
  • [4] Tutorial on Biostatistics: Statistical Analysis for Correlated Binary Eye Data
    Ying, Gui-shuang
    Maguire, Maureen G.
    Glynn, Robert
    Rosner, Bernard
    OPHTHALMIC EPIDEMIOLOGY, 2018, 25 (01) : 1 - 12
  • [5] Data Distribution Strategy Research Based on Genetic Algorithm
    Wei, Mingjun
    Xu, Chaochun
    INFORMATION COMPUTING AND APPLICATIONS, PT 1, 2010, 105 : 450 - +
  • [6] Prediction of Atomic Configuration in Binary Nanoparticles by Genetic Algorithm
    Oh, Jung Soo
    Ryou, Wonryong
    Lee, Seung-Cheol
    Choi, Jung-Hae
    JOURNAL OF THE KOREAN CERAMIC SOCIETY, 2011, 48 (06) : 493 - 498
  • [7] A comparison of methods for simulating correlated binary variables with specified marginal means and correlations
    Preisser, John S.
    Qaqish, Bahjat F.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2014, 84 (11) : 2441 - 2452
  • [8] Research and analysis of network data mining based on genetic algorithm
    Shi, Lei
    Zhao, Huiran
    Zhang, Kun
    MATERIAL SCIENCE, CIVIL ENGINEERING AND ARCHITECTURE SCIENCE, MECHANICAL ENGINEERING AND MANUFACTURING TECHNOLOGY II, 2014, 651-653 : 2181 - 2184
  • [9] A multilevel model for spatially correlated binary data in the presence of misclassification: an application in oral health research
    Mutsvari, Timothy
    Bandyopadhyay, Dipankar
    Declerck, Dominique
    Lesaffre, Emmanuel
    STATISTICS IN MEDICINE, 2013, 32 (30) : 5241 - 5259
  • [10] Biomedical Image Registration Using Genetic Algorithm
    Panda, Suraj
    Sarangi, Shubhendu Kumar
    Sarangi, Archana
    INTELLIGENT COMPUTING, COMMUNICATION AND DEVICES, 2015, 309 : 289 - 296