Three points to consider when choosing a LM or GLM test for count data

Cited by: 134
Authors
Warton, David I. [1 ,2 ]
Lyons, Mitchell [3 ]
Stoklosa, Jakub [1 ,2 ]
Ives, Anthony R. [4 ]
Affiliations
[1] Univ New South Wales, Sch Math & Stat, Sydney, NSW 2052, Australia
[2] Univ New South Wales, Evolut & Ecol Res Ctr, Sydney, NSW 2052, Australia
[3] Univ New South Wales, Sch Biol Earth & Environm Sci, Sydney, NSW 2052, Australia
[4] Univ Wisconsin, Dept Zool, Madison, WI 53706 USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2016 / Volume 7 / Issue 08
Funding
Australian Research Council; US National Science Foundation
Keywords
data transformation; generalized linear models; multivariate analysis; power analysis; type I error; BETA DIVERSITY; COEFFICIENTS;
DOI
10.1111/2041-210X.12552
Chinese Library Classification
Q14 [Ecology (Bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
The two most common approaches for analysing count data are to use a generalized linear model (GLM), or transform data, and use a linear model (LM). The latter has recently been advocated to more reliably maintain control of type I error rates in tests for no association, while seemingly losing little in power. We make three points on this issue.

Point 1 - Choice of statistical model should primarily be made on the grounds of data properties. Choice of testing procedure should be considered and addressed as a separate issue, after model choice. If models with the appropriate data properties nonetheless have statistical problems such as type I error control (i.e. type I error rate greatly exceeds the intended significance level), the best solution is to keep the model but fix the problems.

Point 2 - When a test has problems with type I error control, it can usually be corrected, but this may require departure from software default approaches. In particular, resampling is a good solution for small samples that can be easy to implement.

Point 3 - Tests based on models that better fit the data (e.g. a negative binomial for overdispersed count data) tend to have better power properties and in some instances have considerably higher power.

We illustrate these issues for a 2 x 2 experiment with a count response. This seemingly simple problem becomes hard when the experimental design is unbalanced, and software default procedures using LMs or GLMs can have difficulties, although in both cases the issues can be fixed. We conclude that, when GLMs are thought to fit count data well, and when any necessary steps are taken to correct type I error rates, they should be used rather than LMs. Nonetheless, standard LM tests are often robust and can have good type I error control, so there is an argument for their use for counts when diagnostics are difficult and statistical models are complex, although at some risk of loss of power and interpretability.
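The resampling idea in Point 2 can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes the simplest setting, a two-group comparison of counts rather than the 2 x 2 design, and a hypothetical test statistic, the absolute difference in mean log(y + 1). The function name `perm_test` and its parameters are illustrative only. Permuting group labels gives a reference distribution that does not rely on large-sample approximations, which is why resampling of this kind can help with type I error control in small or unbalanced samples.

```python
import math
import random


def perm_test(counts_a, counts_b, n_perm=9999, seed=0):
    """Two-sided permutation test of no association between group and count.

    Illustrative sketch only: the statistic (difference in mean log(y + 1))
    is an assumption, not the test used in the paper.
    """
    rng = random.Random(seed)

    def stat(a, b):
        # Absolute difference in mean log(y + 1) between the two groups
        mean_a = sum(math.log1p(y) for y in a) / len(a)
        mean_b = sum(math.log1p(y) for y in b) / len(b)
        return abs(mean_a - mean_b)

    observed = stat(counts_a, counts_b)
    pooled = list(counts_a) + list(counts_b)
    n_a = len(counts_a)

    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled counts, i.e. reassign group labels at random
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties
        if stat(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            hits += 1

    # Count the observed labelling itself so the p-value is never zero
    return (hits + 1) / (n_perm + 1)


# Unbalanced example with made-up counts (4 vs 7 observations)
p_value = perm_test([0, 2, 1, 3], [12, 30, 18, 25, 41, 16, 22])
print(f"permutation p-value: {p_value:.4f}")
```

The same scheme extends to a model-based statistic (for example a likelihood-ratio statistic from a negative binomial fit), at which point the choice of model and the choice of testing procedure are handled separately, as Point 1 recommends.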
Pages: 882 - 890
Number of pages: 9