propr: An R-package for Identifying Proportionally Abundant Features Using Compositional Data Analysis

Cited by: 119
Authors
Quinn, Thomas P. [1, 2]
Richardson, Mark F. [1, 3]
Lovell, David [4]
Crowley, Tamsyn M. [1, 2]
Affiliations
[1] Deakin Univ, Bioinformat Core Res Grp, Geelong, Vic, Australia
[2] Deakin Univ, Ctr Mol & Med Res, Geelong, Vic, Australia
[3] Deakin Univ, Ctr Integrat Ecol, Geelong, Vic, Australia
[4] Queensland Univ Technol, Brisbane, Qld, Australia
Source
SCIENTIFIC REPORTS | 2017 / Volume 7
Keywords
RNA-SEQ;
DOI
10.1038/s41598-017-16520-0
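Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract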
In the life sciences, many assays measure only the relative abundances of components in each sample. Such data, called compositional data, require special treatment to avoid misleading conclusions. Awareness of the need for caution in analyzing compositional data is growing, including the understanding that correlation is not appropriate for relative data. Recently, researchers have proposed proportionality as a valid alternative to correlation for calculating pairwise association in relative data. Although the question of how to best measure proportionality remains open, we present here a computationally efficient R package that implements three measures of proportionality. In an effort to advance the understanding and application of proportionality analysis, we review the mathematics behind proportionality, demonstrate its application to genomic data, and discuss some ongoing challenges in the analysis of relative abundance data.
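The abstract does not name the three measures, so the following is only an illustrative sketch, not the propr package's implementation or API. It assumes two proportionality statistics from the wider literature, commonly written phi and rho, computed on centered log-ratio (clr) transformed data. All function names here (`clr`, `clr_columns`, `phi`, `rho`) and the toy counts are hypothetical.

```python
import math
from statistics import variance


def clr(sample):
    """Centered log-ratio transform of one sample (all parts must be > 0)."""
    logs = [math.log(v) for v in sample]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_columns(samples):
    """clr-transform each sample (row), then return one list per feature."""
    transformed = [clr(row) for row in samples]
    return [list(col) for col in zip(*transformed)]


def phi(a_i, a_j):
    """Assumed definition: var(a_i - a_j) / var(a_i); 0 means proportional."""
    diff = [x - y for x, y in zip(a_i, a_j)]
    return variance(diff) / variance(a_i)


def rho(a_i, a_j):
    """Assumed definition: 1 - var(a_i - a_j) / (var(a_i) + var(a_j))."""
    diff = [x - y for x, y in zip(a_i, a_j)]
    return 1.0 - variance(diff) / (variance(a_i) + variance(a_j))


# Toy data (made up): rows are samples, columns are features.
# Feature 1 is always exactly twice feature 0, so the pair is proportional.
samples = [
    [1.0, 2.0, 10.0],
    [2.0, 4.0, 10.0],
    [3.0, 6.0, 5.0],
    [4.0, 8.0, 1.0],
]
cols = clr_columns(samples)
phi_01 = phi(cols[0], cols[1])
rho_01 = rho(cols[0], cols[1])
```

Because the log-ratio of a perfectly proportional pair is constant across samples, its variance vanishes, so under these assumed definitions `phi_01` should be zero and `rho_01` should be one, up to floating-point error.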
Pages: 9