Controlling the false-discovery rate by procedures adapted to the length bias of RNA-Seq

Cited by: 0
Authors
Tae Young Yang
Seongmun Jeong
Affiliations
[1] Myongji University,Department of Mathematics
[2] Division of Strategic Research Groups Korea Research Institute of Bioscience and Biotechnology,Personalized Genomic Medicine Research Center
[3] Myongji University,Department of Mathematics
Source
Journal of the Korean Statistical Society | 2018 / Vol. 47
Keywords
primary 62P10; secondary 92B05; Common weight; Individual weight; Length bias; RNA-Seq; Separate procedure; Weighted procedure;
DOI
Not available
Abstract
In RNA-Seq experiments, the number of mapped reads for a given gene is proportional to its expression level and its length. Because longer genes contribute more sequenceable fragments than shorter ones, a longer gene is expected to have a greater number of total reads than a shorter gene with the same expression level. This characteristic creates a length bias: the proportion of significant genes increases with gene length. However, long genes are not more biologically meaningful than short genes. Therefore, the length bias should be properly corrected to determine an accurate list of significant genes in RNA-Seq. For this purpose, we proposed two multiple-testing procedures based on a weighted-FDR and a separate-FDR approach. These two methods use prior information on differential gene length while keeping the false-discovery rate (FDR) controlled at α. In the weighted-FDR controlling procedure, we incorporated prior weights for the length of each gene. These weights increase the power when the gene's length is short and decrease the power when its length is long. In the separate-FDR controlling procedure, we sequentially ordered all genes according to their lengths and then split these genes into two subgroups of short and long genes. The adaptive Benjamini-Hochberg procedure was then performed separately for each subgroup. The proposed procedures were compared with existing methods and evaluated in two numerical examples and one simulation study. We concluded that the weighted p-value procedure properly reduced the length bias of RNA-Seq.
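The two procedures described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weight formula, the split point, or the variant of the adaptive Benjamini-Hochberg procedure, so the sketch assumes hypothetical weights proportional to 1/sqrt(length) rescaled to mean one, a split at the median length, and a Storey-type estimate of the null proportion with lambda = 0.5. The function names (`bh_reject`, `length_weights`, `weighted_bh`, `separate_adaptive_bh`) are likewise illustrative.

```python
import numpy as np

def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank that meets its threshold
        reject[order[:k + 1]] = True
    return reject

def length_weights(lengths):
    """ASSUMPTION: hypothetical weights, larger for short genes, mean one."""
    raw = 1.0 / np.sqrt(np.asarray(lengths, dtype=float))
    return raw / raw.mean()

def weighted_bh(pvals, lengths, alpha=0.05):
    """Weighted BH: divide each p-value by its weight, then apply plain BH."""
    w = length_weights(lengths)
    weighted_p = np.minimum(np.asarray(pvals, dtype=float) / w, 1.0)
    return bh_reject(weighted_p, alpha)

def separate_adaptive_bh(pvals, lengths, alpha=0.05, lam=0.5):
    """Split genes at the median length and run adaptive BH in each subgroup.

    ASSUMPTION: the null proportion pi0 is estimated with a Storey-type
    estimator, and BH is then run at level alpha / pi0 within the subgroup.
    """
    p = np.asarray(pvals, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    short = lengths <= np.median(lengths)
    reject = np.zeros(p.size, dtype=bool)
    for mask in (short, ~short):
        m = int(mask.sum())
        if m == 0:
            continue
        pi0 = min(1.0, (np.sum(p[mask] > lam) + 1) / (m * (1.0 - lam)))
        reject[mask] = bh_reject(p[mask], alpha / pi0)
    return reject

# Toy data: two short genes (400 bp) and two long genes (1600 bp).
pvals = [0.016, 0.03, 0.9, 0.9]
lengths = [400, 1600, 400, 1600]
plain = bh_reject(pvals, 0.05)
weighted = weighted_bh(pvals, lengths, 0.05)
separate = separate_adaptive_bh(pvals, lengths, 0.05)
```

On this toy input, plain BH rejects nothing, while both length-aware variants reject only the first (short) gene, which shows the intended shift of power toward short genes. Any real analysis would need the weights and subgroup boundaries defined in the paper itself.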
Pages: 13-23
Page count: 10
Related papers
50 in total
  • [1] Controlling the false-discovery rate by procedures adapted to the length bias of RNA-Seq
    Yang, Tae Young
    Jeong, Seongmun
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2018, 47 (01) : 13 - 23
  • [2] Grouped False-Discovery Rate for Removing the Gene-Set-Level Bias of RNA-seq
    Yang, Tae Young
    Jeong, Seongmun
    EVOLUTIONARY BIOINFORMATICS, 2013, 9 : 467 - 478
  • [3] Controlling the false-discovery rate in astrophysical data analysis
    Miller, CJ
    Genovese, C
    Nichol, RC
    Wasserman, L
    Connolly, A
    Reichart, D
    Hopkins, A
    Schneider, J
    ASTRONOMICAL JOURNAL, 2001, 122 (06): : 3492 - 3505
  • [4] Gene set analysis controlling for length bias in RNA-seq experiments
    Xing Ren
    Qiang Hu
    Song Liu
    Jianmin Wang
    Jeffrey C. Miecznikowski
    BioData Mining, 10
  • [5] Gene set analysis controlling for length bias in RNA-seq experiments
    Ren, Xing
    Hu, Qiang
    Liu, Song
    Wang, Jianmin
    Miecznikowski, Jeffrey C.
    BIODATA MINING, 2017, 10 : 1 - 18
  • [6] Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments
    Zehetmayer, Sonja
    Posch, Martin
    Graf, Alexandra
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [7] Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments
    Sonja Zehetmayer
    Martin Posch
    Alexandra Graf
    BMC Bioinformatics, 23
  • [8] Can the false-discovery rate be misleading?
    Barboza, Rodrigo
    Cociorva, Daniel
    Xu, Tao
    Barbosa, Valmir C.
    Perales, Jonas
    Valente, Richard H.
    Franca, Felipe M. G.
    Yates, John R., III
    Carvalho, Paulo C.
    PROTEOMICS, 2011, 11 (20) : 4105 - 4108
  • [9] Controlling the false-discovery rate when identifying the subgroup benefiting from treatment
    Schnell, Patrick M.
    CLINICAL TRIALS, 2023, 20 (04) : 394 - 404
  • [10] Length bias correction for RNA-seq data in gene set analyses
    Gao, Liyan
    Fang, Zhide
    Zhang, Kui
    Zhi, Degui
    Cui, Xiangqin
    BIOINFORMATICS, 2011, 27 (05) : 662 - 669