Scan statistics with weighted observations

Cited by: 17
Authors
Chan, Hock Peng [1]
Zhang, Nancy Ruonan [2]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 119260, Singapore
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Keywords
change of measure; DNA sequence; importance sampling; large deviation; marked Poisson process; scan statistics
DOI
10.1198/016214506000001392
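Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code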
020208 ; 070103 ; 0714 ;
Abstract
We examine scan statistics for one-dimensional marked Poisson processes. Such statistics tabulate the maximum weighted count of event occurrences within a window of predetermined width over all windows within an observed interval. We derive analytical formulas and also give an importance sampling method for approximating the tail probabilities of scan statistics. Because high-throughput genomic sequencing has led to the availability of massive amounts of biomolecular sequence data, it is often of interest to search long DNA or protein sequences for local regions that are enriched for a certain characteristic. Thus scan statistics have become a useful tool in modern computational biology. We illustrate the application of our p-value approximations with such examples.
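The definition in the abstract (the maximum weighted count over all windows of a fixed width) can be illustrated with a minimal Python sketch. It is not the authors' method: the tail probability is estimated by plain Monte Carlo rather than by the importance sampling scheme or analytical formulas of the paper. The function names, the homogeneous Poisson rate, and the restriction to nonnegative weights are assumptions made for illustration only.

```python
import random


def weighted_scan_statistic(times, weights, width):
    """Largest total weight of events in any closed window of the given width.

    Assumes nonnegative weights, so it suffices to check windows whose
    right endpoint sits on an event (hypothetical helper, not from the paper).
    """
    events = sorted(zip(times, weights))
    best = 0.0
    total = 0.0
    left = 0
    for t_right, w_right in events:
        total += w_right
        # Drop events that fall strictly before the window [t_right - width, t_right].
        while events[left][0] < t_right - width:
            total -= events[left][1]
            left += 1
        best = max(best, total)
    return best


def simulate_marked_poisson(rate, length, draw_mark, rng):
    """Homogeneous Poisson process on [0, length) with independent marks."""
    times, marks = [], []
    t = rng.expovariate(rate)
    while t < length:
        times.append(t)
        marks.append(draw_mark(rng))
        t += rng.expovariate(rate)
    return times, marks


def monte_carlo_tail(rate, length, width, threshold, draw_mark,
                     n_sims=2000, seed=0):
    """Plain Monte Carlo estimate of P(scan statistic >= threshold)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        times, marks = simulate_marked_poisson(rate, length, draw_mark, rng)
        if weighted_scan_statistic(times, marks, width) >= threshold:
            hits += 1
    return hits / n_sims
```

For example, `weighted_scan_statistic([0.0, 1.0, 1.5, 5.0], [1, 2, 3, 1], 1.0)` evaluates to 5, attained by the window covering the events at 1.0 and 1.5. Plain Monte Carlo becomes inefficient for very small tail probabilities, which is the regime where an importance sampling approach such as the one the abstract mentions is typically preferred.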
Pages: 595-602
Page count: 8