Novel sampling design for respondent-driven sampling

Cited by: 10
Authors
Khabbazian, Mohammad [1]
Hanlon, Bret [2]
Russek, Zoe [3]
Rohe, Karl [3]
Affiliations
[1] Univ Wisconsin, Dept Elect Engn, 1415 Engn Dr, Madison, WI 53706 USA
[2] Univ Wisconsin, Dept Surg, Sch Med & Publ Hlth, 600 Highland Ave, Madison, WI 53792 USA
[3] Univ Wisconsin, Dept Stat, 1300 Univ Ave, Madison, WI 53706 USA
Funding
National Science Foundation (USA)
Keywords
Hard-to-reach population; respondent-driven sampling; social network; Markov chain; stochastic blockmodels; anti-cluster RDS; snowball
DOI
10.1214/17-EJS1358
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Respondent-driven sampling (RDS) is a method of chain referral sampling popular for sampling hidden and/or marginalized populations. As such, even under the ideal sampling assumptions, the performance of RDS is restricted by the underlying social network: if the network is divided into communities that are weakly connected to each other, then RDS is likely to oversample one of these communities. In order to diminish the "referral bottlenecks" between communities, we propose anti-cluster RDS (AC-RDS), an adjustment to the standard RDS implementation. Using a standard model in the RDS literature, namely, a Markov process on the social network that is indexed by a tree, we construct and study the Markov transition matrix for AC-RDS. We show that if the underlying network is generated from the Stochastic Blockmodel with equal block sizes, then the transition matrix for AC-RDS has a larger spectral gap and consequently faster mixing properties than the standard random walk model for RDS. In addition, we show that AC-RDS reduces the covariance of the samples in the referral tree compared to the standard RDS and consequently leads to a smaller variance and design effect. We confirm the effectiveness of the new design using both the Add-Health networks and simulated networks.
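The "referral bottleneck" idea in the abstract can be illustrated numerically for the standard random walk model only. The sketch below is not the paper's AC-RDS construction (the abstract does not specify that transition matrix); it assumes a two-block Stochastic Blockmodel with equal block sizes, uses the expected adjacency matrix rather than a sampled graph, and keeps self-loops for simplicity. The function names and the parameters `p_within` and `q_between` are hypothetical, chosen for illustration.

```python
import numpy as np


def expected_sbm_adjacency(n_per_block, p_within, q_between):
    """Expected adjacency matrix of a two-block SBM with equal block sizes.

    Assumption for illustration: self-loops are kept, so every entry inside
    a block equals p_within and every entry across blocks equals q_between.
    """
    block_probs = np.array([[p_within, q_between],
                            [q_between, p_within]])
    return np.kron(block_probs, np.ones((n_per_block, n_per_block)))


def random_walk_spectral_gap(adjacency):
    """Spectral gap 1 - |lambda_2| of the random walk matrix P = D^{-1} A."""
    degrees = adjacency.sum(axis=1)
    # D^{-1/2} A D^{-1/2} is symmetric and has the same eigenvalues as P.
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    symmetric = adjacency * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    moduli = np.sort(np.abs(np.linalg.eigvalsh(symmetric)))[::-1]
    return 1.0 - moduli[1]


# Weakly connected communities (strong bottleneck) versus better connected ones.
gap_weak = random_walk_spectral_gap(expected_sbm_adjacency(50, 0.10, 0.01))
gap_strong = random_walk_spectral_gap(expected_sbm_adjacency(50, 0.10, 0.05))
print(f"gap with q=0.01: {gap_weak:.4f}")
print(f"gap with q=0.05: {gap_strong:.4f}")
```

Under these assumptions the second-largest eigenvalue is (p - q)/(p + q), so the gap is 2q/(p + q) when p > q: the weaker the links between communities, the smaller the gap and the slower the random walk mixes.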
Pages: 4769-4812
Page count: 44