Border Sampling Through Coupling Markov Chain Monte Carlo

Cited by: 1
Authors
Li, Guichong [1]
Japkowicz, Nathalie [1]
Stocki, Trevor J. [2]
Ungar, R. Kurt [2]
Affiliations
[1] Comp Sci Univ Ottawa, Ottawa, ON, Canada
[2] Health Canada, Radiat Protect Bureau, Ottawa, ON, Canada
Source
ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2008
DOI
10.1109/ICDM.2008.52
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Progressive Border Sampling (PBS) was recently proposed for sample selection in supervised learning; it progressively learns an augmented full border from small labeled datasets. However, this quadratic learning algorithm is inapplicable to large datasets. In this paper, we incorporate PBS into a state-of-the-art technique called Coupling Markov Chain Monte Carlo (CMCMC) in an attempt to scale the original algorithm up to large labeled datasets. CMCMC can produce an exact sample, whereas a naive Markov Chain Monte Carlo strategy cannot guarantee convergence to a stationary distribution. The resulting CMCMC-PBS algorithm is thus proposed for border sampling on large datasets. CMCMC-PBS exhibits several remarkable characteristics: linear time complexity, learner independence, and consistent convergence to an optimal sample from the original training sets by learning from their subsamples. Our experimental results on 33 labeled datasets, both small and large, from the UCI KDD repository and on a nuclear security application show that our new approach outperforms many previous sampling techniques for sample selection.
Pages: 393 / +
Page count: 3