Border Sampling Through Coupling Markov Chain Monte Carlo

Cited by: 1
Authors
Li, Guichong [1]
Japkowicz, Nathalie [1]
Stocki, Trevor J. [2]
Ungar, R. Kurt [2]
Affiliations
[1] Comp Sci Univ Ottawa, Ottawa, ON, Canada
[2] Health Canada, Radiat Protect Bureau, Ottawa, ON, Canada
Source
ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2008
DOI
10.1109/ICDM.2008.52
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, Progressive Border Sampling (PBS) was proposed for sample selection in supervised learning by progressively learning an augmented full border from small labeled datasets. However, this quadratic learning algorithm is inapplicable to large datasets. In this paper, we incorporate PBS into a state-of-the-art technique called Coupling Markov Chain Monte Carlo (CMCMC) in an attempt to scale the original algorithm up to large labeled datasets. CMCMC can produce an exact sample, whereas a naive Markov Chain Monte Carlo strategy cannot guarantee convergence to a stationary distribution. The resulting CMCMC-PBS algorithm is thus proposed for border sampling on large datasets. CMCMC-PBS exhibits several remarkable characteristics: linear time complexity, learner-independence, and consistent convergence to an optimal sample from the original training sets by learning from their subsamples. Our experimental results on 33 labeled datasets, both small and large, from the UCIKDD repository and on a nuclear security application show that our new approach outperforms many previous sampling techniques for sample selection.
Pages: 393 / +
Page count: 3