Bayesian Markov chain Monte Carlo imputation for the transiting exoplanets with an application in clustering analysis

Cited by: 1
Authors
Teng, Huei-Wen [1]
Hung, Wen-Liang [2]
Chao, Yen-Ju [3]
Affiliations
[1] Natl Cent Univ, Grad Inst Stat, Jhongli, Taiwan
[2] Natl Hsinchu Univ Educ, Dept Appl Math, Hsinchu, Taiwan
[3] Cathay United Bank, Taipei, Taiwan
Keywords
missing data; transiting exoplanets; hot Jupiters; copula; Metropolis-Hastings algorithm; periodogram finds evidence; extrasolar planets; maximum-likelihood; model; mass
DOI
10.1080/02664763.2014.995609
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
To impute the missing values of mass in the transiting exoplanet data, this paper uses the Frank copula to combine two Pareto marginal distributions. Next, a Bayesian Markov chain Monte Carlo (MCMC) imputation method is proposed. The proposed Bayesian MCMC imputation method is found to outperform the mean imputation method. Clustering analysis can shed light on the formation and evolution of exoplanets. After the missing values of mass in the transiting exoplanet data are imputed using the proposed approach, the similarity-based clustering method (SCM) is applied to the logarithm of mass and period for the completed data set. The SCM clustering result indicates two clusters. Furthermore, the intracluster Spearman rank-order correlation coefficients [GRAPHICS] for mass and period in these two clusters are 0.401 and [GRAPHICS], respectively, at a significance level of 0.01. This result shows that mass and period correlate in opposite ways in the two clusters, which implies that the formation and evolution processes of the two clusters are different.
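For orientation, the standard textbook forms of the ingredients named in the abstract can be sketched as follows. This is a hedged illustration, not the paper's own formulation: the symbols $\theta$, $\alpha_j$, and $x_{m,j}$ are notation assumed here, and the paper may use a different Pareto variant or parameterization.

```latex
% Joint CDF built from a copula and two marginals (Sklar's theorem)
H(x, y) = C_\theta\bigl(F_1(x),\, F_2(y)\bigr)

% Frank copula with dependence parameter \theta \neq 0
C_\theta(u, v) = -\frac{1}{\theta}
  \ln\!\left( 1 + \frac{\bigl(e^{-\theta u} - 1\bigr)\bigl(e^{-\theta v} - 1\bigr)}{e^{-\theta} - 1} \right)

% Pareto (type I) marginal CDF, j = 1, 2 (assumed variant)
F_j(x) = 1 - \left( \frac{x_{m,j}}{x} \right)^{\alpha_j}, \qquad x \ge x_{m,j} > 0,\ \alpha_j > 0
```

The Frank copula allows both positive ($\theta > 0$) and negative ($\theta < 0$) dependence, which is one reason it is a common choice when the sign of the association is not known in advance.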
Pages: 1120-1132
Number of pages: 13