FACTOR AND CLUSTER ANALYSIS IN CONTEXT OF BANK'S PROPENSITY SCORE MATCHING

Cited by: 0
Authors
Sirota, Sergej [1]
Rezankova, Hana [1]
Affiliations
[1] Univ Econ, Dept Stat & Probabil, Sq W Churchill 1938-4, Prague 13067 3, Czech Republic
Source
12TH INTERNATIONAL DAYS OF STATISTICS AND ECONOMICS | 2018
Keywords
factor analysis; cluster analysis; propensity score matching; silhouette coefficient; success rate;
DOI
Not available
Chinese Library Classification (CLC) number
F [Economics];
Discipline classification code
02;
Abstract
Propensity score matching is a statistical matching technique in which the probability of an event occurring is estimated by a score, usually calculated using logistic regression (the propensity model). It is a classification task with a known number of groups. The aim of this contribution is to improve the propensity model by using factor and cluster analyses. To meet the requirements of both cluster analysis and logistic regression, we apply a method for selecting significant variables that is based on comparing the correlations between the explanatory variables and the target variable. We also use factor analysis with the Varimax rotation to create new variables that reduce the data set, and we include these factor variables in the matching process. After reducing the data set, we apply the k-means and TwoStep clustering methods to add new information about the structure of the objects selected for propensity score matching. The quality of the resulting clusterings is compared using the silhouette coefficient. The effect of applying factor and cluster analyses is measured by the total success rate of the resulting propensity model.
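The abstract names the silhouette coefficient as the measure of cluster quality but does not give its computation. The following is a minimal sketch of the standard definition using Euclidean distance, not the authors' implementation; the function name `silhouette_coefficient` and the toy data are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def silhouette_coefficient(points, labels):
    """Mean silhouette width over all points (Euclidean distance).

    For point i: a = mean distance to the other members of its own cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters get s = 0.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # convention for singleton clusters
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against coincident points, where both a and b are zero.
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical toy data: two well-separated groups in two dimensions.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(toy_points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_coefficient(toy_points, [0, 1, 0, 1])   # mixed grouping
print(good > bad)
```

Values close to 1 indicate compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned objects, which is why the coefficient can be used to compare clusterings such as those produced by k-means and TwoStep.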
Pages: 1625 - 1634
Page count: 10