The App Sampling Problem for App Store Mining

Cited by: 75
Authors
Martin, William [1]
Harman, Mark [1]
Jia, Yue [1]
Sarro, Federica [1]
Zhang, Yuanyuan [1]
Affiliations
[1] UCL, Dept Comp Sci, London, England
Source
12TH WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2015) | 2015
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1109/MSR.2015.19
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many papers on App Store Mining are susceptible to the App Sampling Problem, which exists when only a subset of apps are studied, resulting in potential sampling bias. We introduce the App Sampling Problem, and study its effects on sets of user review data. We investigate the effects of sampling bias, and techniques for its amelioration in App Store Mining and Analysis, where sampling bias is often unavoidable. We mine 106,891 requests from 2,729,103 user reviews and investigate the properties of apps and reviews from 3 different partitions: the sets with fully complete review data, partially complete review data, and no review data at all. We find that app metrics such as price, rating, and download rank are significantly different between the three completeness levels. We show that correlation analysis can find trends in the data that prevail across the partitions, offering one possible approach to App Store Analysis in the presence of sampling bias.
Pages: 123-133
Number of pages: 11