Mining user privacy concern topics from app reviews

Cited by: 0
Authors
Zhang, Jianzhang [1]
Zhou, Jialong [1]
Hua, Jinping [2]
Niu, Nan [3]
Liu, Chuang [1]
Affiliations
[1] Hangzhou Normal Univ, Dept Management Sci & Engn, Hangzhou, Zhejiang, Peoples R China
[2] Jiangxi Prov Inst Cyber Secur, Nanchang, Jiangxi, Peoples R China
[3] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH 45221 USA
Funding
National Natural Science Foundation of China
Keywords
Privacy concerns; Topic modeling; App reviews mining; Privacy requirements; Requirements engineering; MOBILE APPS; REQUIREMENTS; PERCEPTION; TAXONOMY;
DOI
10.1016/j.jss.2025.112355
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Context: As mobile applications (apps) spread throughout society and daily life, they constantly demand various personal information in exchange for more intelligent and customized functionality. An increasing number of users are voicing their privacy concerns through reviews on app stores.
Objective: The main challenge in effectively mining privacy concerns from user reviews is that reviews expressing privacy concerns are overwhelmed by a large number of reviews expressing more generic themes and noisy content. In this work, we propose a novel automated approach to overcome that challenge.
Method: Our approach first employs information retrieval and document embeddings to extract candidate privacy reviews in an unsupervised manner; these candidates are then labeled to build the annotated dataset. Next, supervised classifiers are trained to automatically identify privacy reviews. Finally, an interpretable topic mining algorithm is designed to detect the privacy concern topics contained in the privacy reviews.
Results: Experimental results show that the best-performing document embedding achieves an average precision of 96.80% in the top 100 retrieved candidate privacy reviews, outperforming the taxonomy-based baseline, which achieves 73.87%. All trained privacy review classifiers achieve an F1 score above 91%, surpassing the keyword-matching baseline by as much as 7.5% and the large language model baseline by up to 2.74%. For detecting privacy concern topics from privacy reviews, our proposed algorithm achieves both better topic coherence and better topic diversity than three strong topic modeling baselines, including LDA.
Conclusion: Empirical evaluation results demonstrate the effectiveness of our approach in identifying privacy reviews and detecting user privacy concerns in app reviews.
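The first stage described in the abstract (ranking reviews against a privacy-related query and checking how many of the top results are truly privacy reviews) can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `embed` function is a stand-in for the document embeddings the paper evaluates, the query string and sample reviews are invented for illustration, and `precision_at_k` is a simple assumed metric, since the record does not define how the reported average precision was computed.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a document embedding: a bag-of-words count vector.
    (Assumption: the paper uses learned document embeddings instead.)"""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, reviews, k):
    """Return the k reviews most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(reviews, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]


def precision_at_k(retrieved, relevant):
    """Fraction of retrieved reviews that are labeled as privacy reviews."""
    if not retrieved:
        return 0.0
    return sum(r in relevant for r in retrieved) / len(retrieved)


# Hypothetical sample data, invented for this sketch.
reviews = [
    "This app collects my location data without permission",
    "Great game, love the graphics",
    "Why does it need access to my contacts and personal data",
    "Crashes every time I open it",
]
relevant = {reviews[0], reviews[2]}  # manually labeled privacy reviews

top = retrieve("personal data privacy permission location contacts", reviews, k=2)
score = precision_at_k(top, relevant)
```

In the pipeline the abstract outlines, the retrieved candidates would then be labeled by annotators and used to train the supervised privacy review classifiers; the sketch stops at the retrieval step.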
Pages: 15