Mining user privacy concern topics from app reviews

Cited by: 0
Authors
Zhang, Jianzhang [1]
Zhou, Jialong [1]
Hua, Jinping [2]
Niu, Nan [3]
Liu, Chuang [1]
Affiliations
[1] Hangzhou Normal Univ, Dept Management Sci & Engn, Hangzhou, Zhejiang, Peoples R China
[2] Jiangxi Prov Inst Cyber Secur, Nanchang, Jiangxi, Peoples R China
[3] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH 45221 USA
Funding
National Natural Science Foundation of China
Keywords
Privacy concerns; Topic modeling; App reviews mining; Privacy requirements; Requirements engineering; MOBILE APPS; REQUIREMENTS; PERCEPTION; TAXONOMY;
DOI
10.1016/j.jss.2025.112355
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Context: As mobile applications (apps) spread throughout society and daily life, they constantly request personal information in exchange for more intelligent and customized functionality. An increasing number of users voice their privacy concerns through reviews on app stores.
Objective: The main challenge in mining privacy concerns from user reviews is that reviews expressing privacy concerns are drowned out by a much larger number of reviews on more generic themes and by noisy content. This work proposes a novel automated approach to overcome that challenge.
Method: The approach first uses information retrieval and document embeddings to extract candidate privacy reviews in an unsupervised manner; these candidates are then labeled to prepare the annotated dataset. Next, supervised classifiers are trained to identify privacy reviews automatically. Finally, an interpretable topic mining algorithm is designed to detect the privacy concern topics contained in the privacy reviews.
Results: Experimental results show that the best-performing document embedding achieves an average precision of 96.80% in the top 100 retrieved candidate privacy reviews, outperforming the taxonomy-based baseline, which achieves 73.87%. All trained privacy review classifiers achieve an F1 score above 91%, surpassing the keyword-matching baseline by as much as 7.5% and the large language model baseline by up to 2.74%. For detecting privacy concern topics from privacy reviews, the proposed algorithm achieves better topic coherence and better topic diversity than three strong topic modeling baselines, including LDA.
Conclusion: The empirical evaluation demonstrates the effectiveness of the approach in identifying privacy reviews and detecting user privacy concerns in app reviews.
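The unsupervised candidate-extraction step and its top-k evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names document embeddings, whereas the sketch below substitutes plain bag-of-words cosine similarity so that it runs with the standard library alone. The seed query, the sample reviews, and the function names (`rank_candidates`, `precision_at_k`) are hypothetical, and `precision_at_k` is the ordinary precision-in-top-k measure, which may differ from the exact metric reported in the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(reviews, seed_query):
    """Return review indices sorted by similarity to the seed query.

    Assumption: a bag-of-words vector stands in for the document
    embeddings mentioned in the abstract.
    """
    query_vec = Counter(tokenize(seed_query))
    scored = [(cosine(Counter(tokenize(r)), query_vec), i)
              for i, r in enumerate(reviews)]
    # Highest similarity first; ties broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are labeled relevant."""
    top = ranked[:k]
    return sum(1 for i in top if i in relevant) / len(top) if top else 0.0


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    reviews = [
        "Great game, love the graphics",
        "Why does this app need access to my contacts and location data",
        "Crashes every time I open it",
        "It collects personal data and shares my location without permission",
    ]
    seed = "personal data privacy location contacts permission tracking"
    ranked = rank_candidates(reviews, seed)
    print(ranked[:2], precision_at_k(ranked, {1, 3}, 2))
```

In a pipeline like the one described, the top-ranked candidates would then be labeled by hand and used to train the supervised classifiers; that step is omitted here.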
Pages: 15