Mining user privacy concern topics from app reviews

Cited by: 0
Authors
Zhang, Jianzhang [1]
Zhou, Jialong [1]
Hua, Jinping [2]
Niu, Nan [3]
Liu, Chuang [1]
Affiliations
[1] Hangzhou Normal Univ, Dept Management Sci & Engn, Hangzhou, Zhejiang, Peoples R China
[2] Jiangxi Prov Inst Cyber Secur, Nanchang, Jiangxi, Peoples R China
[3] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH 45221 USA
Funding
National Natural Science Foundation of China;
Keywords
Privacy concerns; Topic modeling; App reviews mining; Privacy requirements; Requirements engineering; MOBILE APPS; REQUIREMENTS; PERCEPTION; TAXONOMY;
DOI
10.1016/j.jss.2025.112355
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Context: As mobile applications (apps) widely spread throughout our society and daily life, various personal information is constantly demanded by apps in exchange for more intelligent and customized functionality. An increasing number of users are voicing their privacy concerns through app reviews on app stores.
Objective: The main challenge of effectively mining privacy concerns from user reviews lies in that reviews expressing privacy concerns are overridden by a large number of reviews expressing more generic themes and noisy content. In this work, we propose a novel automated approach to overcome that challenge.
Method: Our approach first employs information retrieval and document embeddings to extract candidate privacy reviews in an unsupervised manner, which are further labeled to prepare the annotation dataset. Then, supervised classifiers are trained to automatically identify privacy reviews. Finally, an interpretable topic mining algorithm is designed to detect privacy concern topics contained in the privacy reviews.
Results: Experimental results show that the best performing document embedding achieves an average precision of 96.80% in the top 100 retrieved candidate privacy reviews, outperforming the taxonomy-based baseline, which achieves 73.87%. All trained privacy review classifiers achieve an F1 score above 91%, surpassing the keyword-matching baseline by as much as 7.5% and the large language model baseline by up to 2.74%. For detecting privacy concern topics from privacy reviews, our proposed algorithm achieves both better topic coherence and topic diversity than three strong topic modeling baselines, including LDA.
Conclusion: Empirical evaluation results demonstrate the effectiveness of our approach in identifying privacy reviews and detecting user privacy concerns in app reviews.
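The retrieval step and the top-k precision measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real document embedding, and the seed query, the sample reviews, and all function names are hypothetical choices made for illustration only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a document embedding: a term-frequency vector (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(reviews, query, k):
    """Rank reviews by similarity to a seed query and keep the top k candidates."""
    q = embed(query)
    ranked = sorted(reviews, key=lambda r: cosine(embed(r), q), reverse=True)
    return ranked[:k]


def precision_at_k(retrieved, relevant):
    """Fraction of retrieved reviews that were labeled as privacy reviews."""
    if not retrieved:
        return 0.0
    return sum(r in relevant for r in retrieved) / len(retrieved)


# Hypothetical sample data, not taken from the paper.
reviews = [
    "This app collects my location data without permission",
    "Great game, love the new levels",
    "Why does a flashlight need access to my contacts and personal data",
    "Crashes every time I open it",
]
query = "privacy personal data permission access"  # hypothetical seed query

top2 = retrieve(reviews, query, k=2)
relevant = {reviews[0], reviews[2]}  # manual labels, assumed for the example
p = precision_at_k(top2, relevant)
print(top2)
print(p)
```

In a realistic setting the `embed` function would be replaced by a trained sentence or document embedding model, and the relevance labels would come from manual annotation of the retrieved candidates.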
Pages: 15
Related papers
50 in total
  • [21] Mining user requirements to facilitate mobile app quality upgrades with big data
    Chen, Runyu
    Wang, Qili
    Xu, Wei
    ELECTRONIC COMMERCE RESEARCH AND APPLICATIONS, 2019, 38
  • [22] Mining domain knowledge from app descriptions
    Liu, Yuzhou
    Liu, Lei
    Liu, Huaxiao
    Wang, Xiaoyu
    Yang, Hongji
    JOURNAL OF SYSTEMS AND SOFTWARE, 2017, 133 : 126 - 144
  • [23] Interest Mining from User Tweets
    Thuy Vu
    Perez, Victor
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 1869 - 1872
  • [24] Text Mining-Based Analysis of Content Topics and User Engagement in University Social Media
    Soloviev, Mark
    Aksenov, Pavel
    Skhvediani, Angi
    Tenishev, Timur
    Kolomenskiy, Fedor
    Bormontova, Elena
    IEEE ACCESS, 2024, 12 : 150354 - 150371
  • [25] Analyzing User Reviews on Digital Detox Apps: A Text Mining and Sentiment Analysis Approach
    Khan, Nazar Fatima
    Khan, Mohammed Naved
    JOURNAL OF CONSUMER BEHAVIOUR, 2025, 24 (01) : 392 - 404
  • [26] User requirements of mobile technology: results from a content analysis of user reviews
    Judith Gebauer
    Ya Tang
    Chaiwat Baimai
    Information Systems and e-Business Management, 2008, 6 : 361 - 384
  • [27] Towards Extracting Coherent User Concerns and their Hierarchical Organization from User Reviews
    Pradhan, Ligaj
    Zhang, Chengcui
    Bethard, Steven
    PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI), 2016, : 582 - 590
  • [28] Simulating the Impact of Annotation Guidelines and Annotated Data on Extracting App Features from App Reviews
    Shah, Faiz Ali
    Sirts, Kairit
    Pfahl, Dietmar
    ICSOFT: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON SOFTWARE TECHNOLOGIES, 2019, : 384 - 396
  • [29] User requirements of mobile technology: results from a content analysis of user reviews
    Gebauer, Judith
    Tang, Ya
    Baimai, Chaiwat
    INFORMATION SYSTEMS AND E-BUSINESS MANAGEMENT, 2008, 6 (04) : 361 - 384
  • [30] Mining Privacy Goals from Privacy Policies Using Hybridized Task Recomposition
    Bhatia, Jaspreet
    Breaux, Travis D.
    Schaub, Florian
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2016, 25 (03)