Mining user privacy concern topics from app reviews

Cited by: 0
Authors
Zhang, Jianzhang [1]
Zhou, Jialong [1]
Hua, Jinping [2]
Niu, Nan [3]
Liu, Chuang [1]
Affiliations
[1] Hangzhou Normal Univ, Dept Management Sci & Engn, Hangzhou, Zhejiang, Peoples R China
[2] Jiangxi Prov Inst Cyber Secur, Nanchang, Jiangxi, Peoples R China
[3] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH 45221 USA
Funding
National Natural Science Foundation of China;
Keywords
Privacy concerns; Topic modeling; App reviews mining; Privacy requirements; Requirements engineering; MOBILE APPS; REQUIREMENTS; PERCEPTION; TAXONOMY;
DOI
10.1016/j.jss.2025.112355
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Context: As mobile applications (apps) widely spread throughout our society and daily life, various personal information is constantly demanded by apps in exchange for more intelligent and customized functionality. An increasing number of users are voicing their privacy concerns through app reviews on app stores. Objective: The main challenge of effectively mining privacy concerns from user reviews lies in that reviews expressing privacy concerns are overridden by a large number of reviews expressing more generic themes and noisy content. In this work, we propose a novel automated approach to overcome that challenge. Method: Our approach first employs information retrieval and document embeddings to extract candidate privacy reviews in an unsupervised manner, which are further labeled to prepare the annotation dataset. Then, supervised classifiers are trained to automatically identify privacy reviews. Finally, an interpretable topic mining algorithm is designed to detect privacy concern topics contained in the privacy reviews. Results: Experimental results show that the best performing document embedding achieves an average precision of 96.80% in the top 100 retrieved candidate privacy reviews, outperforming the taxonomy-based baseline, which achieves 73.87%. All trained privacy review classifiers achieve an F1 score above 91%, surpassing the keyword-matching baseline by as much as 7.5% and the large language model baseline by up to 2.74%. For detecting privacy concern topics from privacy reviews, our proposed algorithm achieves both better topic coherence and topic diversity than three strong topic modeling baselines, including LDA. Conclusion: Empirical evaluation results demonstrate the effectiveness of our approach in identifying privacy reviews and detecting user privacy concerns in app reviews.
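The candidate-retrieval step described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: plain term-count vectors stand in for the document embeddings, and the function names (`rank_candidates`, `precision_at_k`), the query string, and the sample reviews are all hypothetical. It ranks reviews by cosine similarity to a privacy-related query and measures precision over the top k results.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only (deliberately simple).
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_candidates(reviews, query, k):
    # Return the indices of the k reviews most similar to the query.
    # Assumption: term counts replace the document embeddings used in the paper.
    q = Counter(tokenize(query))
    scores = [(cosine(Counter(tokenize(r)), q), i) for i, r in enumerate(reviews)]
    scores.sort(key=lambda s: (-s[0], s[1]))  # highest score first, ties by index
    return [i for _, i in scores[:k]]

def precision_at_k(ranked, relevant):
    # Fraction of the retrieved reviews that are labeled as privacy reviews.
    return sum(1 for i in ranked if i in relevant) / len(ranked) if ranked else 0.0

# Hypothetical sample data for illustration only.
reviews = [
    "This app collects my location data without permission",
    "Great game, love the graphics",
    "Why does it need access to my contacts and personal data",
    "Crashes every time I open it",
]
query = "personal data privacy permission location contacts"
top = rank_candidates(reviews, query, k=2)
print(top, precision_at_k(top, relevant={0, 2}))
```

In practice the retrieved candidates would then be labeled manually and used to train the supervised classifiers mentioned in the abstract; that step is omitted here.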
Pages: 15
Related papers
50 in total
  • [31] Extracting Hierarchy of Coherent User-Concerns to Discover Intricate User Behavior from User Reviews
    Pradhan, Ligaj
    Zhang, Chengcui
    Bethard, Steven
    INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT, 2016, 7 (04) : 63 - 80
  • [32] Mining Requirements Arguments From User Forums
    Khan, Javed Ali
    2019 27TH IEEE INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE (RE 2019), 2019, : 440 - 445
  • [33] Survival strategies for family-run homestays: Analyzing user reviews through text mining
    Krishnan, Jay
    Bhattacharjee, Biplab
    Pratap, Maheshwar
    Yadav, Janardan Krishna
    Maiti, Moinak
    Data Science and Management, 2024, 7 (03) : 228 - 237
  • [34] A Study of App User Behaviours: Transitions from Freemium to Premium
    Mulligan, Christopher
    Cruz, Carlito Vera
    Healy, Donagh
    Murphy, David
    Hall, Margeret
    Nelson, Quinn
    Caton, Simon
    HCI IN BUSINESS, GOVERNMENT, AND ORGANIZATIONS, 2018, 10923 : 396 - 412
  • [35] Mining user reviews of COVID contact-tracing apps: An exploratory analysis of nine European apps
    Garousi, Vahid
    Cutting, David
    Felderer, Michael
    JOURNAL OF SYSTEMS AND SOFTWARE, 2022, 184
  • [36] A Phrase-Level User Requests Mining Approach in Mobile Application Reviews: Concept, Framework, and Operation
    Yang, Cheng
    Wu, Lingang
    Yu, Chunyang
    Zhou, Yuliang
    INFORMATION, 2021, 12 (05)
  • [37] Unfolding Sentimental and Behavioral Tendencies of Learners' Concerned Topics From Course Reviews in a MOOC
    Liu, Sannyuya
    Peng, Xian
    Cheng, Hercy N. H.
    Liu, Zhi
    Sun, Jianwen
    Yang, Chongyang
    JOURNAL OF EDUCATIONAL COMPUTING RESEARCH, 2019, 57 (03) : 670 - 696
  • [38] Sentiment Analysis for Requirements Elicitation from App Reviews: A Systematic Mapping Study
    Wan, Hongyan
    An, Zhiquan
    Wang, Bangchao
    Xiong, Teng
    PROCEEDINGS OF THE 2023 30TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC 2023, 2023, : 201 - 210
  • [39] Voting with Their Feet: Inferring User Preferences from App Management Activities
    Li, Huoran
    Ai, Wei
    Liu, Xuanzhe
    Tang, Jian
    Huang, Gang
    Feng, Feng
    Mei, Qiaozhu
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'16), 2016, : 1351 - 1361
  • [40] Emerging App Issue Identification from User Feedback: Experience on WeChat
    Gao, Cuiyun
    Zheng, Wujie
    Deng, Yuetang
    Lo, David
    Zeng, Jichuan
    Lyu, Michael R.
    King, Irwin
    2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP 2019), 2019, : 279 - 288