Detecting relevant app reviews for software evolution and maintenance through multimodal one-class learning

Cited by: 4
Authors
Golo, Marcos P. S. [1]
Araujo, Adailton F. [1]
Rossi, Rafael G. [2]
Marcacini, Ricardo M. [1]
Affiliations
[1] Univ Sao Paulo, Inst Math & Comp Sci, POB 668, BR-13560970 Sao Carlos, SP, Brazil
[2] FACOM Fed Univ Mato Grosso do Sul, BR-79070900 Campo Grande, MS, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
One-class classification; App reviews classification; Multimodal Autoencoders;
DOI
10.1016/j.infsof.2022.106998
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Context: Mobile app reviews are a rich source of information for software evolution and maintenance. Several studies have shown that exploring relevant reviews is effective in software development lifecycle tasks such as release planning and requirements engineering. Popular apps can receive millions of reviews, which makes manual extraction of relevant information impractical. The literature presents several machine learning approaches to detect relevant reviews. However, these approaches use multi-class learning, which demands more labeling effort because users must label a significant set of both relevant and irrelevant reviews.
Objective: This article investigates methods for detecting relevant app reviews in scenarios with small sets of labeled data. We evaluated unimodal and multimodal representations, different labeling levels, and different app review domains and languages.
Method: We present a one-class multimodal learning method for detecting relevant reviews. Our approach makes two main contributions. First, we use one-class learning, which requires labeling only relevant app reviews, thereby minimizing the labeling effort. Second, to handle the smaller amount of labeled reviews without harming classification performance, we also present methods to improve feature extraction and review representation. We propose the Multimodal Autoencoder and the Multimodal Variational Autoencoder. These methods learn representations that explore both textual data and visual information based on the density of the reviews. Density information can be interpreted as a summary of the main topics or clusters extracted from the reviews.
Results: Our methods achieved competitive results using only 25% of the labeled reviews, compared with models that used the entire training set. Also, our multimodal approaches obtained the highest F1-Score and AUC-ROC in twenty-three out of twenty-four scenarios.
Conclusion: Our one-class multimodal methods proved to be a competitive alternative for detecting relevant reviews and promising for practical scenarios involving data-driven software evolution and maintenance.
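Illustration: the one-class setting described under Method can be sketched in a few lines. The sketch below is not the authors' Multimodal Autoencoder or Multimodal Variational Autoencoder; it swaps in a much simpler centroid-based one-class classifier over bag-of-words vectors, only to show that training needs relevant reviews alone. The names `vectorize`, `cosine`, and `CentroidOneClass`, the threshold rule, and the sample reviews are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Toy featurizer (an assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidOneClass:
    """Centroid-based one-class classifier fitted on relevant reviews only."""

    def fit(self, relevant_reviews):
        vectors = [vectorize(r) for r in relevant_reviews]
        # Summing counts gives the centroid direction; cosine ignores scale.
        self.centroid = Counter()
        for v in vectors:
            self.centroid.update(v)
        # Assumed threshold rule: the lowest similarity of any training
        # review to the centroid, so every training review is accepted.
        self.threshold = min(cosine(v, self.centroid) for v in vectors)
        return self

    def predict(self, review):
        """Return True if the review looks like the relevant class."""
        return cosine(vectorize(review), self.centroid) >= self.threshold


# Only relevant reviews are labeled; no irrelevant examples are needed.
relevant = [
    "app crashes when i open the camera",
    "the app crashes after the latest update",
    "please fix the login crash bug",
]
clf = CentroidOneClass().fit(relevant)
```

Calling `clf.predict(review)` on an unseen review then returns whether it falls inside the region learned from the relevant class. In practice the bag-of-words featurizer would be replaced by a learned representation, which is the role the abstract assigns to the multimodal autoencoders.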
Pages: 12