Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?

Cited by: 79

Authors:

Chen, Jieshan [1]

Xie, Mulong [1]

Xing, Zhenchang [1,3]

Chen, Chunyang [2]

Xu, Xiwei [3]

Zhu, Liming [3,5]

Li, Guoqiang [4]

Affiliations:

[1] Australian Natl Univ, Canberra, ACT, Australia

[2] Monash Univ, Clayton, Vic, Australia

[3] CSIRO, Data61, Canberra, ACT, Australia

[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China

[5] Univ New South Wales, Kensington, NSW, Australia

Source:

PROCEEDINGS OF THE 28TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '20) | 2020

Keywords:

Object Detection; User Interface; Deep Learning; Computer Vision; IMAGES;

DOI:

10.1145/3368089.3409691

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Detecting Graphical User Interface (GUI) elements in GUI images is a domain-specific object detection task. It supports many software engineering tasks, such as GUI animation and testing, GUI search, and code generation. Existing studies of GUI element detection directly borrow mature methods from the computer vision (CV) domain, including old-fashioned ones that rely on traditional image processing features (e.g., Canny edges, contours) and deep learning models that learn to detect from large-scale GUI data. Unfortunately, these CV methods were not originally designed with awareness of the unique characteristics of GUIs and GUI elements, or of the high localization accuracy that the GUI element detection task requires. We conduct the first large-scale empirical study of seven representative GUI element detection methods on over 50k GUI images to understand the capabilities, limitations, and effective designs of these methods. This study not only sheds light on the technical challenges to be addressed but also informs the design of new GUI element detection methods. We accordingly design a new GUI-specific old-fashioned method for non-text GUI element detection, which adopts a novel top-down coarse-to-fine strategy, and combine it with a mature deep learning model for GUI text detection. Our evaluation on 25,000 GUI images shows that our method significantly advances the state-of-the-art performance in GUI element detection.

Pages: 1202-1214

Page count: 13

