EMPIRICAL ANALYSIS ON OPENAPI TOPIC EXPLORATION AND DISCOVERY TO SUPPORT THE DEVELOPER COMMUNITY

Cited by: 2

Authors:

da Rocha Araujo, Leonardo [1]

Rodriguez, Guillermo [1]

Vidal, Santiago [1]

Marcos, Claudia [2]

dos Santos, Rodrigo Pereira [3]

Affiliations:

[1] UNICEN, ISISTAN CONICET, Tandil, Argentina

[2] UNICEN, ISISTAN CIC, Tandil, Argentina

[3] Univ Fed Estado Rio de Janeiro, Rio de Janeiro, Brazil

Source:

COMPUTING AND INFORMATICS | 2021 / Vol. 40 / No. 6

Keywords:

RESTful web services; APIs; documentation; topic modeling; OpenAPI; topic coherence

DOI:

10.31577/cai_2021_6_1345

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

OpenAPI has become a dominant standard for documentation in the service-oriented software industry. It is used in many analysis and reengineering approaches for RESTful service and microservice-based systems. An OpenAPI document has several components that are usually filled in by humans using natural language (e.g., the description of a certain functionality). This subjectivity may lead to inconsistencies and ambiguities, so understanding what an API does is challenging. As a consequence, developers may be unable to identify the functionality of an API even after reading all of its components. Along this line, we argue that developers should be provided with supportive tools to find the APIs that best suit their needs. In this paper, we propose a step towards creating these kinds of tools by empirically analyzing a set of 2 000 OpenAPI documents with the goal of extracting the main topics of an API using three topic modeling algorithms. To this end, we focus on three tasks: i) determine which component of an OpenAPI document provides the most meaningful information, ii) compare three state-of-the-art topic modeling algorithms, and iii) determine the optimal number of topics to represent an API. Our findings show that the best results could be obtained from the Description component by using the Non-negative Matrix Factorization (NMF) or Latent Semantic Indexing (LSI) algorithms. To help developers find services in the OpenAPI directory, we also propose a prototype tool to explore the OpenAPI documents and analyze the extracted topics to assess whether the APIs meet developers' needs.
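The pipeline outlined in the abstract (take the Description component, then apply a topic modeling algorithm such as NMF) can be illustrated with a minimal sketch. This is not the authors' implementation: the `spec` dictionary is a made-up OpenAPI fragment, the stopword list is an arbitrary assumption, and all function names (`collect_descriptions`, `term_matrix`, `nmf`, `top_terms`) are hypothetical. The factorization uses the standard multiplicative-update rules for NMF in pure Python, on raw term counts rather than any weighting the paper may have used.

```python
import random
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "by", "or", "with"}

def collect_descriptions(node):
    """Recursively gather every string stored under a 'description' key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "description" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_descriptions(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_descriptions(item))
    return found

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def term_matrix(docs):
    """Build a document-term count matrix (list of rows) and its vocabulary."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return [[float(c[t]) for t in vocab] for c in counts], vocab

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into non-negative w (docs x k) and h (k x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

def top_terms(h, vocab, n=3):
    """For each topic (row of h), return the n highest-weighted vocabulary terms."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

# Hypothetical OpenAPI fragment, for illustration only.
spec = {
    "info": {"title": "Pets", "description": "Manage pets and pet owners in the store"},
    "paths": {
        "/pets": {"get": {"description": "List pets available in the store"}},
        "/orders": {"post": {"description": "Create a payment order for an invoice"}},
        "/invoices": {"get": {"description": "List invoice and payment records"}},
    },
}

docs = collect_descriptions(spec)
matrix, vocab = term_matrix(docs)
w, h = nmf(matrix, k=2)
topics = top_terms(h, vocab)
print(topics)
```

Each row of `h` is one topic expressed as weights over the vocabulary, and each row of `w` says how strongly a description belongs to each topic. The number of topics `k` is fixed at 2 here purely for the example; choosing it is the third task the abstract lists.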


Pages: 1345-1369

Page count: 25
