Hierarchical classification of data streams: a systematic literature review

Cited by: 9
Authors
Tieppo, Eduardo [1, 2]
dos Santos, Roger Robson [2 ]
Barddal, Jean Paul [2 ]
Nievola, Julio Cesar [2 ]
Affiliations
[1] Inst Fed Parana IFPR, Campus Pinhais, Pinhais, Brazil
[2] Pontificia Univ Catolica Parana PUCPR, Posgrad Informat PPGIa, Curitiba, Parana, Brazil
Keywords
Data stream mining; Hierarchical classification; Systematic literature review; Machine learning; ACTIVITY RECOGNITION; OBJECT RECOGNITION; CLASSIFIERS; MACHINE; REPRESENTATION; PERFORMANCE; ALGORITHM; AGREEMENT; QUALITY; DRIFT
DOI
10.1007/s10462-021-10087-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The classification task usually works with flat and batch learners, assuming problems as stationary and without relations between class labels. Nevertheless, several real-world problems do not assume these premises, i.e., data have labels organized hierarchically and are made available in streaming fashion, meaning that their behavior can drift over time. Existing studies on hierarchical classification do not consider data streams as input of their process, and thus, data is assumed as stationary and handled through batch learners. The same can be said about works on streaming data, as the hierarchical classification is overlooked. Studies concerning each area individually are promising, yet, do not tackle their intersection. This study analyzes the main characteristics of the state-of-the-art works on hierarchical classification for streaming data concerning five aspects: (i) problems tackled, (ii) datasets, (iii) algorithms, (iv) evaluation metrics, and (v) research gaps in the area. We performed a systematic literature review of primary studies and retrieved 3,722 papers, of which 42 were identified as relevant and used to answer the aforementioned research questions. We found that the problems handled by hierarchical classification of data streams include mainly classification of images, human activities, texts, and audio; the datasets are mostly created or synthetic data; the algorithms and evaluation metrics are well-known techniques or based on those; and research gaps are related to dynamic context, data complexity, and computational resources constraints. We also provide implications for future research and experiments to consider common characteristics shared amongst hierarchical classification and data stream classification.
Pages: 3243-3282
Number of pages: 40