Autonomous Cross Domain Adaptation Under Extreme Label Scarcity

Cited by: 6
Authors
Weng, Weiwei [1 ]
Pratama, Mahardhika [2 ,3 ]
Zain, Choiru [4 ]
de Carvalho, Marcus [1 ]
Appan, Rakaraddi [1 ]
Ashfahani, Andri [1 ]
Yee, Edward Yapp Kien [5 ]
Affiliations
[1] Nanyang Technol Univ NTU, Sch Comp Sci & Engn SCSE, Jurong 639798, Singapore
[2] Nanyang Technol Univ NTU, Sch Comp Sci & Engn SCSE, Jurong 639798, Singapore
[3] Univ South Australia, Acad Unit STEM, Adelaide, SA 5095, Australia
[4] Monash Univ, Sch Informat Technol IT, Clayton, Vic 3800, Australia
[5] A*STAR, Singapore Inst Mfg Technol, Singapore 138634, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Labeling; Transfer learning; Costs; Feature extraction; Data mining; Australia; Task analysis; Concept drifts; data streams; incremental learning; multistream classification; transfer learning;
DOI
10.1109/TNNLS.2022.3183356
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-domain multistream classification is a challenging problem calling for fast domain adaptation to handle different but related streams in never-ending, rapidly changing environments. Although existing multistream classifiers assume no labeled samples in the target stream, they still incur expensive labeling costs because they require a fully labeled source stream. This article attacks the problem of extreme label shortage in cross-domain multistream classification, where only very few labeled samples of the source stream are provided before the process runs. Our solution, Learning Streaming Process from Partial Ground Truth (LEOPARD), is built upon a flexible deep clustering network whose hidden nodes, layers, and clusters are added and removed dynamically in response to varying data distributions. The deep clustering strategy is underpinned by simultaneous feature learning and clustering, leading to clustering-friendly latent spaces. The domain adaptation strategy relies on adversarial domain adaptation, where a feature extractor is trained to fool a domain classifier that distinguishes source samples from target samples. Our numerical study demonstrates the efficacy of LEOPARD, which delivers improved performance compared to prominent algorithms in 15 of 24 cases. The source code of LEOPARD is shared at https://github.com/wengweng001/LEOPARD.git to enable further study.
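The adversarial component described in the abstract follows the standard domain-adversarial recipe in which the feature extractor and the domain classifier play a minimax game; one common realization is a gradient-reversal layer. Below is a minimal PyTorch sketch of that idea, offered as an illustration only: the module names, layer sizes, and the adversarial_step helper are assumptions made for this sketch and do not reproduce LEOPARD's actual architecture (see the linked repository for the authors' implementation).

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; negates (and scales) gradients in the
    # backward pass, so the feature extractor is pushed to confuse the
    # domain classifier while the classifier itself learns normally.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class FeatureExtractor(nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))

    def forward(self, x):
        return self.net(x)

class DomainClassifier(nn.Module):
    # Binary classifier over the latent space: source (0) vs. target (1).
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 2))

    def forward(self, z, lambd=1.0):
        return self.net(GradReverse.apply(z, lambd))

def adversarial_step(fe, dc, opt, x_src, x_tgt, lambd=1.0):
    # One minimax update: the domain classifier minimizes domain-label
    # cross-entropy, while the reversed gradients train the feature
    # extractor to make source and target features indistinguishable.
    z = torch.cat([fe(x_src), fe(x_tgt)], dim=0)
    d_labels = torch.cat([torch.zeros(len(x_src), dtype=torch.long),
                          torch.ones(len(x_tgt), dtype=torch.long)])
    loss = F.cross_entropy(dc(z, lambd), d_labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random stand-ins for one source and one target mini-batch.
fe = FeatureExtractor(in_dim=10, latent_dim=4)
dc = DomainClassifier(latent_dim=4)
opt = torch.optim.Adam(list(fe.parameters()) + list(dc.parameters()), lr=1e-3)
print(adversarial_step(fe, dc, opt, torch.randn(32, 10), torch.randn(32, 10)))

In LEOPARD the same latent features additionally feed a clustering objective, so in practice an adversarial loss like this one would be combined with a clustering loss before the backward pass.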
Pages: 6839-6850
Page count: 12