Towards Non-IID image classification: A dataset and baselines

Cited by: 100
Authors
He, Yue [1]
Shen, Zheyan [1]
Cui, Peng [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Lab Media & Network, Room 9-316, East Main Bldg, Beijing 100084, Peoples R China
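Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Non-IID; Dataset; Context; Bias; ConvNet; Batch balancing; Propensity score
DOI
10.1016/j.patcog.2020.107383
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract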
The I.I.D. hypothesis between training and testing data is the basis of numerous image classification methods. Such a property can hardly be guaranteed in practice, where Non-IIDness is common, causing unstable performance of these models. In the literature, however, the Non-I.I.D. image classification problem is largely understudied. A key reason is the lack of a well-designed dataset to support related research. In this paper, we construct and release a Non-I.I.D. image dataset called NICO, which uses contexts to create Non-IIDness consciously. Compared with other datasets, extended analyses show that NICO can support various Non-I.I.D. situations with sufficient flexibility. Meanwhile, we propose a baseline model with a ConvNet structure for General Non-I.I.D. image classification, where the distribution of the testing data is unknown but different from that of the training data. The experimental results demonstrate that NICO can well support the training of a ConvNet model from scratch, and that a batch balancing module can help ConvNets perform better in Non-I.I.D. settings. © 2020 Elsevier Ltd. All rights reserved.
Pages: 10