Domain adaptation and sample bias correction theory and algorithm for regression

Cited by: 88
Authors
Cortes, Corinna [1 ]
Mohri, Mehryar [1 ,2 ]
Affiliations
[1] Google Res, New York, NY 10011 USA
[2] NYU, Courant Inst Math Sci, New York, NY 10012 USA
Keywords
Machine learning; Learning theory; Domain adaptation; Optimization; Covariate shift
DOI
10.1016/j.tcs.2013.09.027
CLC number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
We present a series of new theoretical, algorithmic, and empirical results for domain adaptation and sample bias correction in regression. We prove that the discrepancy is a distance for the squared loss when the hypothesis set is the reproducing kernel Hilbert space induced by a universal kernel such as the Gaussian kernel. We give new pointwise loss guarantees based on the discrepancy of the empirical source and target distributions for the general class of kernel-based regularization algorithms. These bounds have a simpler form than previous results and hold for a broader class of convex loss functions, not necessarily differentiable, including L_q losses and the hinge loss. We also give finer bounds based on the discrepancy and a weighted feature discrepancy parameter. We extend the discrepancy minimization adaptation algorithm to the more significant case where kernels are used and show that the problem can be cast as an SDP similar to the one in the feature space. We also show that techniques from smooth optimization can be used to derive an efficient algorithm for solving such SDPs, even for very high-dimensional feature spaces and large samples. We have implemented this algorithm and report the results of experiments with both artificial and real-world data sets, demonstrating its benefits both for the general scenario of adaptation and for the more specific scenario of sample bias correction. Our results show that the algorithm scales to large data sets of tens of thousands of points or more and confirm its performance benefits. (C) 2013 Elsevier B.V. All rights reserved.
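For concreteness, the discrepancy referred to throughout is the quantity introduced by Mansour, Mohri, and Rostamizadeh: for a hypothesis set H and loss function L, the discrepancy between distributions P and Q is

    \mathrm{disc}(P, Q) = \max_{h, h' \in H} \left| \operatorname{E}_{x \sim P}\left[ L(h'(x), h(x)) \right] - \operatorname{E}_{x \sim Q}\left[ L(h'(x), h(x)) \right] \right|,

i.e. the largest change in expected loss between any pair of hypotheses when moving from P to Q.

In the feature-space case with the squared loss and norm-bounded linear hypotheses, the empirical discrepancy between a reweighted source sample and the uniform target sample reduces, up to a constant depending on the hypothesis norm bound, to the spectral norm of the difference of the two second-moment matrices, which is what makes discrepancy minimization expressible as an SDP. The following is a minimal sketch of that formulation using the generic convex solver cvxpy; the function name discrepancy_weights is illustrative, and this off-the-shelf formulation stands in for the paper's dedicated smooth-optimization algorithm, which is what actually scales to very large samples.

    import cvxpy as cp

    def discrepancy_weights(X_src, X_tgt):
        """Sketch: reweight source points to minimize empirical discrepancy.

        X_src: (m, d) NumPy array of source points; X_tgt: (n, d) array of
        target points. For the squared loss with norm-bounded linear
        hypotheses, the discrepancy between the z-reweighted source sample
        and the uniform target sample is, up to a constant, the spectral
        norm of the difference of their second-moment matrices; minimizing
        it over the probability simplex is a semidefinite program.
        """
        m, n = X_src.shape[0], X_tgt.shape[0]
        M_tgt = X_tgt.T @ X_tgt / n            # target second-moment matrix
        z = cp.Variable(m, nonneg=True)        # distribution over source points
        M_src = X_src.T @ cp.diag(z) @ X_src   # reweighted source moments
        # Minimize the spectral norm of the moment mismatch over the simplex.
        problem = cp.Problem(cp.Minimize(cp.sigma_max(M_src - M_tgt)),
                             [cp.sum(z) == 1])
        problem.solve()
        return z.value

The resulting weights can then be plugged into any weighted kernel-based regularization algorithm, such as weighted kernel ridge regression, in place of uniform empirical weights.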
Pages: 103-126
Page count: 24