Analysis of Learning from Positive and Unlabeled Data
Authors:
du Plessis, Marthinus C. [1]
Niu, Gang [2]
Sugiyama, Masashi [1]
Affiliations:
[1] Univ Tokyo, Tokyo 1130033, Japan
[2] Baidu Inc, Beijing 100085, Peoples R China
Source:
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, Volume 27
Keywords:
ALGORITHM
DOI:
Not available
Chinese Library Classification (CLC) number:
TP18 [Artificial Intelligence Theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Learning a classifier from positive and unlabeled data is an important class of classification problems that arises in many practical applications. In this paper, we first show that this problem can be solved by cost-sensitive learning between positive and unlabeled data. We then show that convex surrogate loss functions such as the hinge loss may lead to a wrong classification boundary due to an intrinsic bias, but that the problem can be avoided by using non-convex loss functions such as the ramp loss. We next analyze the excess risk when the class prior is estimated from data, and show that the classification accuracy is not sensitive to class prior estimation if the unlabeled data is dominated by the positive data (this is naturally satisfied in inlier-based outlier detection because inliers are dominant in the unlabeled dataset). Finally, we provide generalization error bounds and show that, for an equal number of labeled and unlabeled samples, the generalization error of learning only from positive and unlabeled samples is no worse than 2√2 times that of the fully supervised case. These theoretical findings are also validated through experiments.
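The contrast the abstract draws between the hinge loss and the ramp loss can be illustrated with a small numerical sketch. The abstract itself gives no formulas, so everything below is an assumption based on the usual cost-sensitive formulation of this problem: the risk is taken as 2π·E_P[ℓ(g(X))] + E_U[ℓ(−g(X))] − π, the losses are scaled by 1/2, and the function names, scores and class prior are hypothetical toy values. The unlabeled set is built as the exact union of the positive and negative samples, so any gap between the two risks comes from the loss alone, not from sampling noise.

```python
# Minimal sketch (assumed formulation, toy data): compare a positive-unlabeled
# (PU) risk with the ordinary supervised (PN) risk under different losses.

def ramp_loss(z):
    """Scaled ramp loss 0.5 * max(0, min(2, 1 - z)); satisfies l(z) + l(-z) = 1."""
    return 0.5 * max(0.0, min(2.0, 1.0 - z))


def hinge_loss(z):
    """Scaled hinge loss 0.5 * max(0, 1 - z); l(z) + l(-z) exceeds 1 when |z| > 1."""
    return 0.5 * max(0.0, 1.0 - z)


def zero_one_loss(z):
    """Zero-one loss on the margin z (scores equal to 0 are avoided in the toy data)."""
    return 1.0 if z <= 0 else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pu_risk(scores_pos, scores_unl, prior, loss):
    """Assumed cost-sensitive PU risk: 2*pi*E_P[l(g)] + E_U[l(-g)] - pi."""
    risk_pos = mean(loss(s) for s in scores_pos)
    risk_unl = mean(loss(-s) for s in scores_unl)
    return 2.0 * prior * risk_pos + risk_unl - prior


def pn_risk(scores_pos, scores_neg, prior, loss):
    """Ordinary supervised risk: pi*E_P[l(g)] + (1 - pi)*E_N[l(-g)]."""
    risk_pos = mean(loss(s) for s in scores_pos)
    risk_neg = mean(loss(-s) for s in scores_neg)
    return prior * risk_pos + (1.0 - prior) * risk_neg


# Hypothetical classifier scores g(x) for 4 positive and 6 negative points.
scores_pos = [2.0, 0.5, -1.5, 3.0]
scores_neg = [-2.0, 0.7, -0.3, -4.0, 1.8, -0.2]
scores_unl = scores_pos + scores_neg              # unlabeled = union of both classes
prior = len(scores_pos) / len(scores_unl)         # class prior pi, known exactly here

for name, loss in [("zero-one", zero_one_loss), ("ramp", ramp_loss), ("hinge", hinge_loss)]:
    gap = pu_risk(scores_pos, scores_unl, prior, loss) - pn_risk(scores_pos, scores_neg, prior, loss)
    print(f"{name:8s} PU risk minus PN risk = {gap:.4f}")
```

Under these assumptions the zero-one and ramp losses give a PU risk that matches the supervised risk, because both satisfy ℓ(z) + ℓ(−z) = 1, while the hinge loss leaves a positive gap. That gap is one way to read the "intrinsic bias" the abstract attributes to convex surrogates; the paper's own analysis may be stated differently.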