One-Class Collaborative Filtering

Cited by: 678
Authors
Pan, Rong [1]
Zhou, Yunhong [2]
Cao, Bin [3]
Liu, Nathan N. [3]
Lukose, Rajan [1]
Scholz, Martin [1]
Yang, Qiang [3]
Affiliations
[1] HP Labs, 1501 Page Mill Rd, Palo Alto, CA 94304 USA
[2] Rocket Fuel Inc, Redwood Shores, CA 94065 USA
[3] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China
Source
ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2008
DOI
10.1109/ICDM.2008.16
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many applications of collaborative filtering (CF), such as news item recommendation and bookmark recommendation, are most naturally thought of as one-class collaborative filtering (OCCF) problems. In these problems, the training data usually consist simply of binary data reflecting a user's action or inaction, such as page visitation in the case of news item recommendation or webpage bookmarking in the bookmarking scenario. Such data are usually extremely sparse (only a small fraction are positive examples), so ambiguity arises in the interpretation of the non-positive examples. Negative examples and unlabeled positive examples are mixed together, and we are typically unable to distinguish them. For example, when a user has not bookmarked a page, we cannot tell whether this reflects a lack of interest or a lack of awareness of the page. Previous research addressing this one-class problem considered it only as a classification task. In this paper, we consider the one-class problem under the CF setting. We propose two frameworks to tackle OCCF: one is based on weighted low-rank approximation, and the other is based on negative example sampling. The experimental results show that our approaches significantly outperform the baselines.
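The weighted low-rank approximation idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme, the optimizer, or any hyperparameters, so everything below is an assumption made for illustration. Here every missing entry is treated as a negative with a lower confidence weight (0.2, chosen arbitrarily), positives get weight 1.0, and the factors are fit by plain stochastic gradient descent. The names `fit_weighted_mf`, `weighted_loss`, `init_factors`, and `predict` are hypothetical.

```python
import random


def predict(U, V, i, j):
    """Predicted preference of user i for item j: dot product of their factors."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def weighted_loss(R, W, U, V):
    """Weighted squared reconstruction error: sum_ij W_ij * (R_ij - U_i . V_j)^2."""
    return sum(
        W[i][j] * (R[i][j] - predict(U, V, i, j)) ** 2
        for i in range(len(R))
        for j in range(len(R[0]))
    )


def init_factors(m, n, rank, seed=0):
    """Small random user (m x rank) and item (n x rank) factor matrices."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(m)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    return U, V


def fit_weighted_mf(R, W, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Fit R ~ U V^T by SGD on the weighted squared error with L2 shrinkage.

    The optimizer and all hyperparameters are illustrative assumptions.
    """
    U, V = init_factors(len(R), len(R[0]), rank, seed)
    for _ in range(epochs):
        for i in range(len(R)):
            for j in range(len(R[0])):
                # Weighted residual; low-weight cells pull the fit less.
                err = W[i][j] * (R[i][j] - predict(U, V, i, j))
                for k in range(len(U[i])):
                    u, v = U[i][k], V[j][k]
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V


# Toy binary interaction matrix: 1 = observed positive, 0 = missing/unknown.
R = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
# Assumed uniform weighting: full confidence in positives, low in missing cells.
W = [[1.0 if r == 1 else 0.2 for r in row] for row in R]

U, V = fit_weighted_mf(R, W)
print("weighted loss after fitting:", round(weighted_loss(R, W, U, V), 4))
```

Missing entries are kept in the objective rather than dropped, but with a smaller weight, so the model is discouraged from predicting 1 everywhere without treating every unobserved pair as a confirmed negative. Scores from `predict` for the zero cells can then be used to rank candidate items for each user.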
Pages: 502 / +
Number of pages: 3