Mixture distribution modeling for scalable graph-based semi-supervised learning

Cited by: 7
Authors
Li, Zhi [1]
Li, Chaozhuo [1]
Yang, Liqun [1]
Yu, Philip S. [2]
Li, Zhoujun [1]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Key Research and Development Program of China;
Keywords
Semi-supervised Learning; Graph-based Learning; Mixture Distribution Modeling; REGULARIZATION; EFFICIENT;
DOI
10.1016/j.knosys.2020.105974
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph-based semi-supervised learning (SSL) has been widely investigated in recent work because it naturally incorporates diverse types of information and measurements. However, traditional graph-based SSL methods have cubic complexity, which leads to poor scalability. In this paper, we propose to perform graph-based SSL on mixture distribution components, an approach named Mixture-distribution based Graph Smoothing (MGS), to address this challenge. Specifically, the intrinsic distributions of the data are captured by a mixture density estimation model. A novel mixture-distribution based objective energy function is further proposed to incorporate the few available annotations, which ensures that the model complexity is independent of the number of raw instances. The energy function can be simplified and solved effectively by viewing the instances and mixture components as point clouds. Experiments on large datasets demonstrate the remarkable performance improvements and scalability of the proposed model, which supports the superiority of the MGS model. © 2020 Elsevier B.V. All rights reserved.
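The general idea in the abstract, smoothing labels over a small set of mixture components instead of over all raw instances, can be illustrated with a minimal sketch. This is not the MGS energy function, which the abstract does not spell out. As stand-ins, it assumes an isotropic Gaussian mixture fitted by EM and a standard Laplacian-regularized smoothing step on a component graph. All function names and parameters (`farthest_point_init`, `fit_isotropic_gmm`, `smooth_component_labels`, `predict`, `lam`) are hypothetical. Unlabeled instances are marked with `-1` in `y`.

```python
import numpy as np

def farthest_point_init(X, k):
    # Deterministic init: start at X[0], then repeatedly pick the point
    # farthest from the means chosen so far.
    idx = [0]
    dist = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        idx.append(nxt)
        dist = np.minimum(dist, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def e_step(X, means, var, weights):
    # Responsibilities of each component for each instance, shape (n, k).
    d = X.shape[1]
    sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_p = np.log(weights) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * sq / var
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_p)
    return resp / resp.sum(axis=1, keepdims=True)

def fit_isotropic_gmm(X, k, n_iter=50):
    # Assumption: isotropic components, chosen only to keep the sketch short.
    n, d = X.shape
    means = farthest_point_init(X, k)
    var = np.full(k, X.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = e_step(X, means, var, weights)
        nk = resp.sum(axis=0) + 1e-12
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (resp * sq).sum(axis=0) / (d * nk) + 1e-6
        weights = nk / n
    return means, e_step(X, means, var, weights)

def smooth_component_labels(means, resp, y, n_classes, lam=1.0):
    # Push the few instance labels onto components via responsibilities.
    labeled = y >= 0
    seeds = resp[labeled].T @ np.eye(n_classes)[y[labeled]]  # (k, n_classes)
    # Gaussian-kernel graph over component means (a k x k matrix, not n x n).
    d2 = ((means[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / np.median(d2[d2 > 0]))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    # Minimize ||F - seeds||^2 + lam * tr(F^T L F); the solve is k x k.
    return np.linalg.solve(np.eye(len(means)) + lam * L, seeds)

def predict(X, y, k, n_classes):
    means, resp = fit_isotropic_gmm(X, k)
    F = smooth_component_labels(means, resp, y, n_classes)
    # Map smoothed component scores back to instances.
    return (resp @ F).argmax(axis=1)
```

In this sketch the linear system has size k (the number of components), so the cost of the smoothing step does not grow with the number of raw instances; only the mixture fitting and the final mapping touch all instances.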
Pages: 18