Adaptive Greedy Dictionary Selection for Web Media Summarization

被引:49
作者
Cong, Yang [1 ,2 ]
Liu, Ji [2 ]
Sun, Gan [1 ]
You, Quanzeng [2 ]
Li, Yuncheng [2 ]
Luo, Jiebo [2 ]
机构
[1] Chinese Acad Sci, Shenyang Inst Automat, State Key Lab Robot, Shenyang 110016, Peoples R China
[2] Univ Rochester, Dept Comp Sci, Rochester, NY 14611 USA
关键词
Sparse representation; l(0) norm; dictionary learning; dictionary selection; forward-backward; greedy method; K-SVD; DANTZIG SELECTOR; SPARSE; ALGORITHM; FRAMEWORK; VIDEOS;
D O I
10.1109/TIP.2016.2619260
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Initializing an effective dictionary is an indispensable step for sparse representation. In this paper, we focus on the dictionary selection problem with the objective to select a compact subset of basis from original training data instead of learning a new dictionary matrix as dictionary learning models do. We first design a new dictionary selection model via l(2,0) norm. For model optimization, we propose two methods: one is the standard forward-backward greedy algorithm, which is not suitable for large-scale problems; the other is based on the gradient cues at each forward iteration and speeds up the process dramatically. In comparison with the state-of-the-art dictionary selection models, our model is not only more effective and efficient, but also can control the sparsity. To evaluate the performance of our new model, we select two practical web media summarization problems: 1) we build a new data set consisting of around 500 users, 3000 albums, and 1 million images, and achieve effective assisted albuming based on our model and 2) by formulating the video summarization problem as a dictionary selection issue, we employ our model to extract keyframes from a video sequence in a more flexible way. Generally, our model outperforms the state-of-the-art methods in both these two tasks.
引用
收藏
页码:185 / 195
页数:11
相关论文
共 54 条
[1]   K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation [J].
Aharon, Michal ;
Elad, Michael ;
Bruckstein, Alfred .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2006, 54 (11) :4311-4322
[2]   Video summaries and cross-referencing through mosaic-based representation [J].
Aner-Wolf, A ;
Kender, JR .
COMPUTER VISION AND IMAGE UNDERSTANDING, 2004, 95 (02) :201-237
[3]  
[Anonymous], 2014, 2014 IEEE INT C MULT
[4]  
[Anonymous], 2009, Advances in Neural Information Processing Systems
[5]  
[Anonymous], 2011, P IEEE IICPE C JAN I
[6]  
[Anonymous], 2010, Proceedings of the 27th International Conference on International Conference on Machine Learning
[7]  
[Anonymous], 2010, P 27 INT C INT C MAC
[8]   Robust uncertainty principles:: Exact signal reconstruction from highly incomplete frequency information [J].
Candès, EJ ;
Romberg, J ;
Tao, T .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2006, 52 (02) :489-509
[9]  
Candes E, 2007, ANN STAT, V35, P2313, DOI 10.1214/009053606000001523
[10]   Enhancing Sparsity by Reweighted l1 Minimization [J].
Candes, Emmanuel J. ;
Wakin, Michael B. ;
Boyd, Stephen P. .
JOURNAL OF FOURIER ANALYSIS AND APPLICATIONS, 2008, 14 (5-6) :877-905