Learning to Optimize on Riemannian Manifolds

Cited by: 8
Authors
Gao, Zhi [1]
Wu, Yuwei [1,2]
Fan, Xiaomeng [1]
Harandi, Mehrtash [3,4]
Jia, Yunde [1,2]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Lab Intelligent Informat Technol, Beijing 100081, Peoples R China
[2] Shenzhen MSU BIT Univ, Guangdong Lab Machine Percept & Intelligent Comp, Shenzhen 518172, Guangdong, Peoples R China
[3] Monash Univ, Dept Elect & Comp Syst Eng, Melbourne, Vic 3800, Australia
[4] Data61 CSIRO, Clayton, Vic 3169, Australia
Keywords
Optimization; Manifolds; Training; Task analysis; Geometry; Trajectory; Stochastic processes; Riemannian optimization; meta-optimization; meta-learning; Riemannian manifolds; GEOMETRY; ALGORITHMS
DOI
10.1109/TPAMI.2022.3215702
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Many learning tasks are modeled as optimization problems with nonlinear constraints, such as principal component analysis and fitting a Gaussian mixture model. A popular way to solve such problems is to resort to Riemannian optimization algorithms, which, however, rely heavily on both human involvement and expert knowledge about Riemannian manifolds. In this paper, we propose a Riemannian meta-optimization method to automatically learn a Riemannian optimizer. We parameterize the Riemannian optimizer by a novel recurrent network and utilize Riemannian operations to ensure that our method is faithful to the geometry of manifolds. The proposed method explores the distribution of the underlying data by minimizing the objective evaluated at the updated parameters, and is hence capable of learning task-specific optimizers. We introduce a Riemannian implicit differentiation training scheme that is efficient in terms of both numerical stability and computational cost. Unlike conventional meta-optimization training schemes, which must differentiate through the whole optimization trajectory, our training scheme depends only on the final two optimization steps. In this way, it avoids the exploding gradient problem and significantly reduces the computational load and memory footprint. We report experimental results on various constrained problems, including principal component analysis on Grassmann manifolds; face recognition, person re-identification, and texture image classification on Stiefel manifolds; clustering and similarity learning on symmetric positive definite manifolds; and few-shot learning on hyperbolic manifolds.
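To make the pipeline in the abstract concrete, the Python/NumPy sketch below illustrates the geometric ingredients a learned Riemannian optimizer must respect on the Stiefel manifold St(n, p): project the Euclidean gradient onto the tangent space, produce an update from a learned rule, and retract back onto the manifold. This is a minimal sketch, not the authors' implementation: the recurrent network is replaced by a hypothetical coordinatewise update (learned_step) with stand-in parameters theta, and the task is a toy PCA objective; the tangent projection and QR retraction themselves are standard Stiefel operations.

import numpy as np

def project_tangent(X, G):
    # Project a Euclidean gradient G onto the tangent space of the
    # Stiefel manifold St(n, p) at X (X has orthonormal columns).
    sym = 0.5 * (X.T @ G + G.T @ X)
    return G - X @ sym

def retract(X, V):
    # QR-based retraction: map the tangent vector V at X back onto St(n, p).
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))  # sign fix keeps the retraction continuous

def learned_step(rgrad, state, theta):
    # Hypothetical stand-in for the paper's recurrent optimizer: an
    # exponentially averaged direction with learnable decay theta[0]
    # and log step size theta[1].
    state = theta[0] * state + (1.0 - theta[0]) * rgrad
    return -np.exp(theta[1]) * state, state

# Toy task: minimize f(X) = -trace(X^T A X) over St(n, p), i.e., PCA
# (the minimizer spans the top-p eigenvectors of A).
rng = np.random.default_rng(0)
n, p = 8, 2
A = rng.standard_normal((n, n))
A = A @ A.T                              # symmetric PSD "covariance"
X, _ = np.linalg.qr(rng.standard_normal((n, p)))

state = np.zeros((n, p))
theta = np.array([0.9, np.log(1e-2)])    # meta-trained in the paper; fixed here
for _ in range(300):
    egrad = -2.0 * A @ X                 # Euclidean gradient of f
    rgrad = project_tangent(X, egrad)    # Riemannian gradient
    step, state = learned_step(rgrad, state, theta)
    X = retract(X, step)
    # A faithful implementation would also parallel-transport `state`
    # to the tangent space at the new X; omitted here for brevity.

print("captured variance:", np.trace(X.T @ A @ X))

In the paper, the role of theta is played by the weights of the recurrent network, and those weights are meta-trained with the Riemannian implicit differentiation scheme, which differentiates only through the final two optimization steps rather than the whole trajectory.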
Pages: 5935–5952 (18 pages)