Abstract
Projection-free optimization via different variants of the Frank-Wolfe method has become one of the cornerstones of large-scale optimization for machine learning and computational statistics. Numerous applications within these fields involve the minimization of functions with self-concordance-like properties. Such generalized self-concordant functions do not necessarily feature a Lipschitz continuous gradient, nor are they strongly convex, making them a challenging class of functions for first-order methods. Indeed, in a number of applications, such as inverse covariance estimation or distance-weighted discrimination problems in binary classification, the loss is given by a generalized self-concordant function having potentially unbounded curvature. For such problems, projection-free minimization methods have had no theoretical convergence guarantees. This paper closes this apparent gap in the literature by developing provably convergent Frank-Wolfe algorithms with standard O(1/k) convergence rate guarantees. Based on these new insights, we show how these sublinearly convergent methods can be accelerated to yield linearly convergent projection-free methods, either by relying on the availability of a local linear minimization oracle or by a suitable modification of the away-step Frank-Wolfe method.
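For orientation, below is a minimal sketch of the classical Frank-Wolfe template the paper builds on: each iteration calls a linear minimization oracle instead of a projection, and the open-loop step size 2/(k+2) gives the standard O(1/k) rate under Lipschitz-gradient assumptions. The names `frank_wolfe`, `grad`, and `lmo` are illustrative; the paper's specific step-size rules for generalized self-concordant objectives differ from this textbook variant.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, num_iters=100):
    """Classical Frank-Wolfe with the open-loop step size 2/(k+2).

    grad: callable returning the gradient of f at x.
    lmo:  linear minimization oracle; returns argmin_{s in C} <g, s>.
    x0:   feasible starting point in the compact convex set C.
    """
    x = x0
    for k in range(num_iters):
        g = grad(x)
        s = lmo(g)                         # projection-free step: a linear problem over C
        gamma = 2.0 / (k + 2.0)            # open-loop step size behind the O(1/k) rate
        x = (1.0 - gamma) * x + gamma * s  # convex combination keeps x feasible
    return x

# Illustration: minimize f(x) = 0.5 * x^T A x over the probability simplex,
# where the LMO returns the vertex (basis vector) with the smallest gradient entry.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad = lambda x: A @ x
lmo = lambda g: np.eye(len(g))[np.argmin(g)]
x_star = frank_wolfe(grad, lmo, x0=np.array([0.5, 0.5]))
```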
| Original language | English |
| --- | --- |
| Pages (from-to) | 255-323 |
| Number of pages | 69 |
| Journal | Mathematical Programming |
| Volume | 198 |
| Issue number | 1 |
| Early online date | 29 Jan 2022 |
| DOIs | |
| Publication status | Published - Mar 2023 |
Keywords
- COMPLEXITY
- CONVERGENCE
- CONVEX
- Convex programming
- Frank-Wolfe algorithm
- GRADIENT-METHOD
- Generalized self-concordant functions