Abstract
Projection-free optimization via different variants of the Frank-Wolfe method has become one of the cornerstones of large-scale optimization for machine learning and computational statistics. Numerous applications within these fields involve the minimization of functions with self-concordance-like properties. Such generalized self-concordant functions do not necessarily feature a Lipschitz continuous gradient, nor are they strongly convex, making them a challenging class of functions for first-order methods. Indeed, in a number of applications, such as inverse covariance estimation or distance-weighted discrimination problems in binary classification, the loss is given by a generalized self-concordant function with potentially unbounded curvature. For such problems, projection-free minimization methods have no theoretical convergence guarantee. This paper closes this apparent gap in the literature by developing provably convergent Frank-Wolfe algorithms with standard O(1/k) convergence rate guarantees. Based on these new insights, we show how these sublinearly convergent methods can be accelerated to yield linearly convergent projection-free methods, by relying either on the availability of a local linear minimization oracle or on a suitable modification of the away-step Frank-Wolfe method.
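To make the projection-free mechanism concrete, the sketch below shows a vanilla Frank-Wolfe iteration in Python: each step calls a linear minimization oracle over the feasible set instead of computing a projection, and the classical open-loop step size 2/(k+2) gives the standard O(1/k) rate for smooth objectives. This is an illustrative sketch, not the paper's method; the names `frank_wolfe`, `grad`, and `lmo_simplex` are hypothetical, and for generalized self-concordant objectives the paper develops adaptive step-size rules that account for possibly unbounded curvature, which this sketch does not implement.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, max_iter=200):
    """Vanilla Frank-Wolfe sketch (illustrative; not the paper's adaptive variant).

    grad -- callable returning the gradient of the objective at x
    lmo  -- linear minimization oracle: returns argmin over s in C of <g, s>
    x0   -- feasible starting point in C
    """
    x = x0
    for k in range(max_iter):
        g = grad(x)
        s = lmo(g)                         # projection-free: solve a linear problem over C
        gamma = 2.0 / (k + 2.0)            # classical open-loop step size, gives O(1/k)
        x = (1.0 - gamma) * x + gamma * s  # convex combination keeps x feasible
    return x

# Toy usage: minimize ||x - b||^2 over the probability simplex.
b = np.array([0.2, 0.9, -0.1])

def lmo_simplex(g):
    # A linear function attains its minimum over the simplex at a vertex:
    # the indicator vector of the smallest gradient coordinate.
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x_star = frank_wolfe(lambda x: 2.0 * (x - b), lmo_simplex, x0=np.ones(3) / 3)
```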
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 255-323 |
| Number of pages | 69 |
| Journal | Mathematical Programming |
| Volume | 198 |
| Issue number | 1 |
| Early online date | 29 Jan 2022 |
| DOIs | |
| Publication status | Published - Mar 2023 |
Keywords
- Complexity
- Convergence
- Convex
- Convex programming
- Frank-Wolfe algorithm
- Generalized self-concordant functions
- Gradient method