Any convex and L-smooth loss can become a plug-and-play operator block. Modularity, out of the box.
Baillon and Haddad's theorem says that if a convex function's gradient is L-Lipschitz, then that gradient is also (1/L)-cocoercive (the two properties are in fact equivalent). In simpler terms, gradient differences don't just stay bounded by how far you move; they also align, quantifiably, with the direction you move in. This theorem bridges results about convex functions to results about monotone operators.
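For reference, the statement in symbols, with f convex and differentiable (notation mine):

```latex
% Baillon–Haddad for a convex, differentiable f (notation mine):
\[
\|\nabla f(x)-\nabla f(y)\| \;\le\; L\,\|x-y\|
\quad\Longleftrightarrow\quad
\langle \nabla f(x)-\nabla f(y),\, x-y\rangle \;\ge\; \tfrac{1}{L}\,\|\nabla f(x)-\nabla f(y)\|^{2}.
\]
% The right-hand side is nonnegative, so cocoercivity immediately gives
% monotonicity: <grad f(x) - grad f(y), x - y> >= 0.
```

Cauchy–Schwarz gives the right-to-left direction; convexity of f is what makes the hard, left-to-right direction work.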
Why it matters:
Baillon–Haddad turns the gradient of any L-smooth convex function into a drop-in cocoercive (hence monotone) operator.
Monotone blocks power splitting algorithms: proximal gradient, Condat–Vũ, Davis–Yin (see the sketch after this list).
No custom convergence proof needed for each new cost.
Inherit O(1/k) convergence rates.
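To make the plug-and-play point concrete, here is a minimal Python sketch (my own illustration, not taken from the slides): any L-smooth convex loss enters proximal gradient purely through a gradient oracle, and Baillon–Haddad is what justifies the constant step size.

```python
# Minimal sketch (illustration only, not from the slides): proximal gradient
# with the smooth convex loss plugged in as a black-box gradient oracle.
# Baillon–Haddad makes grad_f a (1/L)-cocoercive operator, so any step
# size in (0, 2/L) keeps forward-backward splitting convergent.
import numpy as np

def proximal_gradient(grad_f, prox_g, L, x0, n_iters=500):
    """Forward-backward splitting: x_{k+1} = prox_{t*g}(x_k - t*grad_f(x_k))."""
    t = 1.0 / L              # any t in (0, 2/L) works; 1/L gives the O(1/k) rate
    x = x0.copy()
    for _ in range(n_iters):
        x = prox_g(x - t * grad_f(x), t)
    return x

# Example cost: LASSO. Smooth part f(x) = 0.5*||Ax - b||^2 (L = ||A||_2^2),
# nonsmooth part g(x) = lam*||x||_1, whose prox is soft-thresholding.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
b = rng.standard_normal(50)
lam = 0.1

grad_f = lambda x: A.T @ (A @ x - b)                                   # gradient oracle
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0)  # soft threshold
L = np.linalg.norm(A, 2) ** 2                                          # spectral norm squared

x_hat = proximal_gradient(grad_f, prox_g, L, np.zeros(100))
```

Swapping in a different convex, L-smooth loss only changes grad_f and L; the convergence guarantee comes for free from cocoercivity.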
Slides for a quick lecture below 👇
Enjoy!