Most of the matrix is unseen. Can you restore it faithfully? How close will you get to the truth? This problem shows up all over:
🔹 Recommendation systems
🔹 Spam-filtering pipelines
🔹 Image inpainting
🔹 Video-surveillance background modeling
🔹 Link prediction in social networks
🔹 Seismic-survey interpolation
💡 Key Idea
Assume the full matrix is low-rank, i.e. it has only a few underlying patterns. This drastically reduces the number of unknowns. Ideally, we'd find the matrix X of smallest rank that agrees with the observations. Unfortunately, rank minimization is intractable in general because the rank function is nonconvex. In many cases, though, it suffices to solve the simplest convex relaxation of rank minimization: replace the rank with the nuclear norm.
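To see why the low-rank assumption helps, here is a small numpy sketch (sizes and the random seed are arbitrary choices for illustration): a rank-r m×n matrix is determined by two thin factors, so it has roughly r(m + n - r) free parameters rather than mn.

```python
import numpy as np

# A rank-r matrix X = A @ B.T is determined by its two thin factors,
# so it has r*(m + n - r) degrees of freedom instead of m*n unknowns.
rng = np.random.default_rng(0)
m, n, r = 100, 100, 3               # illustrative sizes, not from the post
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))
X = A @ B.T                          # a full 100 x 100 matrix, rank at most 3

rank = np.linalg.matrix_rank(X)
dof = r * (m + n - r)                # parameter count of the rank-r manifold
print(rank, dof, m * n)              # rank 3; 591 parameters vs 10,000 entries
```

With only ~591 effective unknowns, observing a few thousand of the 10,000 entries can already pin the matrix down.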
🧩 Matrix Completion Model
For a partially observed matrix M, with observed entries indexed by a set Ω, estimate the entire matrix via the solution X of the problem:
minimize the nuclear norm of X (the sum of its singular values)
such that
X_ij = M_ij for all observed entries (i, j) in Ω
In very general settings, one can perfectly recover all of the missing entries from most sufficiently large subsets by solving a convex programming problem that finds the matrix with the minimum nuclear norm agreeing with the observed entries.
– Candès & Recht
🎯 Practical benefits
- Clean, tractable formulation
- Scales to millions of entries
- No manual model-complexity tuning
📚 Further Reading
- Candès & Recht. Exact Matrix Completion via Convex Optimization. 2012
- Candès & Tao. The Power of Convex Relaxation: Near-Optimal Matrix Completion. 2010
- Candès & Plan. Matrix Completion With Noise. 2010
- Recht. A Simpler Approach to Matrix Completion. 2011
p.s. The diagram above shows partial observations of a 5×5 matrix M, which has rank 3. The full matrix is shown as X*.
p.p.s. Ben Recht, referenced above, also has a newsletter.