# Passing Thoughts From Papers

### Currently Reading, or trying to read, or should read

- mezard & montanari - information, physics, and computation
- things on topological data analysis: understanding “the shape of information”
- Overcoming intractability in machine learning http://www.cs.princeton.edu/courses/archive/spring15/cos598D/
- Bubeck’s approach to mathematical optimization; also his mini-course on bandit theory
- the new jim crow which appears to be devastatingly on point
- better facility with convex analysis, bregman divergences, etc.
- mathematical statistics
- primal-dual online algorithms
- contextual bandits

## concepts with which i am woefully unfamiliar and need to become more familiar

- interior point methods; slater conditions
- topology lol
- algebra lol
- real analysis

## Recht - minimum rank approximations for matrix completion

Minimization of the l1 norm is a well-known heuristic for cardinality minimization [e.g. sparsity] => compressed sensing, whenever the sensing matrix has *basis incoherence* properties, e.g. when it is chosen according to specific random ‘ensembles’ (?)

In fact the l1 heuristic is a special case of the nuclear norm heuristic (for a diagonal matrix, the nuclear norm is the l1 norm of the diagonal). Parallels between the nuclear norm heuristic and l1: a restricted isometry property, which implies that the nuclear norm heuristic is guaranteed to provide the minimum rank solution.
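A quick numerical illustration of my own (not from the paper): for a diagonal matrix, the nuclear norm (sum of singular values) coincides with the l1 norm of the diagonal, which is exactly the sense in which l1 is the special case.

```python
import numpy as np

# For X = diag(x), the singular values of X are |x_i|, so the nuclear norm
# (sum of singular values) equals the l1 norm of x.
x = np.array([3.0, -1.0, 0.0, 2.0])
X = np.diag(x)

nuclear_norm = np.linalg.norm(X, ord='nuc')  # sum of singular values
l1_norm = np.linalg.norm(x, 1)               # sum of |x_i|

assert np.isclose(nuclear_norm, l1_norm)     # both equal 6.0
```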

General LMIs can be expressed as rank constraints on a block matrix: the rank of a block symmetric matrix equals the rank of a diagonal block plus the rank of its Schur complement.
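A toy check of that block-matrix fact (my own construction, not code from the paper); here A, B are arbitrary and the Schur complement is prescribed to have rank 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# For symmetric M = [[A, B], [B.T, C]] with A invertible:
# rank(M) = rank(A) + rank(C - B.T A^{-1} B)   (the Schur complement of A).
R = rng.standard_normal((3, 3))
A = R @ R.T + np.eye(3)                    # symmetric positive definite
B = rng.standard_normal((3, 2))
S = np.diag([1.0, 0.0])                    # prescribe a rank-1 Schur complement
C = B.T @ np.linalg.solve(A, B) + S

M = np.block([[A, B], [B.T, C]])
schur = C - B.T @ np.linalg.solve(A, B)

rank_M = np.linalg.matrix_rank(M, tol=1e-8)
rank_sum = np.linalg.matrix_rank(A) + np.linalg.matrix_rank(schur, tol=1e-8)
assert rank_M == rank_sum == 4             # 4 = 3 + 1
```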

Insofar as sparsity optimization problems in compressed sensing based on cardinality can be expressed via linear programming (l1 norm minimization), convex optimization (specifically SDPs) can be used for matrix completion.

**operator norm and nuclear norm of a matrix can be cast as SDPs**
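Concretely, my transcription of the standard SDP characterizations (as in the paper's background section):

$$\Vert X \Vert \leq t \iff \begin{bmatrix} t I & X \\ X^T & t I \end{bmatrix} \succeq 0,$$

$$\Vert X \Vert_* = \min_{W_1, W_2} \; \tfrac{1}{2}\left(\mathrm{Tr}\, W_1 + \mathrm{Tr}\, W_2\right) \quad \text{s.t.} \quad \begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0.$$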

Recall the relationships between the Frobenius, operator, and nuclear norms. For a matrix $X$ of rank $r$ they are related by the following inequalities:

$$\Vert X \Vert \leq \Vert X \Vert_F \leq \Vert X \Vert_* \leq \sqrt{r}\,\Vert X \Vert_F \leq r\,\Vert X \Vert.$$

In particular this chain of inequalities shows that $\Vert X \Vert_* \leq \mathrm{rank}(X)\,\Vert X \Vert$, so the nuclear norm is a lower bound of the rank function on the unit ball in the operator norm.
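A numerical sanity check of the norm chain (my own, not the paper's) on a random rank-3 matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# For rank-r X:  ||X||_2 <= ||X||_F <= ||X||_* <= sqrt(r) ||X||_F <= r ||X||_2
r = 3
X = rng.standard_normal((6, r)) @ rng.standard_normal((r, 5))  # rank-3 by construction

op = np.linalg.norm(X, 2)       # operator norm: largest singular value
fro = np.linalg.norm(X, 'fro')  # Frobenius norm
nuc = np.linalg.norm(X, 'nuc')  # nuclear norm: sum of singular values

assert op <= fro <= nuc <= np.sqrt(r) * fro <= r * op
```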

If the row spaces and column spaces of the two matrices A, B are orthogonal, then [sufficient condition] the nuclear norm is additive: $\Vert A + B \Vert_* = \Vert A \Vert_* + \Vert B \Vert_*$. The proof follows by partitioning the SVDs of A and B => infer a valid SVD of A + B whose singular values are the union (with repetition) of the singular values of A and B.
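A toy check of additivity (mine): placing A and B in disjoint diagonal blocks forces their row and column spaces to be orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)

# A lives in the top-left block, B in the bottom-right, so their row and
# column spaces are orthogonal and the nuclear norm should be additive.
A = np.zeros((6, 6))
B = np.zeros((6, 6))
A[:3, :3] = rng.standard_normal((3, 3))
B[3:, 3:] = rng.standard_normal((3, 3))

nuc = lambda M: np.linalg.norm(M, 'nuc')
assert np.isclose(nuc(A + B), nuc(A) + nuc(B))
```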

The main section of interest is “Restricted Isometry and Recovery of Low-Rank Matrices” – consider parameters determining the behavior of a linear map $\mathcal{A}$ restricted to matrices of rank r. In particular this generalizes the restricted isometry property from vectors [Candes, Tao] to matrices.

***Restricted Isometry Property, def 3.1.*** for every integer r with $1 \leq r \leq m$, define the r-restricted isometry constant $\delta_r(\mathcal{A})$ to be the smallest number such that
$$(1 - \delta_r(\mathcal{A})) \Vert X \Vert_F \leq \Vert \mathcal{A}(X) \Vert \leq (1 + \delta_r(\mathcal{A})) \Vert X \Vert_F$$
holds for all matrices X of rank at most r.
i.e. the Euclidean norm of the linear map applied to any matrix of rank r or lower is sandwiched within a multiplicative factor $1 \pm \delta_r$ of that matrix's Frobenius norm.
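An empirical illustration (not from the paper): a Gaussian measurement map with suitably scaled entries nearly preserves the Frobenius norm of low-rank matrices, which is the RIP behavior the definition captures. The dimensions here are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(3)

# A(X) = M @ vec(X) with entries N(0, 1/p); then E ||A(X)||^2 = ||X||_F^2,
# and for low-rank X the ratio ||A(X)|| / ||X||_F concentrates near 1.
n, p = 20, 1500                     # 20x20 matrices, 1500 measurements
M = rng.standard_normal((p, n * n)) / np.sqrt(p)

ratios = []
for _ in range(50):
    X = rng.standard_normal((n, 2)) @ rng.standard_normal((2, n))  # rank <= 2
    ratios.append(np.linalg.norm(M @ X.ravel()) / np.linalg.norm(X, 'fro'))

delta = max(abs(r - 1.0) for r in ratios)   # crude empirical lower bound on delta_2
assert delta < 0.5
```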

The set of matrices X for which the restricted isometry property must hold is a generalized Stiefel manifold [an algebraic variety]. Many ensembles of random linear maps satisfy RIP with small $\delta_r$ w.h.p.

Main recovery theorem: requires a technical lemma: for any two matrices A, B, one can decompose $B = B_1 + B_2$ where the rank of $B_1$ is not too large (at most twice the rank of A) and $B_2$ satisfies additivity of the nuclear norm with A; the construction follows from the SVD of A.
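A sketch of that decomposition (my reconstruction of the SVD-based proof idea, so the details are mine): project B off the column and row spaces of A to get $B_2$, and let $B_1$ absorb the rest.

```python
import numpy as np

rng = np.random.default_rng(4)

# Given A of rank r and any B, write B = B1 + B2 with rank(B1) <= 2r and
# with B2's row/column spaces orthogonal to A's, so the nuclear norm is
# additive on A + B2.
m, n, r = 8, 7, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
B = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)
Pu = U[:, :r] @ U[:, :r].T     # projector onto the column space of A
Pv = Vt[:r].T @ Vt[:r]         # projector onto the row space of A

B2 = (np.eye(m) - Pu) @ B @ (np.eye(n) - Pv)
B1 = B - B2                    # B1 = Pu @ B + (I - Pu) @ B @ Pv, rank <= 2r

nuc = lambda M: np.linalg.norm(M, 'nuc')
assert np.linalg.matrix_rank(B1) <= 2 * r
assert np.isclose(nuc(A + B2), nuc(A) + nuc(B2))
```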

### Algorithms for solving

- Can solve exactly with interior point methods, but these require a lot of memory for large problems.
- Projected subgradient method - standard [sub]gradient descent on the nuclear norm objective.
- Low-rank parametrization [e.g. matrix completion]: write $X = L R^T$; the constraint is no longer convex because the factors get multiplied together.
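The low-rank parametrization idea can be sketched in a few lines; this is a minimal alternating-least-squares toy of mine (the paper doesn't prescribe this exact scheme), fitting the factors on the observed entries only:

```python
import numpy as np

rng = np.random.default_rng(5)

# Ground truth: an exactly rank-2 matrix, ~50% of entries observed.
n, r = 30, 2
X_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5

# Non-convex parametrization X = L @ R.T; alternate exact least-squares
# updates of L (rows) and R (rows) on the observed entries.
L = rng.standard_normal((n, r))
R = rng.standard_normal((n, r))
for _ in range(50):
    for i in range(n):
        obs = mask[i]
        L[i] = np.linalg.lstsq(R[obs], X_true[i, obs], rcond=None)[0]
    for j in range(n):
        obs = mask[:, j]
        R[j] = np.linalg.lstsq(L[obs], X_true[obs, j], rcond=None)[0]

# Relative error on the observed entries.
err = np.linalg.norm(mask * (L @ R.T - X_true)) / np.linalg.norm(mask * X_true)
assert err < 1e-2
```

Each factor update is a convex least-squares problem, but the joint problem is non-convex, which is exactly the trade-off noted above.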