# Sparse Coding¶

The goal of these algorithms is to find a sparse coefficient matrix (or vector) for some signals given a dictionary of signal features.

The two norms are defined as:

$$\left \| \mathbf{x} \right \|_0 = \#\{\mathbf{x}_i \neq 0: \forall \mathbf{x}_i \in \mathbf{x}\}$$

$$\left \| \mathbf{x} \right \|_1 = \sum_i |x_i|$$

Thus the two problem formulations are:

$\hat{a} = \underset{a}{argmin} \quad \frac{1}{2} \left \| x - Da \right \| _F^2 + \lambda \left \| a \right \|_0$
$\hat{a} = \underset{a}{argmin} \quad \frac{1}{2} \left \| x - Da \right \| _F^2 + \lambda \left \| a \right \|_1$

With signal $$x$$, dictionary $$D$$, and $$\hat{a}$$ the sparse coefficients

## $$\ell_0$$-regularization¶

dictlearn.sparse.omp_batch(signals, dictionary, n_nonzero=10, tol=0, n_threads=1)

Batch Orthogonal Matching Pursuit. A more effective version than omp_cholesky if the number of signals is high. Saves time and calculations by doing some pre-computations. See  for details

>>> import dictlearn as dl
>>> image_patches = dl.Patches(image, 8)
>>> sparse_codes = dl.omp_batch(image_patches, dictionary, n_nonzero=4)
>>> sparse_approx = dictionary.dot(sparse_codes)

Parameters: signals – Signals to encode. numpy.ndarray shape (signal_size, n_signals) or dictlearn.preprocess.Patches dictionary – ndarray, shape (signal_size, n_atoms) n_nonzero – Default 10. Max number of nonzero coeffs for sparse codes tol – Default 0. Add nonzero coeffs until norm(signal - dict*sparse_code) < tol n_threads – Default 1. Number of threads to use. Sparse codes, shape (n_atoms, n_signals)
dictlearn.sparse.omp_cholesky(signals, dictionary, n_nonzero=10, tol=0, n_threads=1)

Cholesky Orthogonal Matching pursuits. Use omp_batch if many signals need to be sparse coded. See  for details

>>> import dictlearn as dl
>>> image_patches = dl.Patches(image, 8)
>>> sparse_codes = dl.omp_cholesky(image_patches, dictionary, n_nonzero=4)
>>> sparse_approx = dictionary.dot(sparse_codes)

Parameters: signals – Signals to sparse code, shape (signal_size,) or (signal_size, n_signals) dictionary – ndarray, shape (signal_size, n_atoms) n_nonzero – Default 10. Max number of nonzero coeffs for sparse codes tol – Default 0. Add nonzero coeffs until norm(signal - dict*sparse_code) < tol n_threads – Default 1. Number of threads to use. Sparse codes, shape (n_atoms, n_signals)
dictlearn.sparse.omp_mask(signals, masks, dictionary, n_nonzero=None, tol=1e-06, n_threads=1, verbose=False)
Orthogonal Matching Pursuit for masked data. Tries to reconstruct the full

set of signals, by ignoring data points signals[i, j] where mask[i, j] == 0.

>>> import dictlearn as dl
>>> # Create patches from broken_image and mask + get dictionary
>>> reconstructed_image_patches = dictionary.dot(sparse_codes)

Parameters: signals – Corrupted signals, shape (size, n_signals) masks – Masks, shape (size, n_signals). masks[:, i] is the mask for signals[:, i] dictionary – Trained dictionary, shape (size, n_atoms) n_nonzero – Default None. Max number of nonzero coeffs to use tol – Default 1e-6. Stop if signal approximation is within the accuracy. Overwrites n_nonzero n_threads – Not used verbose – Default False. Print progress Sparse approximation to signals, shape (n_atoms, n_signals)
dictlearn.sparse.iterative_hard_thresholding(signal, dictionary, iters, step_size, initial, n_nonzero=None, penalty=None)

If penalty make sure sqrt(2*penalty) is not bigger than all elements in initial_a, if that’s the case the solution is just the zero vector

Requires tuning of the hyper parameters step_size, initial_a and penalty. optimize.omp_cholesky may be a better choice.

If reg_param is supplied this solves:

min 0.5*||signal - D*alpha||_2^2 + reg_param||alpha||_0

If n_nonzero is supplied then the following is solved for alpha

min || signal - D*alpha||_2^2 such that ||alpha||_0 <= n_nonzero
Parameters: signal – Signal in R^m dictionary – Dictionary (D) in R^(m,p) iters – Number of iterations step_size – Step size for gradient descent step initial – Initial sparse coefficients. Need ||initial_a||_0 <= nonzero if n_nonzero not None n_nonzero – Sparsity target penalty – Penalty parameter Sparse decomposition of signal

## $$\ell_1$$-regularization¶

dictlearn.sparse.lars(signals, dictionary, n_nonzero=0, alpha=0, lars_params=None, **kwargs)

“Homotopy” algorithm for solving the Lasso

argmin 0.5*||X - DA||_2^2 + r*||A||_1

for all r.

This algorithm is supposedly the most accurate for l1 regularization.

This is terribly slow, and not very accurate. ~20x slower than OMP. Find this strange as OMP solves a NP-Hard problem and this a convex

Parameters: signals – Signals to encode. Shape (signal_size, n_signals) or (signal_size,) dictionary – Dictionary, shape (signal_size, n_atoms) n_nonzero – Number of nonzero coefficients to use alpha – Regularization parameter. Overwrites n_nonzero lars_params – See sklearn.linear_models.LassoLars docs kwargs – Not used. Just to make calling API for all regularization algorithms the same Sparse codes, shape (n_atoms, n_signals) or (n_atoms,)
dictlearn.sparse.lasso(signals, dictionary, alpha, lasso_params=None, **kwargs)
Parameters: signals – Signals to encode, shape (signal_size,) or (signal_size, n_signals) dictionary – Dictionary, shape (signal_size, n_atoms) alpha – Regularization parameter. <~0.9 yields more accurate results than OMP, but slower lasso_params – Other parameters. See sklearn.linear_model.Lasso kwargs – Not used, just for making calling compatible with other sparse coding methods Sparse codes, shape (n_atoms,) or (n_atoms, n_signals)
dictlearn.sparse.iterative_soft_thresholding(signal, dictionary, initial, reg_param=None, n_nonzero=None, step_size=0.1, iters=10)

l1 reg using iterative soft thresholding

if regularization parameter is given solve:

min_a 1/2 * || x - D*alpha||_2^2 + reg_param*||alpha||_1

if number of nonzero is given solve:

min_a ||signal - D*alpha||_2^2 such that ||alpha||_1 <= n_nonzero

This method requires tuning of the initial value for alpha, step_size and iters/res_param.

Parameters: signal – Signal, shape (signal_size, ) dictionary – Dictionary, shape (signal_size, n_atoms) initial – Initial sparse codes, shape (n_atoms, ) reg_param – Regularization parameter n_nonzero – Max number of nonzero coeffs step_size – Gradient descent step size iters – Number of iterations Sparse codes, shape (n_atoms, )
dictlearn.sparse.l1_ball(vector, target)
Create sparse approximation of ‘vector’ by projecting onto to l1-ball keeping at most ‘target’ coefficients active
Parameters: vector – Vector to project to sparse target – Max number of nonzero coefficients Sparse vector, same shape as ‘vector’