# Optimize¶

Optimization methods for learning dictionaries.

## Standard algorithms¶

Algorithms for training on complete data (i.e. when you don't need to mask your data). These algorithms cover most use cases.

dictlearn.optimize.ksvd(signals, dictionary, iters, n_nonzero=0, omp_tol=0, tol=0, verbose=False, n_threads=1, retcodes=False)

Iterative batch algorithm [1, 2] for fitting a dictionary D to a set of signals X. Each iteration consists of two stages:

1. Fix D. Find sparse codes A such that X is approximately equal to DA.
2. Fix the sparse codes. Find D_new such that norm(X - D_new*A) < norm(X - D*A).
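
Stage 2 can be sketched in plain NumPy: each atom is updated together with its coefficients using the leading singular pair of the residual, restricted to the signals that actually use the atom. This is a simplified sketch of the K-SVD update on synthetic data, not dictlearn's implementation:

```python
import numpy as np

def ksvd_dict_update(X, D, A):
    """One pass of K-SVD stage 2: update each atom (and the codes that
    use it) via the best rank-1 approximation of the residual."""
    for k in range(D.shape[1]):
        using = np.nonzero(A[k])[0]            # signals that use atom k
        if using.size == 0:
            continue
        A[k, using] = 0                        # remove atom k's contribution
        E = X[:, using] - D @ A[:, using]      # residual without atom k
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]                      # new atom: leading left singular vector
        A[k, using] = s[0] * Vt[0]             # matching coefficients
    return D, A

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 200))
D = rng.standard_normal((64, 32))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
A = rng.standard_normal((32, 200)) * (rng.random((32, 200)) < 0.1)
err_before = np.linalg.norm(X - D @ A)
D, A = ksvd_dict_update(X, D, A)
err_after = np.linalg.norm(X - D @ A)          # never larger than err_before
```

Because each rank-1 update is optimal for its atom, the fit error is non-increasing across the sweep.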

At least one of n_nonzero and omp_tol must be nonzero. If n_nonzero > 0 and omp_tol == 0, KSVD finds an approximate solution to:

$\underset{\mathbf{D, A}}{min}\quad \left \| \mathbf{X - DA} \right \|_F^2 \text{ such that } \left \| \mathbf{A} \right \|_0 \leq n\_nonzero$

If omp_tol > 0, KSVD instead finds an approximate solution to:

$\underset{\mathbf{D, A}}{argmin}\quad\left \| \mathbf{A} \right \|_0 \text{ such that } \left \| \mathbf{X - DA} \right \|_F^2 \leq omp\_tol$

Example, assuming `image` holds vectorized 8×8 patches as columns, shape (64, n_patches):

>>> import dictlearn as dl
>>> from numpy import linalg as LA
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> sparse_1 = dl.omp_batch(image, dictionary, 10)
>>> new_dict, _ = dl.ksvd(image, dictionary, 20, 10)
>>> err_initial = LA.norm(image - dictionary.dot(sparse_1))
>>> sparse_2 = dl.omp_batch(image, new_dict, 10)
>>> err_trained = LA.norm(image - new_dict.dot(sparse_2))
>>> assert err_trained < err_initial

Parameters:

- signals – Training signals, one signal per column. numpy.ndarray of shape (signal_size, n_signals)
- dictionary – Initial dictionary, shape (signal_size, n_atoms)
- iters – Max number of iterations
- n_nonzero – Default 0. Max nonzero coefficients in the sparse decomposition
- omp_tol – Default 0. Tolerance of the sparse approximation. Overrides n_nonzero
- tol – Default 0. Stop training if norm(signals - dictionary.dot(sparse_codes)) < tol
- verbose – Print progress
- n_threads – Default 1. Number of threads to use for sparse coding
- retcodes – Return the sparse codes from the last iteration

Returns: dictionary[, sparse codes if retcodes == True]

dictlearn.optimize.odl(signals, dictionary, iters=1000, n_nonzero=10, tol=0, verbose=False, batch_size=1, n_threads=1, seed=None)

Online dictionary learning algorithm

This algorithm sparsely encodes one training signal at a time and updates the dictionary given that signal. The number of iterations also determines how many of the training signals are used: if iters is less than the number of signals, iters signals are drawn at random; if iters equals the number of signals, all signals are used in random order.

The dictionary atoms are updated using block-coordinate descent.

Parameters:

- signals – Training signals, one signal per column. numpy.ndarray of shape (signal_size, n_signals)
- dictionary – Initial dictionary, shape (signal_size, n_atoms)
- iters – Default 1000. Number of training iterations. This is also the number of signals used in training.
- n_nonzero – Default 10. Max nonzero coefficients in the sparse decomposition
- tol – Default 0. Tolerance of the sparse approximation. Overrides n_nonzero
- verbose – Print progress
- batch_size – The number of signals to use for each dictionary update
- n_threads – Default 1. Number of threads to use for sparse coding
- seed – Seed for drawing random signals

Returns: Trained and improved dictionary

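
For intuition, the block-coordinate atom update can be sketched with the usual sufficient statistics Σ aᵢaᵢᵀ and Σ xᵢaᵢᵀ accumulated over the signals seen so far. This follows the textbook online-dictionary-learning update on synthetic data; it is not dictlearn's internal code:

```python
import numpy as np

def bcd_dict_update(D, A_stat, B_stat, eps=1e-10):
    """One sweep of block-coordinate descent over the atoms, minimizing
    the quadratic surrogate 0.5*tr(D.T @ D @ A_stat) - tr(D.T @ B_stat)."""
    for j in range(D.shape[1]):
        # Coordinate-wise minimizer for atom j, then projection onto the unit ball
        u = D[:, j] + (B_stat[:, j] - D @ A_stat[:, j]) / (A_stat[j, j] + eps)
        D[:, j] = u / max(np.linalg.norm(u), 1.0)
    return D

rng = np.random.default_rng(1)
X = rng.standard_normal((64, 500))
D = rng.standard_normal((64, 32))
D /= np.linalg.norm(D, axis=0)
A = rng.standard_normal((32, 500)) * (rng.random((32, 500)) < 0.1)

A_stat = A @ A.T          # accumulated a_i a_i^T
B_stat = X @ A.T          # accumulated x_i a_i^T
obj_before = 0.5 * np.trace(D.T @ D @ A_stat) - np.trace(D.T @ B_stat)
D = bcd_dict_update(D, A_stat, B_stat)
obj_after = 0.5 * np.trace(D.T @ D @ A_stat) - np.trace(D.T @ B_stat)
```

Each coordinate step minimizes an isotropic quadratic over the unit ball, so the surrogate objective never increases during a sweep.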
dictlearn.optimize.mod(signals, dictionary, n_nonzero, iters, n_threads=1)

Method of optimal directions

The first alternating-minimization algorithm for dictionary learning:

1. Find sparse codes A given signals X and dictionary D
2. Update D given the new A by solving the least-squares problem X ≈ DA for D. That is D = X*pinv(A), with pinv(A) = A.T*(A*A.T)^-1
Parameters:

- signals – Training signals
- dictionary – Initial dictionary
- n_nonzero – Sparsity target for the signal approximation
- iters – Number of dictionary update iterations
- n_threads – Default 1. Number of threads to use for the sparse coding step

Returns: New dictionary
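
The MOD update (step 2) is a closed-form least-squares solve. A minimal NumPy sketch on synthetic data, not dictlearn's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((64, 300))
D0 = rng.standard_normal((64, 32))
D0 /= np.linalg.norm(D0, axis=0)
A = rng.standard_normal((32, 300)) * (rng.random((32, 300)) < 0.2)

# MOD dictionary update: least-squares solution of X ≈ D A,
# i.e. D = X * pinv(A) with pinv(A) = A.T * inv(A * A.T)
D1 = X @ np.linalg.pinv(A)

err_before = np.linalg.norm(X - D0 @ A)
err_after = np.linalg.norm(X - D1 @ A)   # optimal for this fixed A
```

In practice the columns of the updated dictionary are renormalized to unit norm before the next sparse-coding step.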

## Masked algorithms¶

Use these algorithms when you need to explicitly mark which data points to use and which to discard or ignore. Each mask must have the same shape as the training data, with values in {0, 1}. A data point is ignored if its mask value is 0.
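
A minimal illustration of the mask convention, with hypothetical data in plain NumPy:

```python
import numpy as np

# A 3x2 block of training data and its mask (same shape)
patches = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])
mask = np.array([[1, 1],
                 [0, 1],       # patches[1, 0] is ignored
                 [1, 0]])      # patches[2, 1] is ignored

# Example: a column mean computed over observed entries only
col_mean = (patches * mask).sum(axis=0) / mask.sum(axis=0)
```

Here `col_mean` averages (1, 5) and (2, 4), giving 3.0 for both columns; the masked-out entries never enter the computation.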

dictlearn.optimize.itkrmm(signals, masks, dictionary, n_nonzero, iters, low_rank=None, verbose=False)

Train a dictionary from corrupted image patches.

signals and masks must have the same shape. Data point signals[i, j] is used if the corresponding mask value masks[i, j] == 1; all points signals[i, j] with masks[i, j] == 0 are ignored.


Parameters:

- signals – Corrupted image patches, shape (patch_size, n_patches)
- masks – Binary mask for signals, same shape as signals
- dictionary – Initial dictionary, shape (patch_size, n_atoms)
- n_nonzero – Number of nonzero coefficients to use for training
- iters – Max number of iterations
- low_rank – Matrix of low-rank components, shape (patch_size, n_low_rank)
- verbose – Print progress

Returns: Dictionary, shape (patch_size, n_atoms + n_low_rank)

dictlearn.optimize.reconstruct_low_rank(signals, masks, n_low_rank, initial=None, iters=10)

Reconstruct low-rank components from image patches using the ITKrMM algorithm.

Low-rank components, or atoms, capture low-rank signal features. When the signals are image patches, low-rank atoms can capture average intensities and low-variance features in the image. When these are included in a dictionary, most signals will use at least one of the low-rank atoms, leaving the remaining atoms free to represent more specific image features.

Parameters:

- signals – Image patches
- masks – Masks for the image patches
- n_low_rank – Number of low-rank components to reconstruct
- initial – Initial low-rank dictionary, shape (signals.shape[0], n_low_rank)
- iters – Number of iterations for each component

Returns: Low-rank dictionary, shape (signals.shape[0], n_low_rank)
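
ITKrMM's low-rank reconstruction is iterative, but as a rough intuition: for patches with a strong average intensity, the first low-rank ("DC") component is close to the normalized masked mean of the patches. A toy sketch of that special case on assumed data, not the library's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
patches = rng.standard_normal((64, 100)) + 5.0        # strong average intensity
mask = (rng.random((64, 100)) < 0.7).astype(float)    # ~30% of points missing

# Per-pixel average over observed entries only, then normalize to a unit atom
dc = (patches * mask).sum(axis=1) / np.maximum(mask.sum(axis=1), 1)
dc /= np.linalg.norm(dc)
```

Most patches will then use this atom to explain their mean intensity, which is why low-rank atoms are prepended to the learned dictionary.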