Optimize

Optimization methods for learning dictionaries.

Standard algorithms

Algorithms for training on complete data (i.e. when you don’t need to mask your data). These are the algorithms needed for most use cases.

dictlearn.optimize.ksvd(signals, dictionary, iters, n_nonzero=0, omp_tol=0, tol=0, verbose=False, n_threads=1, retcodes=False)

Iterative batch algorithm [1, 2] for fitting a dictionary D to a set of signals X. Each iteration consists of two stages:

  1. Fix D. Find sparse codes A such that X ≈ DA
  2. Fix the sparse codes A. Find D_new such that norm(X - D_new*A) < norm(X - D*A)

At least one of n_nonzero and omp_tol must be nonzero. If n_nonzero > 0 and omp_tol == 0, KSVD finds an approximate solution to:

\[\min_{\mathbf{D, A}} \left\| \mathbf{X} - \mathbf{DA} \right\|_F^2 \quad \text{such that} \quad \left\| \mathbf{A} \right\|_0 \leq n\_nonzero\]

If omp_tol > 0, KSVD instead finds an approximate solution to:

\[\min_{\mathbf{D, A}} \left\| \mathbf{A} \right\|_0 \quad \text{such that} \quad \left\| \mathbf{X} - \mathbf{DA} \right\|_F^2 \leq omp\_tol\]
>>> import dictlearn as dl
>>> from numpy import linalg as LA
>>> # 'some-image' is a placeholder path; extract 8x8 patches, one per column
>>> patches = dl.Patches(dl.imread('some-image'), 8).patches
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> sparse_1 = dl.omp_batch(patches, dictionary, 10)
>>> err_initial = LA.norm(patches - dictionary.dot(sparse_1))
>>> # Train for 20 iterations, at most 10 nonzero coefficients per signal.
>>> # retcodes=True is needed to get the sparse codes back as well
>>> new_dict, _ = dl.ksvd(patches, dictionary, 20, 10, retcodes=True)
>>> sparse_2 = dl.omp_batch(patches, new_dict, 10)
>>> err_trained = LA.norm(patches - new_dict.dot(sparse_2))
>>> assert err_trained < err_initial
Parameters:
  • signals – Training signals. One signal per column numpy.ndarray of shape (signal_size, n_signals)
  • dictionary – Initial dictionary, shape (signal_size, n_atoms)
  • iters – Max number of iterations
  • n_nonzero – Default 0. Max nonzero coefficients in sparse decomposition
  • omp_tol – Default 0. Tolerance of sparse approximation. Overrides n_nonzero
  • tol – Default 0. Stop learning if norm(signals - dictionary.dot(sparse_codes)) < tol
  • verbose – Print progress
  • n_threads – Default 1. Number of threads to use for sparse coding.
  • retcodes – Default False. If True, also return the sparse codes from the last iteration
Returns:

dictionary[, sparse decomposition if retcodes = True]

dictlearn.optimize.odl(signals, dictionary, iters=1000, n_nonzero=10, tol=0, verbose=False, batch_size=1, n_threads=1, seed=None)

Online dictionary learning algorithm

This algorithm sparsely encodes one training signal at a time (or batch_size signals at a time when batch_size > 1) and updates the dictionary after each encoding. The number of iterations also determines how many of the training signals are used: if iters is less than the number of signals, iters signals are drawn at random; if iters equals the number of signals, all signals are used in a random order.

The dictionary atoms are updated using block-coordinate descent; see [4] for details.

Parameters:
  • signals – Training signals. One signal per column numpy.ndarray of shape (signal_size, n_signals)
  • dictionary – Initial dictionary, shape (signal_size, n_atoms)
  • iters – Default 1000. Number of training iterations to use. This is also equal to the number of signals used in training.
  • n_nonzero – Default 10. Max nonzero coefficients in sparse decomposition
  • tol – Default 0. Tolerance of sparse approximation. Overrides n_nonzero
  • verbose – Print progress
  • batch_size – Default 1. Number of signals to use for each dictionary update
  • n_threads – Default 1. Number of threads to use for sparse coding
  • seed – Default None. Seed for the random draw of training signals
Returns:

Trained and improved dictionary
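
A minimal usage sketch, analogous to the ksvd example above. It assumes odl is exposed at the package level like ksvd, and 'some-image' is again a placeholder path:

>>> import dictlearn as dl
>>> patches = dl.Patches(dl.imread('some-image'), 8).patches
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> # Draw 1000 random patches; one dictionary update per encoded signal
>>> new_dict = dl.odl(patches, dictionary, iters=1000, n_nonzero=10, seed=42)

Passing seed makes the random draw of training signals reproducible.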

dictlearn.optimize.mod(signals, dictionary, n_nonzero, iters, n_threads=1)

Method of optimal directions

The first alternating minimization algorithm [3] for dictionary learning.

  1. Find sparse codes A given signals X and dictionary D
  2. Update D given the new A by solving X ≈ DA for D in the least-squares sense: D = X*pinv(A), with pinv(A) = A.T*(A*A.T)^-1
Parameters:
  • signals – Training signals
  • dictionary – Initial dictionary
  • n_nonzero – Sparsity target for signal approximation
  • iters – Number of dictionary update iterations
  • n_threads – Default 1. Number of threads to use for sparse coding step
Returns:

New dictionary
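
To make the two steps concrete, here is one hand-rolled MOD iteration in NumPy next to the library call. This is a sketch: it assumes mod is exposed at the package level and that omp_batch returns a dense array (as the ksvd example above suggests), and it omits the atom re-normalization a real implementation would typically perform:

>>> import numpy as np
>>> import dictlearn as dl
>>> patches = dl.Patches(dl.imread('some-image'), 8).patches
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> codes = dl.omp_batch(patches, dictionary, 10)  # step 1: fix D, find A
>>> updated = patches.dot(np.linalg.pinv(codes))   # step 2: D = X*pinv(A)
>>> # The library call runs this two-step loop for `iters` iterations:
>>> trained = dl.mod(patches, dictionary, n_nonzero=10, iters=20)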

Masked data

Use these algorithms when you need to explicitly mark which data points to use and which to discard/ignore. All masks should have the same shape as the training data, with entries in {0, 1}. A data point is ignored if its mask value is 0.
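
For instance, under the hypothetical convention that zero-valued pixels are missing, a mask can be built directly from the patches:

>>> import dictlearn as dl
>>> patches = dl.Patches(dl.imread('some-image'), 8).patches
>>> # 1 marks a usable data point, 0 a point to ignore
>>> masks = (patches != 0).astype(patches.dtype)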

dictlearn.optimize.itkrmm(signals, masks, dictionary, n_nonzero, iters, low_rank=None, verbose=False)

Train a dictionary from corrupted image patches.

signals and masks must have the same shape. Data point signals[i, j] is used if the corresponding mask entry masks[i, j] == 1; all points with masks[i, j] == 0 are ignored.

See [5] for details.

Parameters:
  • signals – Corrupted image patches, shape (patch_size, n_patches)
  • masks – Binary mask for signals, same shape as signals.
  • dictionary – Initial dictionary (patch_size, n_atoms)
  • n_nonzero – Number of nonzero coeffs to use for training
  • iters – Max number of iterations
  • low_rank – Matrix of low rank components, shape (patch_size, n_low_rank)
  • verbose – Print progress
Returns:

Dictionary. Shape (patch_size, n_atoms + n_low_rank)
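
A minimal sketch, reusing patches and masks from the example above and assuming itkrmm is reachable at the package level like the other optimizers:

>>> import dictlearn as dl
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> trained = dl.itkrmm(patches, masks, dictionary, n_nonzero=10, iters=20)
>>> trained.shape  # no low rank components passed, so n_atoms columns
(64, 128)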

dictlearn.optimize.reconstruct_low_rank(signals, masks, n_low_rank, initial=None, iters=10)

Reconstruct low rank components from image patches using ITKrMM.

Low rank components, or atoms, capture low rank signal features. When the signals are image patches, low rank atoms can capture average intensities and other low variance features in the image. When such atoms are included in a dictionary, most signals will use at least one of them, leaving the remaining atoms free to represent more specific image features.

Parameters:
  • signals – Image patches
  • masks – Masks for image patches
  • n_low_rank – Number of low rank components to reconstruct
  • initial – Initial low rank dictionary, shape (signals.shape[0], n_low_rank)
  • iters – Number of iterations for each component
Returns:

Low rank dictionary, shape (signals.shape[0], n_low_rank)
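
A sketch of the typical pipeline, under the same naming assumptions as above: recover the low rank component first, then hand it to itkrmm so it is included in the final dictionary:

>>> import dictlearn as dl
>>> low_rank = dl.reconstruct_low_rank(patches, masks, n_low_rank=1)
>>> dictionary = dl.random_dictionary(8*8, 128)
>>> trained = dl.itkrmm(patches, masks, dictionary, 10, 20, low_rank=low_rank)
>>> trained.shape  # n_atoms + n_low_rank columns
(64, 129)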

References

[1] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[2] R. Rubinstein, M. Zibulevsky, and M. Elad, “Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit,” Technical Report, CS Technion, 2008.

[3] K. Engan, S. O. Aase, and J. H. Husøy, “Method of optimal directions for frame design,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, 1999.

[4] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online dictionary learning for sparse coding,” in Proc. 26th International Conference on Machine Learning (ICML), 2009.

[5] V. Naumova and K. Schnass, “Dictionary learning from incomplete data,” arXiv preprint arXiv:1701.03655, 2017.