Preprocessing
class dictlearn.preprocess.Patches(image, size, stride=1, max_patches=None, random=None, order='C')

Generate and reconstruct image patches.

REMOVE_MEAN = 'remove_mean'
Parameters: - image – ndarray, 2D or 3D
- size – Patch size. All patches are square (or cubic), so this is just the size of the first dimension, i.e. 8 for (8, 8) patches
- stride – Stride/distance between patches in the image. Can be an int or a list. If int, the stride is the same in every dimension; if list, stride[i] is the stride along axis i. Patches cannot be reconstructed if the stride in any dimension is larger than the patch size in that dimension, i.e. if stride[i] > size[i] for any i
- max_patches – Maximum number of patches
- random – True to take patches from random locations in the image. Ignored if max_patches is None
- order – 'C' or 'F' for C or Fortran order on the underlying data
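As a rough sketch of the layout described above (plain NumPy, not the library's implementation; the helper name is illustrative), each patch is flattened in C order into one column of a (size*size, n_patches) matrix:

```python
import numpy as np

def extract_patches_2d(image, size, stride=1):
    """Collect all square size x size patches from a 2D image, flattened
    (C order) into the columns of a (size*size, n_patches) matrix."""
    rows = range(0, image.shape[0] - size + 1, stride)
    cols = range(0, image.shape[1] - size + 1, stride)
    patches = [image[i:i + size, j:j + size].ravel(order='C')
               for i in rows for j in cols]
    return np.stack(patches, axis=1)

image = np.arange(64, dtype=float).reshape(8, 8)
P = extract_patches_2d(image, size=4, stride=2)
print(P.shape)  # (16, 9): 4*4 pixels per patch, 3*3 patch locations
```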
check_batch_size_or_raise(batch_size)

Check if there is enough memory to store batch_size patches. Raises MemoryError if not.
generator(batch_size, callback=False)

Create and reconstruct patches batch by batch.

If Patches.patches is too large to keep in memory, use this. Only batch_size patches are generated at a time, which reduces memory use by a factor of roughly n_patches / batch_size: if batch_size is 100 and the full Patches.patches matrix needs 100 units of memory, each batch needs only one unit.
>>> import numpy as np
>>> volume = np.load('some_image.npy')
>>> size, stride = [10, 10, 10], [1, 1, 1]
>>> patches = Patches(volume, size, stride)
>>> for batch in patches.generator(100):
...     # Handle batch
...     assert batch.shape[0] == 1000
...     assert batch.shape[1] == 100  # Can fail at the last batch, see stride
One matrix of shape (patch_size, batch_size) is created per iteration. With callback=True the generator yields (batch, callback), where batch is a numpy array of shape (patch_size, batch_size) and callback(batch) reconstructs the part of the image that contains the given batch. The argument passed to callback must have the same shape as the batch that was returned.

This can be used if Patches.patches requires too much memory. The amount of memory required per batch is batch_size*size[0]*size[1]*size[2]*image.dtype.itemsize bytes.
>>> import numpy as np
>>> volume = np.load('some_image_volume.npy')
>>> size, stride = [10, 10, 10], [1, 1, 1]
>>> patches = Patches(volume, size, stride)
>>> for batch, reconstruct in patches.generator(100, callback=True):
...     # Handle batch, here we do nothing
...     reconstruct(batch)
>>> assert np.array_equal(volume, patches.reconstructed)
Parameters: - batch_size – Size of batches. The last batch can be smaller if n_patches % batch_size != 0
- callback – If True, a callback function callback(batch) is returned as well, so that the image can be partially reconstructed
Returns: Generator
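The batching behaviour can be sketched with plain NumPy (a conceptual sketch, not dictlearn's implementation): the patch matrix is sliced column-wise, and the last slice is smaller when n_patches % batch_size != 0.

```python
import numpy as np

def batches(patch_matrix, batch_size):
    """Yield consecutive column blocks of the patch matrix; the last
    batch may be smaller when n_patches % batch_size != 0."""
    n_patches = patch_matrix.shape[1]
    for start in range(0, n_patches, batch_size):
        yield patch_matrix[:, start:start + batch_size]

patches = np.random.rand(1000, 250)  # (patch_size, n_patches)
sizes = [b.shape[1] for b in batches(patches, 100)]
print(sizes)  # [100, 100, 50]
```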
n_patches

Returns: Number of patches
patches

Returns: Image patches, shape (size[0]*size[1]*…, n_patches)
reconstruct(new_patches, save=False)

Reconstruct the image from new_patches. Overlapping regions are averaged. The reconstructed patches are not saved by default: self.patches is the same object before and after this method is called, as long as save=False.

Parameters: - new_patches – ndarray (patch_size, n_patches). Patches as returned by Patches.patches
- save – Overwrite the current patches with new_patches
Returns: Reconstructed image
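The averaging of overlapping regions can be sketched as follows (illustrative NumPy only, assuming the same stride/size convention as above; the real method also handles 3D and the remove_mean bookkeeping): each patch is added back at its location, a count array tracks the overlaps, and the sum is divided by the counts.

```python
import numpy as np

def reconstruct_2d(patches, image_shape, size, stride=1):
    """Place each column of `patches` back at its location in the image
    and average wherever patches overlap."""
    acc = np.zeros(image_shape)
    counts = np.zeros(image_shape)
    rows = range(0, image_shape[0] - size + 1, stride)
    cols = range(0, image_shape[1] - size + 1, stride)
    k = 0
    for i in rows:
        for j in cols:
            acc[i:i + size, j:j + size] += patches[:, k].reshape(size, size)
            counts[i:i + size, j:j + size] += 1
            k += 1
    return acc / counts

# Round trip: extracting with stride 1 and reconstructing unchanged
# patches recovers the image exactly (averages of identical values)
image = np.arange(36, dtype=float).reshape(6, 6)
patches = np.stack([image[i:i + 3, j:j + 3].ravel()
                    for i in range(4) for j in range(4)], axis=1)
recon = reconstruct_2d(patches, image.shape, size=3, stride=1)
print(np.allclose(recon, image))  # True
```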
remove_mean(add_back=True)

Remove the mean from every patch. The mean is automatically added back when the image is reconstructed.

Parameters: add_back – Automatically add the mean back to the patches on reconstruction
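A minimal NumPy sketch of the remove-mean/add-back round trip (illustrative only; how the class stores the per-patch means internally is an assumption here):

```python
import numpy as np

patches = np.random.rand(64, 500)   # (patch_size, n_patches)
means = patches.mean(axis=0)        # one mean per patch (column)
centered = patches - means          # broadcasts over rows

# ... process `centered`, e.g. sparse-code against a dictionary ...

restored = centered + means         # mean added back on reconstruction
print(np.allclose(restored, patches))  # True
```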
shape

Shape of the patch matrix, (patch_size, n_patches)
size

Size of patches
class dictlearn.preprocess.Patches3D(volume, size, stride)

Create and reconstruct image patches from a 3D volume.

Parameters: - volume – 3D ndarray
- size – Size of image patches, (x, y, z)
- stride – Stride between patches, (i, j, k). volume cannot be reconstructed if i > x, j > y or k > z
create_batch_and_reconstruct(batch_size)

Create and reconstruct patches batch by batch.

One matrix of batch_size patches is created per iteration. This generator yields (batch, callback), where batch is a numpy array of shape (n, batch_size) and callback(batch) reconstructs the part of the volume that contains the given batch.

This can be used if Patches3D.create() requires too much memory. The amount of memory required by this method is batch_size*size[0]*size[1]*size[2]*volume.dtype.itemsize bytes.
>>> import numpy as np
>>> import dictlearn as dl
>>> dictionary = np.load('some_dictionary.npy')
>>> volume = np.load('some_image_volume.npy')
>>> size, stride = [1, 1, 1], [1, 1, 1]
>>> patches = Patches3D(volume, size, stride)
>>> for batch, reconstruct in patches.create_batch_and_reconstruct(100):
...     new_batch = dl.omp_batch(batch, dictionary)
...     reconstruct(new_batch)
>>> reconstructed_volume = patches.reconstructed
Parameters: batch_size – Number of patches per batch
Returns: Generator, next() returns (batch, reconstruct); call reconstruct(new_batch) to rebuild the corresponding part of the volume
next_batch(batch_size)

Parameters: batch_size – Number of image patches per batch
Returns: Generator, next() returns an ndarray of shape (n, batch_size)
dictlearn.preprocess.center(data, dim=0, retmean=False, inplace=False)

Remove the mean along dim from every patch.
Parameters: - data – ndarray, data to center
- dim – Dimension along which to compute the mean, default 0 (columns)
- retmean – Return the mean if True
- inplace – Modify data in place if True; in that case only the mean is returned
Returns: Centered patches, plus the mean if retmean is True. Just the mean if inplace is True
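How the dim argument behaves can be illustrated with plain NumPy (a sketch of the axis semantics only, assuming dim maps to NumPy's axis argument):

```python
import numpy as np

data = np.arange(12, dtype=float).reshape(3, 4)

# dim=0: one mean per column, each column becomes zero-mean
col_centered = data - data.mean(axis=0)

# dim=1: one mean per row, each row becomes zero-mean
row_centered = data - data.mean(axis=1, keepdims=True)

print(col_centered.mean(axis=0))  # [0. 0. 0. 0.]
```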
dictlearn.preprocess.normalize(patches, lim=0.2)

L2 normalization. If the l2 norm of a patch is smaller than lim, the patch is divided element-wise by lim instead.
Parameters: - patches – ndarray, (size, n_patches)
- lim – Threshold for low-intensity patches
Returns: Normalized patches
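The thresholding rule can be sketched in plain NumPy (a conceptual sketch of the behaviour described above, not dictlearn's code): ordinary columns are scaled to unit l2 norm, while columns with norm below lim are divided by lim, so near-empty patches are not blown up by the division.

```python
import numpy as np

def normalize(patches, lim=0.2):
    """Scale each column to unit l2 norm; columns with norm below
    `lim` are divided by `lim` instead."""
    norms = np.linalg.norm(patches, axis=0)
    scale = np.where(norms < lim, lim, norms)
    return patches / scale

patches = np.array([[3.0, 0.01],
                    [4.0, 0.00]])
out = normalize(patches, lim=0.2)
print(np.linalg.norm(out[:, 0]))  # 1.0  (norm 5 >= lim, unit-normalized)
print(out[:, 1])                  # [0.05 0.  ] (norm 0.01 < lim, divided by 0.2)
```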