public class SteerablePyramid
extends java.lang.Object
This implementation of the steerable pyramid transform performs a multi-scale, multi-orientation decomposition of an input image by applying radial and directional filters in the wavenumber domain. The basis steerable filter amplitudes are proportional to cos^2(theta). Three basis orientations are used for 2D images, and six for 3D images. Radial filters with a cosine taper partition the data into 1-octave bands. Images are subsampled at each pyramid level, which greatly reduces the processing effort for lower wavenumbers.
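As a rough illustration of a radial low-pass filter with a cosine taper, the sketch below returns amplitude 1 at the start of the taper (ka), 0 at its end (kb), and a raised-cosine roll-off in between. The exact taper formula, the class name RadialTaperSketch, and the cutoff values in main are assumptions made for illustration; they are not taken from this class.

```java
final class RadialTaperSketch {
  /**
   * Low-pass amplitude as a function of wavenumber k:
   * 1 for k <= ka, 0 for k >= kb, raised cosine in between.
   * The raised-cosine form is an assumption, not the documented formula.
   */
  static double lowPass(double k, double ka, double kb) {
    if (k <= ka) return 1.0;
    if (k >= kb) return 0.0;
    return 0.5 * (1.0 + Math.cos(Math.PI * (k - ka) / (kb - ka)));
  }

  public static void main(String[] args) {
    double ka = 0.5; // hypothetical start of taper, not the class default
    double kb = 1.5; // hypothetical end of taper, not the class default
    for (int i = 0; i <= 8; i++) {
      double k = 0.25 * i;
      System.out.printf("k=%.2f amp=%.4f%n", k, lowPass(k, ka, kb));
    }
  }
}
```

The amplitude decreases monotonically from 1 to 0 across the taper and passes through 0.5 at its midpoint.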
Directionally-filtered basis images are used to estimate local orientation and dimensionality. Preprocessing, which includes averaging in space and scale domains, is applied for these estimates. Steering weights can be calculated and applied. Scaling and thresholding can also be applied, based on a local dimensionality attribute. For 3D images, processing can be applied to enhance either locally-linear or locally-planar features.
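To make the idea of steering weights concrete for the 2D case, the sketch below uses the standard interpolation weights for three cos^2 basis filters at orientations 0, PI/3, and 2*PI/3. With weights (1 + 2*cos(2*(theta - theta_i)))/3, the weighted sum of the basis amplitudes equals cos^2(phi - theta) for any steering angle theta. Whether this class computes its weights in exactly this way is an assumption; the class name SteeringWeightsSketch and its methods are hypothetical.

```java
final class SteeringWeightsSketch {
  /** The three 2D basis orientations: 0, PI/3, 2*PI/3. */
  static final double[] THETA = {0.0, Math.PI / 3.0, 2.0 * Math.PI / 3.0};

  /** Interpolation weights that steer three cos^2 basis filters to angle theta. */
  static double[] weights(double theta) {
    double[] w = new double[THETA.length];
    for (int i = 0; i < w.length; i++) {
      w[i] = (1.0 + 2.0 * Math.cos(2.0 * (theta - THETA[i]))) / 3.0;
    }
    return w;
  }

  /** Weighted sum of the basis amplitudes cos^2(phi - THETA[i]). */
  static double steered(double phi, double theta) {
    double[] w = weights(theta);
    double sum = 0.0;
    for (int i = 0; i < w.length; i++) {
      double c = Math.cos(phi - THETA[i]);
      sum += w[i] * c * c;
    }
    return sum;
  }

  public static void main(String[] args) {
    double theta = 0.4; // arbitrary steering angle in radians
    for (int i = 0; i < 6; i++) {
      double phi = i * Math.PI / 6.0;
      double direct = Math.cos(phi - theta) * Math.cos(phi - theta);
      System.out.printf("phi=%.3f steered=%.6f direct=%.6f%n",
          phi, steered(phi, theta), direct);
    }
  }
}
```

The steered and direct columns agree to rounding error, which is why three basis images are enough to synthesize a cos^2 response at any orientation in 2D.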
The number of pyramid levels is calculated from the size of the input image, assuming a minimum basis image dimension of 9 samples in x, y, or z. The input image is padded before it is transformed to the wavenumber domain, where the filters are applied. The main reason for this padding is to avoid losing the first or last samples when subsampling: the padded number of samples n in x, y, or z is chosen so that (n-1)/2+1 always yields an integer.
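The arithmetic behind the level count and padding can be sketched as follows. The rule used here (keep adding levels while the next basis image would still have at least 9 samples, then pad so that n-1 is divisible by 2^nlevel) is one plausible reading of the paragraph above, not the confirmed implementation. The class name PyramidSizeSketch and its methods are hypothetical.

```java
final class PyramidSizeSketch {
  static final int MIN_BASIS_SIZE = 9;

  /** Samples along one axis at a given level: (n-1)/2^level + 1. */
  static int levelSize(int n, int level) {
    return (n - 1) / (1 << level) + 1;
  }

  /**
   * Number of basis levels for an axis of n samples, assuming the coarsest
   * basis image (level nlevel-1) must keep at least MIN_BASIS_SIZE samples.
   * For an image, the smallest axis would presumably govern (assumption).
   */
  static int levelCount(int n) {
    int nlevel = 1;
    while (levelSize(n, nlevel) >= MIN_BASIS_SIZE) nlevel++;
    return nlevel;
  }

  /** Smallest padded length p >= n such that (p-1) is divisible by 2^nlevel. */
  static int paddedLength(int n, int nlevel) {
    int step = 1 << nlevel;
    return ((n - 1 + step - 1) / step) * step + 1;
  }

  public static void main(String[] args) {
    int n = 100;
    int nlevel = levelCount(n);
    int p = paddedLength(n, nlevel);
    System.out.println("nlevel=" + nlevel + " padded=" + p);
    for (int lev = 0; lev <= nlevel; lev++) {
      System.out.println("level " + lev + ": " + levelSize(p, lev) + " samples");
    }
  }
}
```

Under these assumptions, an axis of 100 samples gives 4 levels and a padded length of 113, so the basis images have 113, 57, 29, and 15 samples and the residual has 8. Every halving is exact because 112 is divisible by 16.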
The format of the steerable pyramid is a 4-dimensional array for 2D, and a 5-dimensional array for 3D. The first dimension is level number and second dimension is basis filter orientation. Below these are either 2D or 3D arrays. To illustrate for 2D:
[0][0][0][0] to [0][0][n2][n1] = level 0, theta 0
[0][1][0][0] to [0][1][n2][n1] = level 0, theta PI/3
[0][2][0][0] to [0][2][n2][n1] = level 0, theta 2*PI/3
[1][0][0][0] to [1][0][(n2-1)/2+1][(n1-1)/2+1] = level 1, theta 0
[1][1][0][0] to [1][1][(n2-1)/2+1][(n1-1)/2+1] = level 1, theta PI/3
[1][2][0][0] to [1][2][(n2-1)/2+1][(n1-1)/2+1] = level 1, theta 2*PI/3
...
[NLEVEL-1][2][0][0] to [NLEVEL-1][2][(n2-1)/(2^(NLEVEL-1))+1][(n1-1)/(2^(NLEVEL-1))+1] = level NLEVEL-1, theta 2*PI/3
[NLEVEL][0][0][0] to [NLEVEL][0][(n2-1)/(2^NLEVEL)+1][(n1-1)/(2^NLEVEL)+1] = residual low-wavenumber image
The 3D steerable pyramid array has the same layout, except that the innermost elements are 3D arrays rather than 2D arrays.
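The 2D layout can be illustrated by allocating an empty array with the same shape. This is only a sketch of the indexing scheme: the class name PyramidLayoutSketch is hypothetical, and giving the residual level a single image in the orientation dimension is an assumption based on the listing above, which shows only orientation index 0 at level NLEVEL.

```java
final class PyramidLayoutSketch {
  static final int NORIENT_2D = 3; // theta = 0, PI/3, 2*PI/3

  /**
   * Allocates spyr[level][orientation][i2][i1] for a padded n2-by-n1 image.
   * Levels 0..nlevel-1 hold three basis images each; index nlevel holds the
   * residual low-wavenumber image (assumed to be a single image).
   */
  static float[][][][] allocate(int n2, int n1, int nlevel) {
    float[][][][] spyr = new float[nlevel + 1][][][];
    for (int lev = 0; lev <= nlevel; lev++) {
      int m2 = (n2 - 1) / (1 << lev) + 1; // samples in dimension 2 at this level
      int m1 = (n1 - 1) / (1 << lev) + 1; // samples in dimension 1 at this level
      int norient = (lev < nlevel) ? NORIENT_2D : 1;
      spyr[lev] = new float[norient][m2][m1];
    }
    return spyr;
  }

  public static void main(String[] args) {
    float[][][][] spyr = allocate(65, 33, 2);
    for (int lev = 0; lev < spyr.length; lev++) {
      System.out.println("level " + lev + ": " + spyr[lev].length + " image(s) of "
          + spyr[lev][0].length + " x " + spyr[lev][0][0].length);
    }
  }
}
```

With this shape, spyr[1][2] is the level 1 basis image for theta 2*PI/3, and spyr[nlevel][0] is the residual image.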
Constructor and Description |
---|
SteerablePyramid() - Construct a steerable pyramid with default cutoff wavenumbers used in the radial low-pass filters. |
SteerablePyramid(double ka, double kb) - Construct a steerable pyramid with specified cutoff wavenumbers used in the radial low-pass filters. |
Modifier and Type | Method and Description |
---|---|
float[][][][][] | estimateAttributes(boolean forlinear, double sigma, float[][][][][] spyr) - Estimation of local orientation and linearity attributes in 3D. |
float[][][][] | estimateAttributes(double sigma, float[][][][] spyr) - Estimation of local orientation and linearity attributes in 2D. |
float[][][][] | makePyramid(float[][] x) - Creates a steerable pyramid representation of an input 2D image. |
float[][][][][] | makePyramid(float[][][] x) - Creates a steerable pyramid representation of an input 3D image. |
void | steerScale(boolean forlinear, int linpowr, float k, float thresh, float[][][][][] attr, float[][][][][] spyr) - Applies steering weights and scaling or thresholding based on linearity attribute to the basis images in the input 3D steerable pyramid array. |
void | steerScale(int linpowr, float k, float thresh, float[][][][] attr, float[][][][] spyr) - Applies steering weights and scaling or thresholding based on linearity attribute to the basis images in the input 2D steerable pyramid. |
float[][] | sumPyramid(boolean keeplow, float[][][][] spyr) - Sums all basis images from an input 2D steerable pyramid to create a filtered output image. |
float[][][] | sumPyramid(boolean keeplow, float[][][][][] spyr) - Sums all basis images from an input 3D steerable pyramid to create a filtered output image. |
public SteerablePyramid()
public SteerablePyramid(double ka, double kb)
ka - wavenumber at start of taper. Amp(ka)=1.
kb - wavenumber at end of taper. Amp(kb)=0.
public float[][][][] makePyramid(float[][] x)
x - input 2D image.
public float[][][][][] makePyramid(float[][][] x)
x - input 3D image.
public float[][] sumPyramid(boolean keeplow, float[][][][] spyr)
keeplow - if true, keep low-wavenumber energy; if false, zero it.
spyr - input 2D steerable pyramid.
public float[][][] sumPyramid(boolean keeplow, float[][][][][] spyr)
keeplow - if true, keep low-wavenumber energy; if false, zero it.
spyr - input 3D steerable pyramid.
public float[][][][] estimateAttributes(double sigma, float[][][][] spyr)
The format of the output attributes is a 4-dimensional array. The first dimension is level number and second dimension is type of attribute. Below these are 2D arrays:
[0][0][0][0] to [0][0][n2][n1] = level 0, theta attribute (radians)
[0][1][0][0] to [0][1][n2][n1] = level 0, linearity attribute
[1][0][0][0] to [1][0][(n2-1)/2+1][(n1-1)/2+1] = level 1, theta attribute (radians)
[1][1][0][0] to [1][1][(n2-1)/2+1][(n1-1)/2+1] = level 1, linearity attribute
...
[NLEVEL-1][1][0][0] to [NLEVEL-1][1][(n2-1)/(2^(NLEVEL-1))+1][(n1-1)/(2^(NLEVEL-1))+1] = level NLEVEL-1, linearity attribute
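As a small illustration of how the 2D attribute array might be read, the sketch below indexes the linearity image of one level and reports the fraction of samples at or above a chosen value. The class name AttributeLayoutSketch, the constants, and the idea of summarizing linearity this way are hypothetical. The numeric range of the linearity attribute is not stated in this documentation, so the threshold is simply a caller-supplied value.

```java
final class AttributeLayoutSketch {
  static final int THETA = 0;     // attr[level][0] = theta attribute (radians)
  static final int LINEARITY = 1; // attr[level][1] = linearity attribute

  /** Fraction of samples at one level whose linearity is at least minLinearity. */
  static double fractionLinear(float[][][][] attr, int level, float minLinearity) {
    float[][] lin = attr[level][LINEARITY];
    int count = 0;
    int total = 0;
    for (float[] row : lin) {
      for (float v : row) {
        if (v >= minLinearity) count++;
        total++;
      }
    }
    return total == 0 ? 0.0 : (double) count / total;
  }

  public static void main(String[] args) {
    // Hand-made attribute array with one level of 2 x 2 samples.
    float[][][][] attr = {
      {
        {{0.0f, 0.5f}, {1.0f, 1.5f}}, // theta values in radians
        {{0.1f, 0.9f}, {0.8f, 0.2f}}  // linearity values
      }
    };
    System.out.println(fractionLinear(attr, 0, 0.5f));
  }
}
```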
sigma - half-width of 2D Gaussian smoothing filter.
spyr - input 2D steerable pyramid.
public float[][][][][] estimateAttributes(boolean forlinear, double sigma, float[][][][][] spyr)
In 3D, filtering can enhance either locally-planar or locally-linear image features; the forlinear parameter selects between them. If enhancement of planar features is selected, the output orientation attributes define the normal to locally-planar features, and the dimensionality attribute is a measure of planarity. If enhancement of locally-linear features is selected, the output orientation attributes define the orientation of a locally-linear feature, and the dimensionality attribute is a local measure of linearity.
The format of the output attributes is a 5-dimensional array. The first dimension is level number and second dimension is type of attribute. Below these are 3D arrays:
[0][0][0][0][0] to [0][0][n3][n2][n1] = level 0, direction cosine a
[0][1][0][0][0] to [0][1][n3][n2][n1] = level 0, direction cosine b
[0][2][0][0][0] to [0][2][n3][n2][n1] = level 0, direction cosine c
[0][3][0][0][0] to [0][3][n3][n2][n1] = level 0, dimensionality attribute
These are repeated for all levels, subsampled for every successive level.
forlinear - true: enhance locally-linear features; false: enhance locally-planar features.
sigma - half-width of 3D Gaussian smoothing filter.
spyr - input 3D steerable pyramid.
public void steerScale(boolean forlinear, int linpowr, float k, float thresh, float[][][][][] attr, float[][][][][] spyr)
forlinear - true: enhance locally-linear features; false: enhance locally-planar features.
linpowr - linearity power and scaling type switch.
k - sigmoidal thresholding steepness.
thresh - threshold.
attr - input array containing direction cosines and dimensionality.
spyr - input/output 3D steerable pyramid.
public void steerScale(int linpowr, float k, float thresh, float[][][][] attr, float[][][][] spyr)
linpowr - linearity power and scaling type switch.
k - sigmoidal thresholding steepness.
thresh - threshold.
attr - input array containing local orientation and linearity.
spyr - input/output 2D steerable pyramid.
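Taken together, the 2D method signatures documented above suggest a call sequence of decompose, estimate attributes, steer and scale in place, then sum. The sketch below expresses that sequence against a hypothetical interface, Pyramid2D, which only mirrors the four documented 2D signatures so that the example compiles on its own. The helper class PyramidWorkflowSketch and the order of calls are assumptions, and suitable values for sigma, linpowr, k, and thresh are left to the caller because this documentation does not give recommended values.

```java
/** Hypothetical interface mirroring the four documented 2D method signatures. */
interface Pyramid2D {
  float[][][][] makePyramid(float[][] x);
  float[][][][] estimateAttributes(double sigma, float[][][][] spyr);
  void steerScale(int linpowr, float k, float thresh,
                  float[][][][] attr, float[][][][] spyr);
  float[][] sumPyramid(boolean keeplow, float[][][][] spyr);
}

final class PyramidWorkflowSketch {
  /**
   * Assumed 2D processing sequence:
   * 1. decompose the image into a steerable pyramid,
   * 2. estimate local orientation and linearity,
   * 3. steer and scale the basis images (spyr is input/output),
   * 4. sum the basis images into a filtered output image.
   */
  static float[][] filter(Pyramid2D sp, float[][] x, double sigma,
                          int linpowr, float k, float thresh, boolean keeplow) {
    float[][][][] spyr = sp.makePyramid(x);
    float[][][][] attr = sp.estimateAttributes(sigma, spyr);
    sp.steerScale(linpowr, k, thresh, attr, spyr); // modifies spyr in place
    return sp.sumPyramid(keeplow, spyr);
  }
}
```

With the real class, the same four calls would be made directly on a SteerablePyramid instance; the 3D variants take an additional forlinear flag in estimateAttributes and steerScale.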