This module provides a class for principal components analysis (PCA).
PCA is an orthonormal linear transform (i.e., a rotation) that maps the data to a new coordinate system such that the greatest variance of the data lies along the first coordinate (the first principal component), the second greatest variance lies along the second coordinate, and so on. The resulting data has a diagonal covariance matrix (i.e., it is decorrelated). This technique can be used to reduce the dimensionality of the data.
More specifically, the data is projected onto the eigenvectors of the covariance matrix.
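The projection onto the covariance eigenvectors can be illustrated with plain NumPy. This is a minimal sketch on toy data, not this module's implementation; the names `mixing`, `X`, `scores` and the data itself are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for illustration): 200 samples of 3 correlated variables.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Center each variable, then form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# Eigendecomposition of the symmetric covariance matrix.
# eigh returns eigenvalues in ascending order, so sort them descending.
evals, evecs = np.linalg.eigh(cov)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Project the centered data onto the eigenvectors (the principal axes).
scores = Xc @ evecs

# The covariance of the projected data is diagonal, with the
# eigenvalues (component variances) on the diagonal.
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))
```

The off-diagonal entries of `score_cov` vanish up to floating-point error, which is the decorrelation property described above, and the diagonal entries decrease from the first component to the last.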
Compute the SVD-based PCA of an array-like object over the given axis.
Parameters :
    data : ndarray-like (np.float)
    axis : int, optional
    mask : ndarray-like (np.bool), optional
    ncomp : {None, int}, optional
    standardize : bool, optional
    design_keep : None or ndarray, optional
    design_resid : str or None or ndarray, optional
    tol_ratio : float, optional
Returns :
    results : dict
Notes
See pca_image.m from fmristat for Keith Worsley’s code on which some of this is based.
See: http://en.wikipedia.org/wiki/Principal_component_analysis for some inspiration for naming - particularly ‘basis_vectors’ and ‘basis_projections’
Examples
>>> import numpy as np
>>> arr = np.random.normal(size=(17, 10, 12, 14))
>>> msk = np.all(arr > -2, axis=0)
>>> res = pca(arr, mask=msk, ncomp=9)
Basis vectors are columns. There is one column for each component. The number of components is the calculated rank of the data matrix after applying the various projections listed in the parameters. In this case we are only removing the mean, so the number of components is one less than the length of the axis over which we do the PCA (here axis=0 by default, which has length 17).
>>> res['basis_vectors'].shape
(17, 16)
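The loss of one component when the mean is removed can be checked directly with NumPy. This is a sketch under the assumption that the data is arranged as a (time points, voxels) matrix and that the mean is taken over the PCA axis; `n_time`, `n_vox` and `Y` are hypothetical names, not part of this module.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data matrix: 17 rows (the PCA axis) by 50 columns.
n_time, n_vox = 17, 50
Y = rng.normal(size=(n_time, n_vox))

# Remove the mean over the PCA axis from every column.
Yc = Y - Y.mean(axis=0)

# Random data has full row rank; centering removes exactly one
# degree of freedom, because every column of Yc now sums to zero.
rank_raw = np.linalg.matrix_rank(Y)
rank_centered = np.linalg.matrix_rank(Yc)
print(rank_raw, rank_centered)
```

Here `rank_raw` is 17 and `rank_centered` is 16, matching the 16 columns of `basis_vectors` in the example above.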
Basis projections are arrays with components in the dimension over which we have done the PCA (axis=0 by default). Because we set ncomp above, we only retain ncomp components.
>>> res['basis_projections'].shape
(9, 10, 12, 14)
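The shapes above can be reproduced with a plain NumPy SVD. The following is only a sketch of the general technique, not the code used by pca: it assumes that the basis is estimated from the in-mask voxels and that every voxel is then projected onto that basis, and the names `Y`, `Yc`, `basis` and `proj` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.normal(size=(17, 10, 12, 14))
msk = np.all(arr > -2, axis=0)          # boolean mask, shape (10, 12, 14)
ncomp = 9

# Flatten the non-PCA axes: one row per point on axis 0, one column per voxel.
Y = arr.reshape(arr.shape[0], -1)       # shape (17, 1680)

# Remove the mean over the PCA axis from every voxel.
Yc = Y - Y.mean(axis=0)

# Assumption: estimate the basis from in-mask voxels only.
# The left singular vectors are orthonormal columns of length 17.
U, S, Vt = np.linalg.svd(Yc[:, msk.ravel()], full_matrices=False)
basis = U[:, :ncomp]                    # shape (17, 9)

# Project every voxel onto the retained basis vectors and restore
# the spatial shape, with components on the first axis.
proj = (basis.T @ Yc).reshape((ncomp,) + arr.shape[1:])
print(basis.shape, proj.shape)
```

Note that this sketch truncates the basis itself to `ncomp` columns, whereas the `basis_vectors` array in the example above keeps all 16 columns and only the projections are limited to `ncomp`.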
Compute the PCA of an image over a specified axis
Parameters :
    img : Image
    axis : str or int, optional
    mask : Image, optional
    ncomp : {None, int}, optional
    standardize : bool, optional
    design_keep : None or ndarray, optional
    design_resid : str or None or ndarray, optional
    tol_ratio : float, optional
Returns :
    results : dict
Examples
>>> from nipy.testing import funcfile
>>> from nipy import load_image
>>> func_img = load_image(funcfile)
Time is the fourth axis
>>> func_img.coordmap.function_range
CoordinateSystem(coord_names=('aligned-x=L->R', 'aligned-y=P->A', 'aligned-z=I->S', 't'), name='aligned', coord_dtype=float64)
>>> func_img.shape
(17, 21, 3, 20)
By default, the PCA is calculated over time
>>> res = pca_image(func_img)
>>> res['basis_projections'].coordmap.function_range
CoordinateSystem(coord_names=('aligned-x=L->R', 'aligned-y=P->A', 'aligned-z=I->S', 'PCA components'), name='aligned', coord_dtype=float64)
The number of components is one less than the number of time points
>>> res['basis_projections'].shape
(17, 21, 3, 19)