Inheritance diagram for nipy.algorithms.clustering.imm:
Infinite mixture model: a generalization of Bayesian mixture models with an unspecified number of classes
class nipy.algorithms.clustering.imm.IMM
Bases: nipy.algorithms.clustering.bgmm.BGMM
The class implements the Infinite Gaussian Mixture model, or Dirichlet Process Mixture model. This is simply a generalization of Bayesian Gaussian mixture models with an unknown number of classes.
Parameters:
- alpha: float, optional
- dim: int, optional

Note: use the function set_priors() to set adapted priors.
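A minimal, hedged usage sketch based on the parameters and methods documented on this page (the data and settings are illustrative, not a canonical recipe):

    import numpy as np

    from nipy.algorithms.clustering.imm import IMM

    x = np.random.randn(100, 2)        # 100 illustrative samples in 2D

    model = IMM(alpha=0.5, dim=2)
    model.set_priors(x)                # weakly informative priors (see below)
    like = model.sample(x, niter=100, kfold=10)
    print("number of components found:", model.k)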
Returns the averaged log-likelihood of the model for the dataset x.

Parameters:
- x: array of shape (n_samples, self.dim)
- tiny: float, optional (default 1.e-15); a small constant to avoid numerical singularities
Evaluate the Bayes Factor of the current model using Chib’s method
Parameters:
- x: array of shape (nb_samples, dim)
- z: array of shape (nb_samples), type np.int
- nperm: int, optional (default 0)
- verbose: verbosity mode, optional (default 0)

Returns:
- bf: float, the computed evidence (Bayes factor)
Notes
See: Chib, S. (1995). Marginal Likelihood from the Gibbs Output. Journal of the American Statistical Association, Vol. 90.
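A hedged usage sketch (the data, and the use of the MAP labelling as the conditioning draw, are illustrative assumptions):

    import numpy as np

    from nipy.algorithms.clustering.imm import IMM

    x = np.random.randn(60, 2)
    model = IMM(alpha=0.5, dim=2)
    model.set_priors(x)
    model.sample(x, niter=50)

    z = model.map_label(x)             # one labelling consistent with the fit
    bf = model.bayes_factor(x, z, nperm=0, verbose=0)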
Computation of the BIC approximation of the evidence.

Parameters:
- like: array of shape (n_samples, self.k)
- tiny: float, optional (default 1.e-15); a small constant to avoid numerical singularities

Returns:
- the BIC value, float
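A sketch of what such a BIC-style approximation computes, assuming a full-covariance Gaussian mixture; the free-parameter count below is an illustrative assumption, not nipy's exact code:

    import numpy as np

    def bic_sketch(like, k, dim, tiny=1.e-15):
        """like: (n_samples, k) weighted component-wise likelihoods."""
        n = like.shape[0]
        # total log-likelihood of the mixture
        log_like = np.sum(np.log(np.sum(like, axis=1) + tiny))
        # free parameters: (k - 1) weights, k*dim means,
        # k*dim*(dim + 1)/2 precision entries
        n_params = (k - 1) + k * dim + k * dim * (dim + 1) // 2
        return log_like - 0.5 * n_params * np.log(n)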
Check the shape of the different matrices involved in the model; essentially, check that x.shape[1] == self.dim. x is returned, possibly reshaped.
Compute the probability of the current parameters of self given x and z
Parameters:
- x: array of shape (nb_samples, dim)
- z: array of shape (nb_samples), type np.int
- perm: array of shape (nperm, self.k), type np.int, optional
This is a step in the sampling procedure that uses internal cross-validation.

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples)
- plike: array of shape (n_samples)
- kfold: int, or array of shape (n_samples), optional

Returns:
- like: array of shape (n_samples)
Estimation of the model given a dataset x.

Parameters:
- x: array of shape (n_samples, dim)
- niter: int, optional (default 100); maximal number of iterations in the estimation process
- delta: float, optional (default 1.e-4); increment of data likelihood at which convergence is declared
- verbose: verbosity mode, optional (default 0)

Returns:
- bic: an asymptotic approximation of model evidence
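A hedged usage sketch; this method is inherited from the (B)GMM base classes, and the call pattern below simply follows the parameter list above:

    import numpy as np

    from nipy.algorithms.clustering.imm import IMM

    x = np.random.randn(100, 2)
    model = IMM(alpha=0.5, dim=2)
    model.set_priors(x)
    model.initialize(x)                # k-means initialization (see below)
    bic = model.estimate(x, niter=100, delta=1.e-4)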
See bayes_factor(self, x, z, nperm=0, verbose=0)
Set the priors so that they are weakly uninformative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (nb_samples, self.dim)
- nocheck: boolean, optional
Set the regularizing priors as weakly informative, following Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (n_samples, dim)
Initialize z using a k-means algorithm, then update the parameters.

Parameters:
- x: array of shape (nb_samples, self.dim)
Estimation of self given x.

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples), optional (default None)
- niter: int, optional (default 100); maximal number of iterations in the estimation process
- delta: float, optional (default 1.e-4); increment of data likelihood at which convergence is declared
- ninit: int, optional (default 1); number of initializations performed
- verbose: verbosity mode, optional (default 0)

Returns:
- the best model found
Return the likelihood of the model for the data x; the values are weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)
- plike: array of shape (n_samples), optional

Returns:
- like: array of shape (n_samples, self.k), the component-wise likelihood
Computes the likelihood of x under the prior
Parameters:
- x: array of shape (self.n_samples, self.dim)

Returns:
- w: the likelihood of x under the prior model (unweighted)
Return the MAP labelling of x.

Parameters:
- x: array of shape (n_samples, dim)
- like: array of shape (n_samples, self.k), optional (default None)

Returns:
- z: array of shape (n_samples), the resulting MAP labelling
Returns the likelihood of the mixture for x
Parameters:
- x: array of shape (n_samples, self.dim)
Manually set the weights, means and precisions of the model.

Parameters:
- means: array of shape (self.k, self.dim)
- precisions: array of shape (self.k, self.dim, self.dim)
- weights: array of shape (self.k)
Compute the population, i.e. the statistics of allocation.

Parameters:
- z: array of shape (nb_samples), type np.int

Returns:
- hist: array of shape (self.k), the count variable
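A minimal numpy sketch of such an allocation count (illustrative, not nipy's exact code):

    import numpy as np

    z = np.array([0, 2, 2, 1, 0, 2])       # hypothetical allocation vector
    k = 3                                   # current number of components
    hist = np.bincount(z, minlength=k)      # per-component counts: [2, 1, 3]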
Compute the probability of the current parameters of self given the priors
Reduce the assignments by removing empty clusters, and update self.k.

Parameters:
- z: array of shape (n)

Returns:
- z: the remapped values
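A sketch of the relabelling idea (the helper below is illustrative; nipy's own method may differ in detail):

    import numpy as np

    def reduce_labels(z):
        """Relabel z so only non-empty clusters remain, numbered 0..k-1.
        Negative labels (e.g. a null class) are left untouched."""
        uz = np.unique(z[z >= 0])                     # labels in use
        remap = {old: new for new, old in enumerate(uz)}
        znew = np.array([remap.get(v, v) for v in z])
        return znew, len(uz)                          # new labels, new k

    z, k = reduce_labels(np.array([0, 3, 3, 7, 0]))   # -> [0 1 1 2 0], k = 3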
Sample the indicator and parameters.

Parameters:
- x: array of shape (n_samples, self.dim)
- niter: int
- sampling_points: array of shape (nbpoints, self.dim), optional
- kfold: int or array, optional
- verbose: verbosity mode, optional (default 0)

Returns:
- likelihood: array of shape (nbpoints)
Sample the indicator and parameters; the average values of the weights, means and precisions are returned.

Parameters:
- x: array of shape (nb_samples, dim)
- niter: int, optional (default 1); number of iterations

Returns:
- weights: array of shape (self.k)
- means: array of shape (self.k, self.dim)
- precisions: array of shape (self.k, self.dim, self.dim)

Notes
All this makes sense only if no label switching has occurred, so it is wrong in general (asymptotically).
fix: implement a permutation procedure for component identification
Sample the indicator from the likelihood.

Parameters:
- like: array of shape (nbitem, self.k)

Returns:
- z: array of shape (nbitem), a draw of the membership variable

Notes
The behaviour differs from the standard BGMM in that z can take arbitrary values.
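To make the "arbitrary values" remark concrete, here is a hedged sketch of the Chinese-restaurant-process-style draw underlying a Dirichlet process mixture: each sample either joins an existing component or opens a new one, with the new-class slot weighted by the concentration alpha and the prior likelihood (an illustration of the technique, not nipy's exact code):

    import numpy as np

    def crp_indicator_sketch(like, prior_like, alpha=0.5):
        """like: (n, k) weighted likelihood under the k current components;
        prior_like: (n,) likelihood under the prior, for the new-class slot.
        Returns z in {0, ..., k}; z == k means 'open a new component'."""
        n, k = like.shape
        full = np.hstack((like, alpha * prior_like[:, None]))
        full /= full.sum(axis=1, keepdims=True)
        cum = np.cumsum(full, axis=1)
        u = np.random.rand(n, 1)
        return (u > cum).sum(axis=1)    # inverse-CDF categorical draw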
Set the null and prior densities as constant (assuming a compact domain).

Parameters:
- prior_dens: float, optional
Set the priors so that they are weakly uninformative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (n_samples, self.dim)
Function to plot a GMM, still in progress. Currently works only in 1D and 2D.

Parameters:
- x: array of shape (n_samples, dim)
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
Function to plot a GMM. Currently works only in 1D.

Parameters:
- x: array of shape (n_samples, dim)
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
- mpaxes: axes handle to make the figure, optional
This is a step in the sampling procedure that uses internal cross-validation.

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples)
- plike: array of shape (n_samples)

Returns:
- like: array of shape (n_samples)
Returns the log-likelihood of the mixture for x
Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- ll: array of shape (n_samples)
Same as initialize_and_estimate.
Return the likelihood of each data point under each component; the values are not weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- like: array of shape (n_samples, self.k)

Notes
Hopefully faster.
Return the likelihood of each data point under each component; the values are not weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- like: array of shape (n_samples, self.k)
Update function (draw a sample of the IMM parameters).

Parameters:
- x: array of shape (n_samples, self.dim)
- z: array of shape (n_samples), type np.int
Given the allocation vector z and the corresponding data x, resample the means.

Parameters:
- x: array of shape (nb_samples, self.dim)
- z: array of shape (nb_samples), type np.int
Given the allocation vector z and the corresponding data x, resample the precisions.

Parameters:
- x: array of shape (nb_samples, self.dim)
- z: array of shape (nb_samples), type np.int
Given the allocation vector z, resample the weights parameter.

Parameters:
- z: array of shape (n_samples), type np.int
class nipy.algorithms.clustering.imm.MixedIMM
Bases: nipy.algorithms.clustering.imm.IMM
Particular IMM with an additional null class. The data is supplied together with a sample-related probability of being under the null.
Parameters:
- alpha: float, optional
- dim: int, optional

Note: use the function set_priors() to set adapted priors.
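A hedged usage sketch for this null-class variant (the data, null probabilities and density values are illustrative):

    import numpy as np

    from nipy.algorithms.clustering.imm import MixedIMM

    x = np.random.randn(100, 2)
    null_proba = 0.5 * np.ones(100)        # uninformative null probability

    model = MixedIMM(alpha=0.5, dim=2)
    model.set_priors(x)
    model.set_constant_densities(null_dens=0.1)
    like, pproba = model.sample(x, null_class_proba=null_proba, niter=100)
    # pproba: posterior probability of each sample being under the null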
Returns the averaged log-likelihood of the model for the dataset x.

Parameters:
- x: array of shape (n_samples, self.dim)
- tiny: float, optional (default 1.e-15); a small constant to avoid numerical singularities
Evaluate the Bayes Factor of the current model using Chib’s method
Parameters:
- x: array of shape (nb_samples, dim)
- z: array of shape (nb_samples), type np.int
- nperm: int, optional (default 0)
- verbose: verbosity mode, optional (default 0)

Returns:
- bf: float, the computed evidence (Bayes factor)
Notes
See: Chib, S. (1995). Marginal Likelihood from the Gibbs Output. Journal of the American Statistical Association, Vol. 90.
Computation of the BIC approximation of the evidence.

Parameters:
- like: array of shape (n_samples, self.k)
- tiny: float, optional (default 1.e-15); a small constant to avoid numerical singularities

Returns:
- the BIC value, float
Check the shape of the different matrices involved in the model; essentially, check that x.shape[1] == self.dim. x is returned, possibly reshaped.
Compute the probability of the current parameters of self given x and z
Parameters:
- x: array of shape (nb_samples, dim)
- z: array of shape (nb_samples), type np.int
- perm: array of shape (nperm, self.k), type np.int, optional
This is a step in the sampling procedure that uses internal cross-validation.

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples)
- plike: array of shape (n_samples)
- kfold: int or array, optional
- null_class_proba: array of shape (n_samples)

Returns:
- like: array of shape (n_samples)
- z: array of shape (n_samples)
Notes
When kfold is an array, there is an internal reshuffling to randomize the order of updates
Estimation of the model given a dataset x.

Parameters:
- x: array of shape (n_samples, dim)
- niter: int, optional (default 100); maximal number of iterations in the estimation process
- delta: float, optional (default 1.e-4); increment of data likelihood at which convergence is declared
- verbose: verbosity mode, optional (default 0)

Returns:
- bic: an asymptotic approximation of model evidence
See bayes_factor(self, x, z, nperm=0, verbose=0)
Set the priors so that they are weakly uninformative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (nb_samples, self.dim)
- nocheck: boolean, optional
Set the regularizing priors as weakly informative, following Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (n_samples, dim)
Initialize z using a k-means algorithm, then update the parameters.

Parameters:
- x: array of shape (nb_samples, self.dim)
Estimation of self given x.

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples), optional (default None)
- niter: int, optional (default 100); maximal number of iterations in the estimation process
- delta: float, optional (default 1.e-4); increment of data likelihood at which convergence is declared
- ninit: int, optional (default 1); number of initializations performed
- verbose: verbosity mode, optional (default 0)

Returns:
- the best model found
Return the likelihood of the model for the data x; the values are weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)
- plike: array of shape (n_samples), optional

Returns:
- like: array of shape (n_samples, self.k), the component-wise likelihood
Computes the likelihood of x under the prior
Parameters:
- x: array of shape (self.n_samples, self.dim)

Returns:
- w: the likelihood of x under the prior model (unweighted)
Return the MAP labelling of x.

Parameters:
- x: array of shape (n_samples, dim)
- like: array of shape (n_samples, self.k), optional (default None)

Returns:
- z: array of shape (n_samples), the resulting MAP labelling
Returns the likelihood of the mixture for x
Parameters:
- x: array of shape (n_samples, self.dim)
Manually set the weights, means and precisions of the model.

Parameters:
- means: array of shape (self.k, self.dim)
- precisions: array of shape (self.k, self.dim, self.dim)
- weights: array of shape (self.k)
Compute the population, i.e. the statistics of allocation.

Parameters:
- z: array of shape (nb_samples), type np.int

Returns:
- hist: array of shape (self.k), the count variable
Compute the probability of the current parameters of self given the priors
Reduce the assignments by removing empty clusters, and update self.k.

Parameters:
- z: array of shape (n)

Returns:
- z: the remapped values
Sample the indicator and parameters.

Parameters:
- x: array of shape (n_samples, self.dim)
- null_class_proba: array of shape (n_samples)
- niter: int
- sampling_points: array of shape (nbpoints, self.dim), optional
- kfold: int, optional
- co_clustering: bool, optional
- verbose: verbosity mode, optional (default 0)

Returns:
- likelihood: array of shape (nbpoints)
- pproba: array of shape (n_samples)
- coclust: only returned if co_clustering == True
Sample the indicator and parameters; the average values of the weights, means and precisions are returned.

Parameters:
- x: array of shape (nb_samples, dim)
- niter: int, optional (default 1); number of iterations

Returns:
- weights: array of shape (self.k)
- means: array of shape (self.k, self.dim)
- precisions: array of shape (self.k, self.dim, self.dim)

Notes
All this makes sense only if no label switching has occurred, so it is wrong in general (asymptotically).
fix: implement a permutation procedure for component identification
Sample the indicator from the likelihood.

Parameters:
- like: array of shape (nbitem, self.k)
- null_class_proba: array of shape (n_samples)

Returns:
- z: array of shape (nbitem), a draw of the membership variable

Notes
Here z == -1 encodes the null class.
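A hedged sketch of how the null class can enter such a draw: a null slot competes with the mixture components, weighted by the per-sample null probability, and winning samples are recoded as -1 (an illustration, not nipy's exact code):

    import numpy as np

    def null_indicator_sketch(like, null_dens, null_class_proba):
        """like: (n, k) component likelihoods; null_dens: constant null
        density; null_class_proba: (n,) per-sample null probability."""
        n, k = like.shape
        full = np.hstack(((1 - null_class_proba)[:, None] * like,
                          (null_dens * null_class_proba)[:, None]))
        full /= full.sum(axis=1, keepdims=True)
        cum = np.cumsum(full, axis=1)
        z = (np.random.rand(n, 1) > cum).sum(axis=1)
        z[z == k] = -1              # recode the extra slot as the null class
        return z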
Set the null and prior densities as constant (over a supposedly compact domain).

Parameters:
- null_dens: float, optional
- prior_dens: float, optional
Set the priors so that they are weakly uninformative; this follows Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:
- x: array of shape (n_samples, self.dim)
Function to plot a GMM, still in progress. Currently works only in 1D and 2D.

Parameters:
- x: array of shape (n_samples, dim)
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
Function to plot a GMM. Currently works only in 1D.

Parameters:
- x: array of shape (n_samples, dim)
- gd: GridDescriptor instance
- density: array of shape (prod(gd.n_bins))
- mpaxes: axes handle to make the figure, optional
One step in the sampling procedure (one data sweep).

Parameters:
- x: array of shape (n_samples, dim)
- z: array of shape (n_samples)
- plike: array of shape (n_samples)
- null_class_proba: array of shape (n_samples)

Returns:
- like: array of shape (n_samples)
Returns the log-likelihood of the mixture for x
Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- ll: array of shape (n_samples)
Same as initialize_and_estimate.
Return the likelihood of each data point under each component; the values are not weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- like: array of shape (n_samples, self.k)

Notes
Hopefully faster.
Return the likelihood of each data point under each component; the values are not weighted by the component weights.

Parameters:
- x: array of shape (n_samples, self.dim)

Returns:
- like: array of shape (n_samples, self.k)
Update function (draw a sample of the IMM parameters).

Parameters:
- x: array of shape (n_samples, self.dim)
- z: array of shape (n_samples), type np.int
Given the allocation vector z and the corresponding data x, resample the means.

Parameters:
- x: array of shape (nb_samples, self.dim)
- z: array of shape (nb_samples), type np.int
Given the allocation vector z and the corresponding data x, resample the precisions.

Parameters:
- x: array of shape (nb_samples, self.dim)
- z: array of shape (nb_samples), type np.int
Given the allocation vector z, resample the weights parameter.

Parameters:
- z: array of shape (n_samples), type np.int
Return a sparse co-labelling matrix given the label vector z.

Parameters:
- z: array of shape (n_samples)
- kmax: int, optional

Returns:
- colabel: a sparse coo_matrix
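A minimal scipy sketch of such a co-labelling matrix, where entry (i, j) is 1 when samples i and j share a label (the helper name and details are illustrative):

    import numpy as np
    from scipy.sparse import coo_matrix

    def co_label_sketch(z):
        n = z.size
        rows, cols = [], []
        for label in np.unique(z[z >= 0]):   # skip a possible null class
            idx = np.where(z == label)[0]
            for i in idx:
                rows.extend([i] * idx.size)
                cols.extend(idx)
        data = np.ones(len(rows))
        return coo_matrix((data, (rows, cols)), shape=(n, n))

    colabel = co_label_sketch(np.array([0, 1, 0, 1, 1]))
    print(colabel.toarray())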
Illustrative example of the behaviour of imm
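A hedged end-to-end sketch of what such an example might look like: draw two well-separated Gaussian clouds, fit an IMM by Gibbs sampling, and inspect how many components survive (all data and settings are illustrative):

    import numpy as np

    from nipy.algorithms.clustering.imm import IMM

    # two well-separated 1D Gaussian clouds
    x = np.concatenate((np.random.randn(50) - 3,
                        np.random.randn(50) + 3))[:, None]

    model = IMM(alpha=0.5, dim=1)
    model.set_priors(x)
    model.sample(x, niter=300, kfold=10)
    print("components found:", model.k)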