algorithms.clustering.bgmm

Module: algorithms.clustering.bgmm

Inheritance diagram for nipy.algorithms.clustering.bgmm:


Bayesian Gaussian Mixture Model classes: contains the basic fields and methods of Bayesian GMMs; the high-level functions are/should be bound in C

The base class BGMM relies on an implementation that performs Gibbs sampling

A derived class VBGMM uses Variational Bayes inference instead

A third class is introduced to take advantage of the old C bindings, but it is limited to diagonal covariance models

Author : Bertrand Thirion, 2008-2011

Classes

BGMM

class nipy.algorithms.clustering.bgmm.BGMM(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)

Bases: nipy.algorithms.clustering.gmm.GMM

This class implements Bayesian GMMs

This class contains the following fields:

k: int
the number of components in the mixture
dim: int
the dimension of the data
means: array of shape (k, dim)
all the means of the components
precisions: array of shape (k, dim, dim)
the precisions of the components
weights: array of shape (k)
weights of the mixture
shrinkage: array of shape (k)
scaling factor of the posterior precisions on the mean
dof: array of shape (k)
the degrees of freedom of the components
prior_means: array of shape (k, dim)
the prior on the components means
prior_scale: array of shape (k, dim)
the prior on the components precisions
prior_dof: array of shape (k)
the prior on the dof (should be at least equal to dim)
prior_shrinkage: array of shape (k)
scaling factor of the prior precisions on the mean
prior_weights: array of shape (k)
the prior on the components weights
shrinkage: array of shape (k)
scaling factor of the posterior precisions on the mean

dof: array of shape (k)
the posterior degrees of freedom

Methods

average_log_like(x[, tiny]) returns the averaged log-likelihood of the model for the dataset x
bayes_factor(x, z[, nperm, verbose]) Evaluate the Bayes Factor of the current model using Chib’s method
bic(like[, tiny]) Computation of bic approximation of evidence
check() Checking the shape of different matrices involved in the model
check_x(x) essentially check that x.shape[1]==self.dim
conditional_posterior_proba(x, z[, perm]) Compute the probability of the current parameters of self
estimate(x[, niter, delta, verbose]) Estimation of the model given a dataset x
evidence(x, z[, nperm, verbose]) See bayes_factor(self, x, z, nperm=0, verbose=0)
guess_priors(x[, nocheck]) Set the priors so that they are weakly informative
guess_regularizing(x[, bcheck]) Set the regularizing priors as weakly informative
initialize(x) initialize z using a k-means algorithm, then update the parameters
initialize_and_estimate(x[, z, niter, …]) Estimation of self given x
likelihood(x) return the likelihood of the model for the data x
map_label(x[, like]) return the MAP labelling of x
mixture_likelihood(x) Returns the likelihood of the mixture for x
plugin(means, precisions, weights) Set manually the weights, means and precision of the model
pop(z) compute the population, i.e. the statistics of allocation
probability_under_prior() Compute the probability of the current parameters of self
sample(x[, niter, mem, verbose]) sample the indicator and parameters
sample_and_average(x[, niter, verbose]) sample the indicator and parameters
sample_indicator(like) sample the indicator from the likelihood
set_priors(prior_means, prior_weights, …) Set the prior of the BGMM
show(x, gd[, density, axes]) Function to plot a GMM, still in progress
show_components(x, gd[, density, mpaxes]) Function to plot a GMM – Currently, works only in 1D
test(x[, tiny]) Returns the log-likelihood of the mixture for x
train(x[, z, niter, delta, ninit, verbose]) Idem initialize_and_estimate
unweighted_likelihood(x) return the likelihood of each datum under each component
unweighted_likelihood_(x) return the likelihood of each datum under each component
update(x, z) update function (draw a sample of the GMM parameters)
update_means(x, z) Given the allocation vector z,
update_precisions(x, z) Given the allocation vector z,
update_weights(z) Given the allocation vector z, resample the weights parameter
__init__(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)

Initialize the structure with the dimensions of the problem; optionally provide the different terms
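
A minimal usage sketch (illustrative, not taken from the nipy documentation): it instantiates a 3-component model on toy 2D data and chains the documented calls guess_priors, initialize, sample, likelihood and map_label; exact defaults and return values depend on the installed nipy version.

import numpy as np
from nipy.algorithms.clustering.bgmm import BGMM

x = np.random.randn(200, 2)          # toy data: 200 two-dimensional samples (illustrative only)

model = BGMM(k=3, dim=2)             # 3-component Bayesian GMM
model.guess_priors(x)                # weakly informative priors derived from the data
model.initialize(x)                  # k-means initialization, then parameter update
model.sample(x, niter=100)           # Gibbs sampling of indicators and parameters

like = model.likelihood(x)           # component-wise (weighted) likelihoods
labels = model.map_label(x, like)    # MAP labelling of the rows of x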

average_log_like(x, tiny=1e-15)

returns the averaged log-likelihood of the model for the dataset x

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

tiny = 1.e-15: a small constant to avoid numerical singularities :

bayes_factor(x, z, nperm=0, verbose=0)

Evaluate the Bayes Factor of the current model using Chib’s method

Parameters:

x: array of shape (nb_samples,dim) :

the data from which bic is computed

z: array of shape (nb_samples), type = np.int :

the corresponding classification

nperm=0: int :

the number of permutations to sample to model the label switching issue in the computation of the Bayes Factor. By default, exhaustive permutations are used

verbose=0: verbosity mode :

Returns:

bf (float) the computed evidence (Bayes factor) :

Notes

See: Siddhartha Chib, "Marginal Likelihood from the Gibbs Output", Journal of the American Statistical Association, Vol. 90, 1995

bic(like, tiny=1e-15)

Computation of bic approximation of evidence

Parameters:

like, array of shape (n_samples, self.k) :

component-wise likelihood

tiny=1.e-15, a small constant to avoid numerical singularities :

Returns:

the bic value, float :

check()

Checking the shape of different matrices involved in the model

check_x(x)

essentially check that x.shape[1]==self.dim

x is returned, possibly reshaped

conditional_posterior_proba(x, z, perm=None)

Compute the probability of the current parameters of self given x and z

Parameters:

x: array of shape (nb_samples, dim), :

the data from which bic is computed

z: array of shape (nb_samples), type = np.int, :

the corresponding classification

perm: array of shape (nperm, self.k), type = np.int, optional :

all permutations of z under which things will be recomputed. By default, no permutation is performed

estimate(x, niter=100, delta=0.0001, verbose=0)

Estimation of the model given a dataset x

Parameters:

x array of shape (n_samples,dim) :

the data from which the model is estimated

niter=100: maximal number of iterations in the estimation process :

delta = 1.e-4: increment of data likelihood at which :

convergence is declared

verbose=0: verbosity mode :

Returns:

bic : an asymptotic approximation of model evidence

evidence(x, z, nperm=0, verbose=0)

See bayes_factor(self, x, z, nperm=0, verbose=0)

guess_priors(x, nocheck=0)

Set the priors so that they are weakly informative; this is from Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:

x, array of shape (nb_samples,self.dim) :

the data used in the estimation process

nocheck: boolean, optional, :

if nocheck==True, check is skipped

guess_regularizing(x, bcheck=1)

Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:

x array of shape (n_samples,dim) :

the data used in the estimation process

initialize(x)

initialize z using a k-means algorithm, then update the parameters

Parameters:

x: array of shape (nb_samples,self.dim) :

the data used in the estimation process

initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Estimation of self given x

Parameters:

x array of shape (n_samples,dim) :

the data from which the model is estimated

z = None: array of shape (n_samples) :

a prior labelling of the data to initialize the computation

niter=100: maximal number of iterations in the estimation process :

delta = 1.e-4: increment of data likelihood at which :

convergence is declared

ninit=1: number of initialization performed :

to reach a good solution

verbose=0: verbosity mode :

Returns:

the best model is returned :

likelihood(x)

return the likelihood of the model for the data x; the values are weighted by the component weights

Parameters:

x array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

like, array of shape(n_samples,self.k) :

component-wise likelihood

map_label(x, like=None)

return the MAP labelling of x

Parameters:

x array of shape (n_samples,dim) :

the data under study

like=None array of shape(n_samples,self.k) :

component-wise likelihood; if like==None, it is recomputed

Returns:

z: array of shape(n_samples): the resulting MAP labelling :

of the rows of x

mixture_likelihood(x)

Returns the likelihood of the mixture for x

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

plugin(means, precisions, weights)

Set manually the weights, means and precision of the model

Parameters:

means: array of shape (self.k,self.dim) :

precisions: array of shape (self.k,self.dim,self.dim) :

or (self.k, self.dim)

weights: array of shape (self.k) :

pop(z)

compute the population, i.e. the statistics of allocation

Parameters:

z array of shape (nb_samples), type = np.int :

the allocation variable

Returns:

hist : array shape (self.k) count variable

probability_under_prior()

Compute the probability of the current parameters of self given the priors

sample(x, niter=1, mem=0, verbose=0)

sample the indicator and parameters

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

niter=1 : the number of iterations to perform

mem=0: if mem, the best values of the parameters are computed :

verbose=0: verbosity mode :

Returns:

best_weights: array of shape (self.k) :

best_means: array of shape (self.k, self.dim) :

best_precisions: array of shape (self.k, self.dim, self.dim) :

possibleZ: array of shape (nb_samples, niter) :

the z that gives the highest posterior to the data is returned first

sample_and_average(x, niter=1, verbose=0)

sample the indicator and parameters; the average values for weights, means and precisions are returned

Parameters:

x = array of shape (nb_samples,dim) :

the data from which bic is computed

niter=1: number of iterations :

Returns:

weights: array of shape (self.k) :

means: array of shape (self.k,self.dim) :

precisions: array of shape (self.k,self.dim,self.dim) :

or (self.k, self.dim); these are the average parameters across samplings

Notes

All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).

fix: implement a permutation procedure for component identification

sample_indicator(like)

sample the indicator from the likelihood

Parameters:

like: array of shape (nb_samples,self.k) :

component-wise likelihood

Returns:

z: array of shape(nb_samples): a draw of the membership variable :

set_priors(prior_means, prior_weights, prior_scale, prior_dof, prior_shrinkage)

Set the prior of the BGMM

Parameters:

prior_means: array of shape (self.k,self.dim) :

prior_weights: array of shape (self.k) :

prior_scale: array of shape (self.k,self.dim,self.dim) :

prior_dof: array of shape (self.k) :

prior_shrinkage: array of shape (self.k) :

show(x, gd, density=None, axes=None)

Function to plot a GMM, still in progress. Currently works only in 1D and 2D

Parameters:

x: array of shape(n_samples, dim) :

the data under study

gd: GridDescriptor instance :

density: array of shape (prod(gd.n_bins)) :

density of the model on the discrete grid implied by gd; by default, this is recomputed

show_components(x, gd, density=None, mpaxes=None)

Function to plot a GMM – Currently, works only in 1D

Parameters:

x: array of shape(n_samples, dim) :

the data under study

gd: GridDescriptor instance :

density: array of shape (prod(gd.n_bins)) :

density of the model on the discrete grid implied by gd; by default, this is recomputed

mpaxes: axes handle to make the figure, optional, :

if None, a new figure is created

test(x, tiny=1e-15)

Returns the log-likelihood of the mixture for x

Parameters:

x array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

ll: array of shape(n_samples) :

the log-likelihood of the rows of x

train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Idem initialize_and_estimate

unweighted_likelihood(x)

return the likelihood of each datum under each component; the values are not weighted by the component weights

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

like, array of shape(n_samples,self.k) :

unweighted component-wise likelihood

Notes

Hopefully faster

unweighted_likelihood_(x)

return the likelihood of each datum under each component; the values are not weighted by the component weights

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

like, array of shape(n_samples,self.k) :

unweighted component-wise likelihood

update(x, z)

update function (draw a sample of the GMM parameters)

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

z array of shape (nb_samples), type = np.int :

the corresponding classification

update_means(x, z)

Given the allocation vector z, and the corresponding data x, resample the mean

Parameters:

x: array of shape (nb_samples,self.dim) :

the data used in the estimation process

z: array of shape (nb_samples), type = np.int :

the corresponding classification

update_precisions(x, z)

Given the allocation vector z, and the corresponding data x, resample the precisions

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

z array of shape (nb_samples), type = np.int :

the corresponding classification

update_weights(z)

Given the allocation vector z, resample the weights parameter

Parameters:

z array of shape (nb_samples), type = np.int :

the allocation variable

VBGMM

class nipy.algorithms.clustering.bgmm.VBGMM(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)

Bases: nipy.algorithms.clustering.bgmm.BGMM

Subclass of Bayesian GMMs (BGMM) that implements Variational Bayes estimation of the parameters

Methods

average_log_like(x[, tiny]) returns the averaged log-likelihood of the model for the dataset x
bayes_factor(x, z[, nperm, verbose]) Evaluate the Bayes Factor of the current model using Chib’s method
bic(like[, tiny]) Computation of bic approximation of evidence
check() Checking the shape of different matrices involved in the model
check_x(x) essentially check that x.shape[1]==self.dim
conditional_posterior_proba(x, z[, perm]) Compute the probability of the current parameters of self
estimate(x[, niter, delta, verbose]) estimation of self given x
evidence(x[, like, verbose]) computation of evidence bound aka free energy
guess_priors(x[, nocheck]) Set the priors so that they are weakly informative
guess_regularizing(x[, bcheck]) Set the regularizing priors as weakly informative
initialize(x) initialize z using a k-means algorithm, then update the parameters
initialize_and_estimate(x[, z, niter, …]) Estimation of self given x
likelihood(x) return the likelihood of the model for the data x
map_label(x[, like]) return the MAP labelling of x
mixture_likelihood(x) Returns the likelihood of the mixture for x
plugin(means, precisions, weights) Set manually the weights, means and precision of the model
pop(like[, tiny]) compute the population, i.e. the statistics of allocation
probability_under_prior() Compute the probability of the current parameters of self
sample(x[, niter, mem, verbose]) sample the indicator and parameters
sample_and_average(x[, niter, verbose]) sample the indicator and parameters
sample_indicator(like) sample the indicator from the likelihood
set_priors(prior_means, prior_weights, …) Set the prior of the BGMM
show(x, gd[, density, axes]) Function to plot a GMM, still in progress
show_components(x, gd[, density, mpaxes]) Function to plot a GMM – Currently, works only in 1D
test(x[, tiny]) Returns the log-likelihood of the mixture for x
train(x[, z, niter, delta, ninit, verbose]) Idem initialize_and_estimate
unweighted_likelihood(x) return the likelihood of each datum under each component
unweighted_likelihood_(x) return the likelihood of each datum under each component
update(x, z) update function (draw a sample of the GMM parameters)
update_means(x, z) Given the allocation vector z,
update_precisions(x, z) Given the allocation vector z,
update_weights(z) Given the allocation vector z, resample the weights parameter
__init__(k=1, dim=1, means=None, precisions=None, weights=None, shrinkage=None, dof=None)
average_log_like(x, tiny=1e-15)

returns the averaged log-likelihood of the model for the dataset x

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

tiny = 1.e-15: a small constant to avoid numerical singularities :

bayes_factor(x, z, nperm=0, verbose=0)

Evaluate the Bayes Factor of the current model using Chib’s method

Parameters:

x: array of shape (nb_samples,dim) :

the data from which bic is computed

z: array of shape (nb_samples), type = np.int :

the corresponding classification

nperm=0: int :

the number of permutations to sample to model the label switching issue in the computation of the Bayes Factor. By default, exhaustive permutations are used

verbose=0: verbosity mode :

Returns:

bf (float) the computed evidence (Bayes factor) :

Notes

See: Siddhartha Chib, "Marginal Likelihood from the Gibbs Output", Journal of the American Statistical Association, Vol. 90, 1995

bic(like, tiny=1e-15)

Computation of bic approximation of evidence

Parameters:

like, array of shape (n_samples, self.k) :

component-wise likelihood

tiny=1.e-15, a small constant to avoid numerical singularities :

Returns:

the bic value, float :

check()

Checking the shape of different matrices involved in the model

check_x(x)

essentially check that x.shape[1]==self.dim

x is returned, possibly reshaped

conditional_posterior_proba(x, z, perm=None)

Compute the probability of the current parameters of self given x and z

Parameters:

x: array of shape (nb_samples, dim), :

the data from which bic is computed

z: array of shape (nb_samples), type = np.int, :

the corresponding classification

perm: array of shape (nperm, self.k), type = np.int, optional :

all permutations of z under which things will be recomputed. By default, no permutation is performed

estimate(x, niter=100, delta=0.0001, verbose=0)

estimation of self given x

Parameters:

x array of shape (nb_samples,dim) :

the data from which the model is estimated

z = None: array of shape (nb_samples) :

a prior labelling of the data to initialize the computation

niter=100: maximal number of iterations in the estimation process :

delta = 1.e-4: increment of data likelihood at which :

convergence is declared

verbose=0: :

verbosity mode

evidence(x, like=None, verbose=0)

computation of evidence bound aka free energy

Parameters:

x array of shape (nb_samples,dim) :

the data from which evidence is computed

like=None: array of shape (nb_samples, self.k), optional :

component-wise likelihood If None, it is recomputed

verbose=0: verbosity mode :

Returns:

ev (float) the computed evidence :
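
A minimal usage sketch (illustrative, not taken from the nipy documentation): variational estimation followed by the evidence bound, using only the calls documented here; behaviour depends on the installed nipy version.

import numpy as np
from nipy.algorithms.clustering.bgmm import VBGMM

x = np.random.randn(200, 2)      # toy data (illustrative only)

model = VBGMM(k=3, dim=2)
model.guess_priors(x)            # weakly informative priors derived from the data
model.initialize(x)              # k-means initialization, then parameter update
model.estimate(x, niter=100)     # Variational Bayes updates
ev = model.evidence(x)           # free-energy bound on the model evidence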

guess_priors(x, nocheck=0)

Set the priors so that they are weakly informative; this is from Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:

x, array of shape (nb_samples,self.dim) :

the data used in the estimation process

nocheck: boolean, optional, :

if nocheck==True, check is skipped

guess_regularizing(x, bcheck=1)

Set the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007)

Parameters:

x array of shape (n_samples,dim) :

the data used in the estimation process

initialize(x)

initialize z using a k-means algorithm, then update the parameters

Parameters:

x: array of shape (nb_samples,self.dim) :

the data used in the estimation process

initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Estimation of self given x

Parameters:

x array of shape (n_samples,dim) :

the data from which the model is estimated

z = None: array of shape (n_samples) :

a prior labelling of the data to initialize the computation

niter=100: maximal number of iterations in the estimation process :

delta = 1.e-4: increment of data likelihood at which :

convergence is declared

ninit=1: number of initialization performed :

to reach a good solution

verbose=0: verbosity mode :

Returns:

the best model is returned :

likelihood(x)

return the likelihood of the model for the data x; the values are weighted by the component weights

Parameters:

x: array of shape (nb_samples, self.dim) :

the data used in the estimation process

Returns:

like: array of shape(nb_samples, self.k) :

component-wise likelihood

map_label(x, like=None)

return the MAP labelling of x

Parameters:

x array of shape (nb_samples,dim) :

the data under study

like=None array of shape(nb_samples,self.k) :

component-wise likelihood; if like==None, it is recomputed

Returns:

z: array of shape(nb_samples): the resulting MAP labelling :

of the rows of x

mixture_likelihood(x)

Returns the likelihood of the mixture for x

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

plugin(means, precisions, weights)

Set manually the weights, means and precision of the model

Parameters:

means: array of shape (self.k,self.dim) :

precisions: array of shape (self.k,self.dim,self.dim) :

or (self.k, self.dim)

weights: array of shape (self.k) :

pop(like, tiny=1e-15)

compute the population, i.e. the statistics of allocation

Parameters:

like array of shape (nb_samples, self.k): :

the likelihood of each item being in each class

probability_under_prior()

Compute the probability of the current parameters of self given the priors

sample(x, niter=1, mem=0, verbose=0)

sample the indicator and parameters

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

niter=1 : the number of iterations to perform

mem=0: if mem, the best values of the parameters are computed :

verbose=0: verbosity mode :

Returns:

best_weights: array of shape (self.k) :

best_means: array of shape (self.k, self.dim) :

best_precisions: array of shape (self.k, self.dim, self.dim) :

possibleZ: array of shape (nb_samples, niter) :

the z that gives the highest posterior to the data is returned first

sample_and_average(x, niter=1, verbose=0)

sample the indicator and parameters; the average values for weights, means and precisions are returned

Parameters:

x = array of shape (nb_samples,dim) :

the data from which bic is computed

niter=1: number of iterations :

Returns:

weights: array of shape (self.k) :

means: array of shape (self.k,self.dim) :

precisions: array of shape (self.k,self.dim,self.dim) :

or (self.k, self.dim); these are the average parameters across samplings

Notes

All this makes sense only if no label switching has occurred, so this is wrong in general (asymptotically).

fix: implement a permutation procedure for component identification

sample_indicator(like)

sample the indicator from the likelihood

Parameters:

like: array of shape (nb_samples,self.k) :

component-wise likelihood

Returns:

z: array of shape(nb_samples): a draw of the membership variable :

set_priors(prior_means, prior_weights, prior_scale, prior_dof, prior_shrinkage)

Set the prior of the BGMM

Parameters:

prior_means: array of shape (self.k,self.dim) :

prior_weights: array of shape (self.k) :

prior_scale: array of shape (self.k,self.dim,self.dim) :

prior_dof: array of shape (self.k) :

prior_shrinkage: array of shape (self.k) :

show(x, gd, density=None, axes=None)

Function to plot a GMM, still in progress. Currently works only in 1D and 2D

Parameters:

x: array of shape(n_samples, dim) :

the data under study

gd: GridDescriptor instance :

density: array of shape (prod(gd.n_bins)) :

density of the model on the discrete grid implied by gd; by default, this is recomputed

show_components(x, gd, density=None, mpaxes=None)

Function to plot a GMM – Currently, works only in 1D

Parameters:

x: array of shape(n_samples, dim) :

the data under study

gd: GridDescriptor instance :

density: array of shape (prod(gd.n_bins)) :

density of the model on the discrete grid implied by gd; by default, this is recomputed

mpaxes: axes handle to make the figure, optional, :

if None, a new figure is created

test(x, tiny=1e-15)

Returns the log-likelihood of the mixture for x

Parameters:

x array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

ll: array of shape(n_samples) :

the log-likelihood of the rows of x

train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Idem initialize_and_estimate

unweighted_likelihood(x)

return the likelihood of each datum under each component; the values are not weighted by the component weights

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

like, array of shape(n_samples,self.k) :

unweighted component-wise likelihood

Notes

Hopefully faster

unweighted_likelihood_(x)

return the likelihood of each datum under each component; the values are not weighted by the component weights

Parameters:

x: array of shape (n_samples,self.dim) :

the data used in the estimation process

Returns:

like, array of shape(n_samples,self.k) :

unweighted component-wise likelihood

update(x, z)

update function (draw a sample of the GMM parameters)

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

z array of shape (nb_samples), type = np.int :

the corresponding classification

update_means(x, z)

Given the allocation vector z, and the corresponding data x, resample the mean

Parameters:

x: array of shape (nb_samples,self.dim) :

the data used in the estimation process

z: array of shape (nb_samples), type = np.int :

the corresponding classification

update_precisions(x, z)

Given the allocation vector z, and the corresponding data x, resample the precisions

Parameters:

x array of shape (nb_samples,self.dim) :

the data used in the estimation process

z array of shape (nb_samples), type = np.int :

the corresponding classification

update_weights(z)

Given the allocation vector z, resample the weights parameter

Parameters:

z array of shape (nb_samples), type = np.int :

the allocation variable

Functions

nipy.algorithms.clustering.bgmm.detsh(H)

Routine for the computation of determinants of symmetric positive matrices

Parameters:

H array of shape(n,n) :

the input matrix, assumed symmetric and positive

Returns:

dh: float, the determinant :
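
For reference, the determinant of a symmetric positive matrix can be computed from its eigenvalues; a minimal NumPy sketch of that idea (detsh_sketch is a hypothetical name, not the library function):

import numpy as np

def detsh_sketch(H):
    # determinant of a symmetric positive matrix via its eigenvalues
    return np.prod(np.linalg.eigvalsh(H))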

nipy.algorithms.clustering.bgmm.dirichlet_eval(w, alpha)

Evaluate the probability of a certain discrete draw w from the Dirichlet density with parameters alpha

Parameters:

w: array of shape (n) :

alpha: array of shape (n) :
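
The Dirichlet density being evaluated is p(w | alpha) = Gamma(sum alpha) / prod Gamma(alpha_i) * prod w_i^(alpha_i - 1); a self-contained sketch of that formula (not the library code):

import numpy as np
from scipy.special import gammaln

def dirichlet_eval_sketch(w, alpha):
    # log-normalizer of the Dirichlet density, then the density itself
    log_norm = gammaln(np.sum(alpha)) - np.sum(gammaln(alpha))
    return np.exp(log_norm + np.sum((alpha - 1.0) * np.log(w)))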

nipy.algorithms.clustering.bgmm.dkl_dirichlet(w1, w2)

Returns the KL divergence between two Dirichlet distributions

Parameters:

w1: array of shape(n), :

the parameters of the first Dirichlet density

w2: array of shape(n), :

the parameters of the second Dirichlet density
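
The KL divergence between Dirichlet densities has a standard closed form in terms of the gamma and digamma functions; an illustrative sketch of that formula (dkl_dirichlet_sketch is a hypothetical name, not the library code):

import numpy as np
from scipy.special import gammaln, psi

def dkl_dirichlet_sketch(w1, w2):
    # KL(Dir(w1) || Dir(w2)) in closed form
    a0, b0 = np.sum(w1), np.sum(w2)
    kl = gammaln(a0) - np.sum(gammaln(w1)) - gammaln(b0) + np.sum(gammaln(w2))
    return kl + np.sum((w1 - w2) * (psi(w1) - psi(a0)))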

nipy.algorithms.clustering.bgmm.dkl_gaussian(m1, P1, m2, P2)

Returns the KL divergence between Gaussian densities

Parameters:

m1: array of shape (n), :

the mean parameter of the first density

P1: array of shape(n,n), :

the precision parameters of the first density

m2: array of shape (n), :

the mean parameter of the second density

P2: array of shape(n,n), :

the precision parameters of the second density
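
In the precision parameterization used here, KL(N(m1, P1^-1) || N(m2, P2^-1)) = 0.5 * [tr(P2 P1^-1) + (m2-m1)' P2 (m2-m1) - n + ln|P1| - ln|P2|]; a minimal sketch of that formula (not the library code):

import numpy as np

def dkl_gaussian_sketch(m1, P1, m2, P2):
    # KL divergence between two Gaussians given means and precision matrices
    n = m1.size
    d = m2 - m1
    trace_term = np.trace(P2 @ np.linalg.inv(P1))
    quad_term = d @ P2 @ d
    logdet_term = np.log(np.linalg.det(P1)) - np.log(np.linalg.det(P2))
    return 0.5 * (trace_term + quad_term - n + logdet_term)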

nipy.algorithms.clustering.bgmm.dkl_wishart(a1, B1, a2, B2)

Returns the KL divergence between two Wishart distributions with parameters (a1, B1) and (a2, B2)

Parameters:

a1: Float, :

degrees of freedom of the first density

B1: array of shape(n,n), :

scale matrix of the first density

a2: Float, :

degrees of freedom of the second density

B2: array of shape(n,n), :

scale matrix of the second density

Returns:

dkl: float, the Kullback-Leibler divergence :

nipy.algorithms.clustering.bgmm.generate_Wishart(n, V)

Generate a sample from Wishart density

Parameters:

n: float, :

the number of degrees of freedom of the Wishart density

V: array of shape (n,n) :

the scale matrix of the Wishart density

Returns:

W: array of shape (n,n) :

the draw from Wishart density
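
For integer dof n >= dim, a Wishart(n, V) draw can be built as the sum of outer products of n Gaussian draws with covariance V; a sketch of that construction (the library routine may use a different construction, e.g. Bartlett's decomposition):

import numpy as np

def generate_wishart_sketch(n, V, rng=None):
    # valid for integer n >= V.shape[0]; W = sum of n outer products of N(0, V) draws
    rng = np.random.default_rng() if rng is None else rng
    x = rng.multivariate_normal(np.zeros(V.shape[0]), V, size=int(n))
    return x.T @ x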

nipy.algorithms.clustering.bgmm.generate_normals(m, P)

Generate a Gaussian sample with mean m and precision P

Parameters:

m array of shape n: the mean vector :

P array of shape (n,n): the precision matrix :

Returns:

ng : array of shape(n): a draw from the gaussian density
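
Given a precision matrix P = L L' (Cholesky), m + solve(L', z) with z ~ N(0, I) has covariance inv(P); a minimal sketch of such a draw (generate_normals_sketch is a hypothetical name, not the library code):

import numpy as np

def generate_normals_sketch(m, P, rng=None):
    # draw x ~ N(m, inv(P)) via the Cholesky factor of the precision matrix
    rng = np.random.default_rng() if rng is None else rng
    L = np.linalg.cholesky(P)              # P = L @ L.T
    z = rng.standard_normal(m.size)
    return m + np.linalg.solve(L.T, z)     # covariance inv(L.T) @ inv(L) = inv(P)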

nipy.algorithms.clustering.bgmm.generate_perm(k, nperm=100)

returns an array of shape(nbperm, k) representing the permutations of k elements

Parameters:

k, int the number of elements to be permuted :

nperm=100 the maximal number of permutations :

if gamma(k+1)>nperm: only nperm random draws are generated :

Returns:

p: array of shape(nperm, k): each row is a permutation of k elements
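
An illustrative sketch of the same behaviour: enumerate all k! permutations when there are at most nperm of them, otherwise draw nperm random permutations (not the library code):

import itertools
import math
import numpy as np

def generate_perm_sketch(k, nperm=100, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    if math.factorial(k) <= nperm:
        # exhaustive enumeration of the k! permutations
        return np.array(list(itertools.permutations(range(k))))
    # otherwise, only nperm random permutations
    return np.array([rng.permutation(k) for _ in range(nperm)])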

nipy.algorithms.clustering.bgmm.multinomial(probabilities)

Generate samples from a multinomial distribution

Parameters:

probabilities: array of shape (nelements, nclasses): :

likelihood of each element belonging to each class; each row is assumed to sum to 1. One sample is drawn from each row, resulting in the draws described below.

Returns:

z array of shape (nelements): the draws, :

that take values in [0..nclasses-1]
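
One draw per row can be obtained by comparing a uniform variate to the row's cumulative probabilities; a minimal sketch (multinomial_sketch is a hypothetical name, not the library code):

import numpy as np

def multinomial_sketch(probabilities, rng=None):
    # one categorical draw per row; each row is assumed to sum to 1
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(probabilities, axis=1)
    u = rng.random(probabilities.shape[0])
    return (u[:, None] > cdf).sum(axis=1)   # values in [0 .. nclasses-1]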

nipy.algorithms.clustering.bgmm.normal_eval(mu, P, x, dP=None)

Probability of x under normal(mu, inv(P))

Parameters:

mu: array of shape (n), :

the mean parameter

P: array of shape (n, n), :

the precision matrix

x: array of shape (n), :

the data to be evaluated

Returns:

(float) the density :
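
The evaluated density is (2 pi)^(-n/2) |P|^(1/2) exp(-0.5 (x-mu)' P (x-mu)); a self-contained sketch of that formula (not the library code):

import numpy as np

def normal_eval_sketch(mu, P, x):
    # Gaussian density at x, with mean mu and precision matrix P
    n = mu.size
    d = x - mu
    return np.sqrt(np.linalg.det(P)) * (2 * np.pi) ** (-0.5 * n) * np.exp(-0.5 * d @ P @ d)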

nipy.algorithms.clustering.bgmm.wishart_eval(n, V, W, dV=None, dW=None, piV=None)

Evaluation of the probability of W under Wishart(n,V)

Parameters:

n: float, :

the number of degrees of freedom (dofs)

V: array of shape (n,n) :

the scale matrix of the Wishart density

W: array of shape (n,n) :

the sample to be evaluated

dV: float, optional, :

determinant of V

dW: float, optional, :

determinant of W

piV: array of shape (n,n), optional :

inverse of V

Returns:

(float) the density :
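
The standard Wishart(n, V) density at W is |W|^((n-p-1)/2) exp(-tr(V^-1 W)/2) / (2^(np/2) |V|^(n/2) Gamma_p(n/2)), where p is the matrix dimension; a sketch of that formula (not the library code; the optional dV, dW and piV arguments above presumably let precomputed determinants and the inverse of V be reused):

import numpy as np
from scipy.special import multigammaln

def wishart_eval_sketch(n, V, W):
    # log-density of W under Wishart(n, V), exponentiated at the end
    p = V.shape[0]
    log_dens = (0.5 * (n - p - 1) * np.log(np.linalg.det(W))
                - 0.5 * np.trace(np.linalg.solve(V, W))
                - 0.5 * n * p * np.log(2)
                - 0.5 * n * np.log(np.linalg.det(V))
                - multigammaln(0.5 * n, p))
    return np.exp(log_dens)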