sklearn.decomposition.fastica

sklearn.decomposition.fastica(X, n_components=None, algorithm='parallel', whiten=True, fun='logcosh', fun_args=None, max_iter=200, tol=0.0001, w_init=None, random_state=None, return_X_mean=False, compute_sources=True, return_n_iter=False)[source]

Perform Fast Independent Component Analysis.

Read more in the User Guide.

Parameters:
X : array-like, shape (n_samples, n_features)

Training vector, where n_samples is the number of samples and n_features is the number of features.

n_components : int, optional

Number of components to extract. If None, no dimension reduction is performed.

algorithm : {‘parallel’, ‘deflation’}, optional

Apply a parallel or deflational FastICA algorithm: ‘parallel’ estimates all components jointly, while ‘deflation’ extracts them one at a time.

whiten : boolean, optional

If True, perform an initial whitening of the data. If False, the data is assumed to have already been preprocessed: it should be centered, normed and white; otherwise you will get incorrect results. In this case the parameter n_components is ignored.
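For example, when whiten=False the required preprocessing can be done by hand before the call (a minimal sketch, assuming X is an (n_samples, n_features) array; the variable names are illustrative):

import numpy as np

X_centered = X - X.mean(axis=0)            # center each feature
U, d, _ = np.linalg.svd(X_centered, full_matrices=False)
X_white = U * np.sqrt(X.shape[0])          # unit-covariance (white) data
_, W, S = fastica(X_white, whiten=False)   # K is returned as None here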

fun : string or function, optional. Default: ‘logcosh’

The functional form of the G function used in the approximation to neg-entropy. Could be either ‘logcosh’, ‘exp’, or ‘cube’. You can also provide your own function: it should return a tuple containing the value of the function and of its derivative at the point, with the derivative averaged along its last dimension. Example:

import numpy as np

def my_g(x):
    return x ** 3, np.mean(3 * x ** 2, axis=-1)
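Such a function can then be passed directly (a hedged usage sketch, assuming X is an (n_samples, n_features) array):

K, W, S = fastica(X, n_components=2, fun=my_g)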

fun_args : dictionary, optional

Arguments to pass to the functional form. If empty or None and if fun=‘logcosh’, fun_args takes the value {‘alpha’: 1.0}.

max_iter : int, optional

Maximum number of iterations to perform.

tol : float, optional

A positive scalar giving the tolerance at which the un-mixing matrix is considered to have converged.

w_init : (n_components, n_components) array, optional

Initial un-mixing array of shape (n_components, n_components). If None (default), an array of values drawn from a standard normal distribution is used.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

return_X_mean : bool, optional

If True, X_mean is returned too.

compute_sources : bool, optional

If False, sources are not computed, but only the rotation matrix. This can save memory when working with big data. Defaults to True.

return_n_iter : bool, optional

Whether or not to return the number of iterations.

Returns:
K : array, shape (n_components, n_features) | None.

If whiten is True, K is the pre-whitening matrix that projects the data onto the first n_components principal components. If whiten is False, K is None.

W : array, shape (n_components, n_components)

Estimated un-mixing matrix. The mixing matrix is the pseudo-inverse of the overall un-mixing matrix np.dot(W, K) and can be obtained by:

w = np.dot(W, K)        # overall un-mixing matrix, shape (n_components, n_features)
A = np.linalg.pinv(w)   # mixing matrix, equivalent to w.T * inv(w * w.T)

S : array, shape (n_samples, n_components) | None

Estimated source matrix.

X_mean : array, shape (n_features, )

The mean over features. Returned only if return_X_mean is True.

n_iter : int

If the algorithm is “deflation”, n_iter is the maximum number of iterations run across all components. Otherwise, it is the number of iterations taken to converge. This is returned only when return_n_iter is set to True.
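When both flags are set, the optional values are appended to the returned tuple in the order shown above (a minimal sketch, assuming X is an (n_samples, n_features) array):

import numpy as np

K, W, S, X_mean, n_iter = fastica(X, n_components=2, return_X_mean=True, return_n_iter=True)
A = np.linalg.pinv(np.dot(W, K))     # mixing matrix, shape (n_features, n_components)
X_approx = np.dot(S, A.T) + X_mean   # reconstruction, up to the discarded components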

Notes

The data matrix X is considered to be a linear combination of non-Gaussian (independent) components, i.e. X = AS, where the columns of S contain the independent components and A is a linear mixing matrix. In short, ICA attempts to ‘un-mix’ the data by estimating an un-mixing matrix W, where S = W K X.
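As a concrete illustration of this model (a minimal sketch with arbitrary signals; nothing here is prescribed by the API):

import numpy as np
from sklearn.decomposition import fastica

rng = np.random.RandomState(42)
time = np.linspace(0, 8, 2000)

# Two independent, non-Gaussian sources and a 2 x 2 mixing matrix.
S_true = np.c_[np.sin(2 * time), np.sign(np.sin(3 * time))]
A_mix = np.array([[1.0, 1.0], [0.5, 2.0]])
X = np.dot(S_true, A_mix.T)   # observed mixtures, shape (n_samples, 2)

K, W, S = fastica(X, n_components=2, random_state=rng)
# S holds the recovered sources, up to permutation, sign and scale.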

This implementation was originally made for data of shape (n_features, n_samples). Now the input is transposed before the algorithm is applied, making it slightly faster for Fortran-ordered input.

Implemented using FastICA: A. Hyvarinen and E. Oja, “Independent Component Analysis: Algorithms and Applications”, Neural Networks, 13(4-5), 2000, pp. 411-430.