mofaflex.MOFAFLEX

class mofaflex.MOFAFLEX(data, *args)

Fit the model using the provided data.

Parameters:

data (MuData | Mapping[str, Mapping[str, AnnData]]) –
Can be either of the following:
- A MuData object.
- A nested dict with group names as keys, view names as subkeys, and AnnData objects as values (incompatible with TrainingOptions.group_by).
*args (_Options) – Options for training.
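
The nested-dict form of data described above can be sketched without any dependencies. In the sketch below, placeholder strings stand in for AnnData objects, and describe_nested_data is a hypothetical helper written for illustration only; it is not part of mofaflex. The group and view names are made up.

```python
from collections.abc import Mapping


def describe_nested_data(data):
    """Return the view names found under each group of a nested mapping.

    Hypothetical helper for illustration; it is not part of mofaflex.
    """
    layout = {}
    for group_name, views in data.items():
        # Each group must itself be a mapping from view name to data object.
        if not isinstance(views, Mapping):
            raise TypeError(f"group {group_name!r} must map view names to objects")
        layout[group_name] = list(views.keys())
    return layout


# Outer keys are group names, inner keys are view names.
# The placeholder strings stand in for AnnData objects.
data = {
    "group_a": {"rna": "<AnnData>", "protein": "<AnnData>"},
    "group_b": {"rna": "<AnnData>", "protein": "<AnnData>"},
}

layout = describe_nested_data(data)
print(layout)
```

With real AnnData values, a mapping of this shape is what the data parameter accepts in its nested-dict form.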

Attributes table

`covariates`	Covariates for each group, if using a GP prior.
`covariates_names`	Covariate names for each group where they could be inferred from the input.
`factor_names`	Factor names.
`factor_order`	Ordering of factors by explained variance (highest to lowest).
`feature_names`	Feature names for each view.
`gp_group_correlation`	Between-group correlation for each factor, if using a GP prior.
`gp_lengthscale`	Inferred lengthscales for each factor, if using a GP prior.
`gp_scale`	Inferred variance scales (smoothness) for each factor, if using a GP prior.
`group_names`	Group names.
`n_factors`	Total number of factors.
`n_features`	Number of features in each view.
`n_features_total`	Total number of features.
`n_groups`	Number of groups.
`n_guided_factors`	Number of guided factors.
`n_informed_factors`	Number of informed factors.
`n_samples`	Number of samples in each group.
`n_samples_total`	Total number of samples.
`n_uninformed_factors`	Number of uninformed factors.
`n_views`	Number of views.
`sample_names`	Sample names for each group.
`training_loss`	Total loss (negative ELBO) for each training epoch.
`view_names`	View names.
`warped_covariates`	Time-warped covariates for each group, if using a GP prior and dynamic time warping was enabled.

Methods table

`get_annotations`([return_type, ordered])	Get the annotation matrices for each view.
`get_dispersion`([return_type, moment])	Get the dispersion vectors for each view.
`get_factors`([return_type, moment, ...])	Get the factor matrices Z for each group.
`get_gps`([return_type, moment, x, ...])	Get all latent functions.
`get_r2`([total, ordered])	Get the fraction of explained variance for each view and group.
`get_significant_factor_annotations`()	Get the results of significance testing of annotations against factors.
`get_sparse_factor_probabilities`([...])	Get the probabilities that a factor value is non-sparse for each group with a spike and slab factor prior.
`get_sparse_weight_probabilities`([...])	Get the probabilities that a weight value is non-sparse for each view with a spike and slab view prior.
`get_weights`([return_type, moment, ...])	Get the weight matrices W for each view.
`impute_data`(data[, missing_only])	Impute values in the training data using the trained factorization.
`load`(path[, map_location])	Load a saved MOFAFLEX model.

Attributes

MOFAFLEX.covariates: Covariates for each group, if using a GP prior.

MOFAFLEX.covariates_names: Covariate names for each group where they could be inferred from the input.

MOFAFLEX.factor_names: Factor names.

MOFAFLEX.factor_order: Ordering of factors by explained variance (highest to lowest).

MOFAFLEX.feature_names: Feature names for each view.

MOFAFLEX.gp_group_correlation: Between-group correlation for each factor, if using a GP prior.

MOFAFLEX.gp_lengthscale: Inferred lengthscales for each factor, if using a GP prior.

MOFAFLEX.gp_scale: Inferred variance scales (smoothness) for each factor, if using a GP prior.

MOFAFLEX.group_names: Group names.

MOFAFLEX.n_factors: Total number of factors.

MOFAFLEX.n_features: Number of features in each view.

MOFAFLEX.n_features_total: Total number of features.

MOFAFLEX.n_groups: Number of groups.

MOFAFLEX.n_guided_factors: Number of guided factors.

MOFAFLEX.n_informed_factors: Number of informed factors.

MOFAFLEX.n_samples: Number of samples in each group.

MOFAFLEX.n_samples_total: Total number of samples.

MOFAFLEX.n_uninformed_factors: Number of uninformed factors.

MOFAFLEX.n_views: Number of views.

MOFAFLEX.sample_names: Sample names for each group.

MOFAFLEX.training_loss: Total loss (negative ELBO) for each training epoch.

MOFAFLEX.view_names: View names.

MOFAFLEX.warped_covariates: Time-warped covariates for each group, if using a GP prior and dynamic time warping was enabled.
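
The ordering described for factor_order (and used by the ordered argument of several methods) can be illustrated with a small sketch. It assumes the ordering is a list of integer factor indices sorted by explained variance in descending order; the exact type mofaflex stores is not stated here. The function name and the numbers are made up for illustration.

```python
def order_by_explained_variance(r2_per_factor):
    """Return factor indices sorted by explained variance, highest first.

    Illustrative only; not the mofaflex implementation.
    """
    return sorted(
        range(len(r2_per_factor)),
        key=lambda i: r2_per_factor[i],
        reverse=True,
    )


factor_names = ["Factor 1", "Factor 2", "Factor 3"]
r2_per_factor = [0.05, 0.30, 0.12]  # made-up explained-variance fractions

order = order_by_explained_variance(r2_per_factor)
ordered_names = [factor_names[i] for i in order]
print(order, ordered_names)
```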

Methods

MOFAFLEX.get_annotations(return_type='pandas', ordered=False)

Get the annotation matrices for each view.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
ordered (default: False) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_dispersion(return_type='pandas', moment='mean')

Get the dispersion vectors for each view.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.

Return type:

dict[str, Series | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_factors(return_type='pandas', moment='mean', sparse_type='mix', ordered=False)

Get the factor matrices Z for each group.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.
sparse_type (Literal['raw', 'mix', 'thresh'] (default: 'mix')) –
How to handle sparsity when using the spike and slab prior.
- raw: Do nothing, return inferred values for all entries.
- mix: Return the corresponding moment of a mixture of two Normal distributions, one centered at 0 and the other centered at the inferred non-sparse value. The mixture is weighted by the inferred sparsity probability. This is what MOFA does.
- thresh: Set all values with a sparsity probability > 0.5 to 0.
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]
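
The three sparse_type options can be compared on a single entry. The sketch below covers only the 'mean' moment and assumes that "sparsity probability" means the probability that the entry is zero, which matches the thresh rule above. The function apply_sparse_type and its arguments are hypothetical and are not part of mofaflex.

```python
def apply_sparse_type(value, p_sparse, sparse_type="mix"):
    """Post-process one inferred mean according to sparse_type.

    value:    inferred (non-sparse) posterior mean of the entry
    p_sparse: assumed probability that the entry is zero
    Illustrative only; not the mofaflex implementation.
    """
    if sparse_type == "raw":
        # Return the inferred value unchanged.
        return value
    if sparse_type == "mix":
        # Mean of the two-component mixture: p_sparse * 0 + (1 - p_sparse) * value.
        return (1.0 - p_sparse) * value
    if sparse_type == "thresh":
        # Zero out entries that are more likely sparse than not.
        return 0.0 if p_sparse > 0.5 else value
    raise ValueError(f"unknown sparse_type: {sparse_type!r}")


results = {
    kind: (apply_sparse_type(2.0, 0.25, kind), apply_sparse_type(2.0, 0.75, kind))
    for kind in ("raw", "mix", "thresh")
}
print(results)
```

Under these assumptions, mix shrinks every entry towards zero in proportion to its sparsity probability, while thresh leaves an entry untouched or removes it entirely.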

MOFAFLEX.get_gps(return_type='pandas', moment='mean', x=None, batch_size=None, ordered=False)

Get all latent functions.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.
x (Mapping[str, ndarray | Tensor] | None (default: None)) – Covariate values for each group. If None, will return latent function values at covariate coordinates used for training.
batch_size (int | None (default: None)) – Minibatch size. Only has an effect if x is not None. Defaults to the minibatch size used for training.
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_r2(total=False, ordered=False)

Get the fraction of explained variance for each view and group.

Parameters:

total (bool (default: False)) – If True, returns a DataFrame with the fraction of explained variance of the full model for each group (columns) and view (rows). Otherwise, returns a dict keyed by group name, where each value is a DataFrame with the fraction of explained variance for each view (columns) and factor (rows).
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest). Has no effect if total == True.

Return type:

DataFrame | dict[str, DataFrame]
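
As a reminder of what a fraction of explained variance measures, the sketch below uses the common definition 1 - SS_res / SS_tot on a toy vector. This is an assumption for illustration: the exact computation in mofaflex (for example, how non-Gaussian likelihoods or preprocessing are handled) is not described here and may differ. The function name and the values are made up.

```python
def explained_variance_fraction(observed, reconstructed):
    """Return 1 - SS_res / SS_tot for two equal-length sequences.

    Illustrative only; not the mofaflex implementation.
    """
    mean = sum(observed) / len(observed)
    # Total variation of the observed data around its mean.
    ss_tot = sum((y - mean) ** 2 for y in observed)
    # Variation left unexplained by the reconstruction.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, reconstructed))
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]
r2_partial = explained_variance_fraction(observed, [1.0, 2.0, 3.0, 3.0])
r2_perfect = explained_variance_fraction(observed, observed)
print(r2_partial, r2_perfect)
```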

MOFAFLEX.get_significant_factor_annotations()

Get the results of significance testing of annotations against factors.

The significance testing is an implementation of PCGSE [FLM15]. While originally intended to assign annotations to uninformed factors, here it serves as a diagnostic for finding factors that are mismatched to their annotations.

Return type:

dict[str, DataFrame] | None

Returns:

PCGSE results for each view, or None if the model does not have prior annotations.

MOFAFLEX.get_sparse_factor_probabilities(return_type='pandas', ordered=False)

Get the probabilities that a factor value is non-sparse for each group with a spike and slab factor prior.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_sparse_weight_probabilities(return_type='pandas', ordered=False)

Get the probabilities that a weight value is non-sparse for each view with a spike and slab view prior.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_weights(return_type='pandas', moment='mean', sparse_type='mix', ordered=False)

Get the weight matrices W for each view.

Parameters:

return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.
moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.
sparse_type (Literal['raw', 'mix', 'thresh'] (default: 'mix')) –
How to handle sparsity when using the spike and slab prior.
- raw: Do nothing, return inferred values for all entries.
- mix: Return the corresponding moment of a mixture of two Normal distributions, one centered at 0 and the other centered at the inferred non-sparse value. The mixture is weighted by the inferred sparsity probability. This is what MOFA does.
- thresh: Set all values with a sparsity probability > 0.5 to 0.
ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.impute_data(data, missing_only=False)

Impute values in the training data using the trained factorization.

Parameters:

data (MuData | Mapping[str, Mapping[str, AnnData]]) – The data the model was trained on.
missing_only (default: False) – Only impute missing values in the data.

Return type:

dict[str, dict[str, AnnData]]

Returns:

Nested dictionary of AnnData objects with either fully imputed data or with only the missing values filled in. In both cases, the returned data will be preprocessed. In the case of Gaussian distributed data, that involves centering and scaling.
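
The difference between the two missing_only settings can be shown on a toy vector. In this sketch, None marks a missing value and reconstructed stands in for the values predicted by a trained factorization. Both names are hypothetical, and the preprocessing mentioned above (such as centering and scaling) is deliberately left out.

```python
def impute(observed, reconstructed, missing_only=False):
    """Combine observed values with model reconstructions.

    Illustrative only; not the mofaflex implementation.
    """
    if not missing_only:
        # Fully imputed data: every entry comes from the reconstruction.
        return list(reconstructed)
    # Keep observed entries and fill only the gaps.
    return [r if o is None else o for o, r in zip(observed, reconstructed)]


observed = [1.0, None, 3.0]
reconstructed = [0.9, 2.1, 3.2]  # made-up model predictions

fully_imputed = impute(observed, reconstructed)
gaps_filled = impute(observed, reconstructed, missing_only=True)
print(fully_imputed, gaps_filled)
```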

classmethod MOFAFLEX.load(path, map_location=None)

Load a saved MOFAFLEX model.

Parameters:

path (str | Path) – Path to the saved model file.
map_location (default: None) – Specify how to remap storage locations for PyTorch tensors. See the torch.load documentation for details.

Return type:

MOFAFLEX
