mofaflex.MOFAFLEX#

class mofaflex.MOFAFLEX(data, *args)#

Fit the model using the provided data.

Parameters:
  • data (MuData | Mapping[str, Mapping[str, AnnData]]) –

    can be any of:

    • MuData object

    • Nested dict with group names as keys, view names as subkeys and AnnData objects as values (incompatible with TrainingOptions.group_by)

  • *args (_Options) – Options for training.
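The nested-dict layout can be sketched in plain Python. The helper below is hypothetical and not part of mofaflex: it only checks the two-level group -> view layout described above, and placeholder strings stand in for AnnData objects so the sketch runs without extra dependencies. The group and view names are made up for illustration.

```python
from collections.abc import Mapping


def describe_nested_input(data):
    """Return (group_names, view_names) for a group -> view -> object mapping.

    Hypothetical helper, not part of mofaflex: it only checks the two-level
    layout and does not inspect the stored objects.
    """
    if not isinstance(data, Mapping):
        raise TypeError("expected a mapping of group names to view mappings")
    group_names = list(data)
    view_names = []
    for group, views in data.items():
        if not isinstance(views, Mapping):
            raise TypeError(f"group {group!r} must map view names to AnnData objects")
        for view in views:
            # Keep the first-seen order of view names across groups.
            if view not in view_names:
                view_names.append(view)
    return group_names, view_names


# Placeholder strings stand in for AnnData objects in this sketch.
nested = {
    "group_a": {"rna": "AnnData for rna, group_a", "protein": "AnnData for protein, group_a"},
    "group_b": {"rna": "AnnData for rna, group_b", "protein": "AnnData for protein, group_b"},
}
groups, views = describe_nested_input(nested)
```

With real data, each placeholder would be an AnnData object and the resulting dict would be passed as the data argument.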

Attributes table#

covariates

Covariates for each group, if using a GP prior.

covariates_names

Covariate names for each group where they could be inferred from the input.

factor_names

Factor names.

factor_order

Ordering of factors by explained variance (highest to lowest).

feature_names

Feature names for each view.

gp_group_correlation

Between-group correlation for each factor, if using a GP prior.

gp_lengthscale

Inferred lengthscales for each factor, if using a GP prior.

gp_scale

Inferred variance scales (smoothness) for each factor, if using a GP prior.

group_names

Group names.

n_factors

Total number of factors.

n_features

Number of features in each view.

n_features_total

Total number of features.

n_groups

Number of groups.

n_guided_factors

Number of guided factors.

n_informed_factors

Number of informed factors.

n_samples

Number of samples in each group.

n_samples_total

Total number of samples.

n_uninformed_factors

Number of uninformed factors.

n_views

Number of views.

sample_names

Sample names for each group.

training_loss

Total loss (negative ELBO) for each training epoch.

view_names

View names.

warped_covariates

Time-warped covariates for each group, if using a GP prior and dynamic time warping was enabled.

Methods table#

get_annotations([return_type, ordered])

Get the annotation matrices for each view.

get_dispersion([return_type, moment])

Get the dispersion vectors for each view.

get_factors([return_type, moment, ...])

Get the factor matrices Z for each group.

get_gps([return_type, moment, x, ...])

Get all latent functions.

get_r2([total, ordered])

Get the fraction of explained variance for each view and group.

get_significant_factor_annotations()

Get the results of significance testing of annotations against factors.

get_sparse_factor_probabilities([...])

Get the probabilities that a factor value is non-sparse for each group with a spike and slab factor prior.

get_sparse_weight_probabilities([...])

Get the probabilities that a weight value is non-sparse for each view with a spike and slab view prior.

get_weights([return_type, moment, ...])

Get the weight matrices W for each view.

impute_data(data[, missing_only])

Impute values in the training data using the trained factorization.

load(path[, map_location])

Load a saved MOFAFLEX model.

Attributes#

MOFAFLEX.covariates#

Covariates for each group, if using a GP prior.

MOFAFLEX.covariates_names#

Covariate names for each group where they could be inferred from the input.

MOFAFLEX.factor_names#

Factor names.

MOFAFLEX.factor_order#

Ordering of factors by explained variance (highest to lowest).
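As an illustration of what "highest to lowest" means here, the following sketch sorts factor indices by a per-factor explained-variance value. It assumes the ordering is expressed as integer indices and that one explained-variance value per factor is available; how the library aggregates explained variance across views and groups is not stated above, and the numbers are made up.

```python
def order_by_explained_variance(r2_per_factor):
    """Return factor indices sorted from highest to lowest explained variance."""
    return sorted(
        range(len(r2_per_factor)),
        key=lambda k: r2_per_factor[k],
        reverse=True,
    )


# Made-up explained-variance values for three factors.
r2 = [0.05, 0.30, 0.12]
factor_names = ["Factor 1", "Factor 2", "Factor 3"]

order = order_by_explained_variance(r2)
ordered_names = [factor_names[k] for k in order]
```

Methods that accept ordered=True return their factors in this kind of order instead of the original one.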

MOFAFLEX.feature_names#

Feature names for each view.

MOFAFLEX.gp_group_correlation#

Between-group correlation for each factor, if using a GP prior.

MOFAFLEX.gp_lengthscale#

Inferred lengthscales for each factor, if using a GP prior.

MOFAFLEX.gp_scale#

Inferred variance scales (smoothness) for each factor, if using a GP prior.

MOFAFLEX.group_names#

Group names.

MOFAFLEX.n_factors#

Total number of factors.

MOFAFLEX.n_features#

Number of features in each view.

MOFAFLEX.n_features_total#

Total number of features.

MOFAFLEX.n_groups#

Number of groups.

MOFAFLEX.n_guided_factors#

Number of guided factors.

MOFAFLEX.n_informed_factors#

Number of informed factors.

MOFAFLEX.n_samples#

Number of samples in each group.

MOFAFLEX.n_samples_total#

Total number of samples.

MOFAFLEX.n_uninformed_factors#

Number of uninformed factors.

MOFAFLEX.n_views#

Number of views.

MOFAFLEX.sample_names#

Sample names for each group.

MOFAFLEX.training_loss#

Total loss (negative ELBO) for each training epoch.
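A per-epoch loss trace like this is typically inspected to judge whether training has plateaued. The sketch below is one simple heuristic, assuming the attribute behaves like a sequence of floats; the function name, window and tolerance are illustrative choices, not the library's own stopping criterion.

```python
def has_converged(losses, window=3, rel_tol=1e-3):
    """Return True if each of the last `window` epochs changed the loss
    by less than `rel_tol` relative to the previous epoch."""
    if len(losses) < window + 1:
        return False
    recent = losses[-(window + 1):]
    for previous, current in zip(recent, recent[1:]):
        if abs(previous - current) > rel_tol * max(abs(previous), 1e-12):
            return False
    return True


# Made-up loss values standing in for model.training_loss.
losses = [1000.0, 600.0, 450.0, 449.9, 449.85, 449.84]
plateaued = has_converged(losses)
still_improving = not has_converged(losses[:4])
```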

MOFAFLEX.view_names#

View names.

MOFAFLEX.warped_covariates#

Time-warped covariates for each group, if using a GP prior and dynamic time warping was enabled.

Methods#

MOFAFLEX.get_annotations(return_type='pandas', ordered=False)#

Get the annotation matrices for each view.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_dispersion(return_type='pandas', moment='mean')#

Get the dispersion vectors for each view.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.

Return type:

dict[str, Series | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_factors(return_type='pandas', moment='mean', sparse_type='mix', ordered=False)#

Get the factor matrices Z for each group.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.

  • sparse_type (Literal['raw', 'mix', 'thresh'] (default: 'mix')) –

    How to handle sparsity when using the spike and slab prior.

    • raw: Do nothing, return inferred values for all entries.

    • mix: Return the corresponding moment of a mixture distribution of two Normal distributions: one centered at 0 and the other centered at the inferred non-sparse value. The mixture is weighted by the inferred sparsity probability. This is what MOFA does.

    • thresh: Set all values with a sparsity probability > 0.5 to 0.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]
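The three sparse_type options can be illustrated for a single entry and the mean moment. This is a sketch under stated assumptions, not the library's implementation: p_sparse is taken to be the probability of the zero-centered component (one minus the non-sparse probability), and only the mean is shown because the component variances are not specified above.

```python
def apply_sparse_type(value, p_sparse, sparse_type="mix"):
    """Posterior-mean handling of one entry under a spike and slab prior.

    value    -- inferred non-sparse value (mean of the slab component)
    p_sparse -- assumed probability that the entry is sparse (the spike at 0)
    """
    if sparse_type == "raw":
        # Ignore sparsity and return the inferred value unchanged.
        return value
    if sparse_type == "mix":
        # Mean of a two-component mixture: p_sparse * 0 + (1 - p_sparse) * value.
        return (1.0 - p_sparse) * value
    if sparse_type == "thresh":
        # Hard threshold on the sparsity probability.
        return 0.0 if p_sparse > 0.5 else value
    raise ValueError(f"unknown sparse_type: {sparse_type!r}")


likely_sparse = {t: apply_sparse_type(2.0, 0.75, t) for t in ("raw", "mix", "thresh")}
likely_active = {t: apply_sparse_type(2.0, 0.25, t) for t in ("raw", "mix", "thresh")}
```

With these inputs, mix shrinks the value in proportion to the sparsity probability, while thresh either keeps it or zeroes it.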

MOFAFLEX.get_gps(return_type='pandas', moment='mean', x=None, batch_size=None, ordered=False)#

Get all latent functions.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.

  • x (Mapping[str, ndarray | Tensor] | None (default: None)) – Covariate values for each group. If None, will return latent function values at covariate coordinates used for training.

  • batch_size (int | None (default: None)) – Minibatch size. Only has an effect if x is not None. Defaults to the minibatch size used for training.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_r2(total=False, ordered=False)#

Get the fraction of explained variance for each view and group.

Parameters:
  • total (bool (default: False)) – If True, returns a DataFrame with the fraction of explained variance for the full model for each group (columns) and view (rows). Otherwise returns a dict with group names as keys containing DataFrames with the fraction of explained variance for each view (columns) and factor (rows).

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest). Has no effect if total == True.

Return type:

DataFrame | dict[str, DataFrame]
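For intuition about "fraction of explained variance", the sketch below computes the standard coefficient of determination on a flattened list of values. This is an assumption for illustration: the exact formula used by get_r2 is not given above and may differ, for example in how centering or non-Gaussian likelihoods are handled.

```python
def r2_score(observed, reconstructed):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, reconstructed))
    return 1.0 - ss_res / ss_tot


# Made-up values for one view in one group, flattened to a list.
observed = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.5, 2.0, 3.0, 3.5]
r2 = r2_score(observed, reconstructed)
```

A value near 1 means the reconstruction captures most of the variance, and a value near 0 means it captures little.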

MOFAFLEX.get_significant_factor_annotations()#

Get the results of significance testing of annotations against factors.

The significance testing is an implementation of PCGSE [FLM15]. While originally intended to assign annotations to uninformed factors, here it is used as a diagnostic to find factors that are mismatched to their annotations.

Return type:

dict[str, DataFrame] | None

Returns:

PCGSE results for each view or None if the model does not have prior annotations.

MOFAFLEX.get_sparse_factor_probabilities(return_type='pandas', ordered=False)#

Get the probabilities that a factor value is non-sparse for each group with a spike and slab factor prior.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_sparse_weight_probabilities(return_type='pandas', ordered=False)#

Get the probabilities that a weight value is non-sparse for each view with a spike and slab view prior.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.get_weights(return_type='pandas', moment='mean', sparse_type='mix', ordered=False)#

Get the weight matrices W for each view.

Parameters:
  • return_type (Literal['pandas', 'anndata', 'numpy'] (default: 'pandas')) – Format of the returned object.

  • moment (Literal['mean', 'std'] (default: 'mean')) – Which moment of the posterior distribution to return.

  • sparse_type (Literal['raw', 'mix', 'thresh'] (default: 'mix')) –

    How to handle sparsity when using the spike and slab prior.

    • raw: Do nothing, return inferred values for all entries.

    • mix: Return the corresponding moment of a mixture distribution of two Normal distributions: one centered at 0 and the other centered at the inferred non-sparse value. The mixture is weighted by the inferred sparsity probability. This is what MOFA does.

    • thresh: Set all values with a sparsity probability > 0.5 to 0.

  • ordered (bool (default: False)) – Whether to return the factors ordered by explained variance (highest to lowest).

Return type:

dict[str, DataFrame | AnnData | ndarray[tuple[Any, ...], dtype[single]]]

MOFAFLEX.impute_data(data, missing_only=False)#

Impute values in the training data using the trained factorization.

Parameters:
  • data (MuData | Mapping[str, Mapping[str, AnnData]]) – The data the model was trained on.

  • missing_only (bool (default: False)) – Only impute missing values in the data.

Return type:

dict[str, dict[str, AnnData]]

Returns:

Nested dictionary of AnnData objects with either fully imputed data or with only the missing values filled in. In both cases, the returned data will be preprocessed. In the case of Gaussian distributed data, that involves centering and scaling.
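The difference between the two modes can be sketched on a single row of values. The helper is hypothetical and assumes missing entries are encoded as NaN; the real method works on AnnData objects and returns preprocessed data, which this sketch does not reproduce.

```python
import math


def impute_row(observed, reconstructed, missing_only=False):
    """Combine observed values with model reconstructions.

    missing_only=False -- return the reconstruction for every entry
    missing_only=True  -- keep observed values and fill only the NaN entries
    """
    if not missing_only:
        return list(reconstructed)
    return [r if math.isnan(o) else o for o, r in zip(observed, reconstructed)]


# Made-up values; NaN marks a missing observation.
observed = [1.0, float("nan"), 3.0]
reconstructed = [0.9, 2.1, 3.2]

fully_imputed = impute_row(observed, reconstructed)
gaps_filled = impute_row(observed, reconstructed, missing_only=True)
```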

classmethod MOFAFLEX.load(path, map_location=None)#

Load a saved MOFAFLEX model.

Parameters:
  • path (str | Path) – Path to the saved model file.

  • map_location (default: None) – Specify how to remap storage locations for PyTorch tensors. See the torch.load documentation for details.

Return type:

MOFAFLEX
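A minimal usage sketch based on the signature above. The wrapper name load_on_cpu is hypothetical, and the import sits inside the function so the snippet can be defined without mofaflex installed. Passing "cpu" as map_location follows the torch.load convention for remapping tensors saved on another device.

```python
from pathlib import Path


def load_on_cpu(path):
    """Load a saved MOFAFLEX model with all tensors remapped to the CPU."""
    # Imported lazily; calling this function requires mofaflex to be installed.
    from mofaflex import MOFAFLEX

    return MOFAFLEX.load(Path(path), map_location="cpu")
```

This is useful when a model was trained on a GPU machine and is later inspected on a machine without one.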