mofaflex.MOFAFLEX#
- class mofaflex.MOFAFLEX(**kwargs)#
The MOFA-FLEX model.
This class is not meant to be instantiated by the user. Rather, it is created by instantiating a
term.
Attributes table#
| feature_names | Feature names for each view. |
| group_names | Group names. |
| likelihoods | The likelihoods. |
| n_features | Number of features in each view. |
| n_features_total | Total number of features. |
| n_groups | Number of groups. |
| n_samples | Number of samples in each group. |
| n_samples_total | Total number of samples. |
| n_terms | Number of additive terms. |
| n_views | Number of views. |
| sample_names | Sample names for each group. |
| terms | The additive terms. |
| training_loss | Total loss (negative ELBO) for each training epoch. |
| view_names | View names. |
Methods table#
| fit | Fit the model using the provided data. |
| get_dispersion | Get the dispersion vectors for each view. |
| get_r2 | Get the fraction of explained variance for each view and group. |
| impute_data | Impute values in the training data using the trained factorization. |
| load | Load a saved MOFAFLEX model. |
Attributes#
- MOFAFLEX.feature_names#
Feature names for each view.
- MOFAFLEX.group_names#
Group names.
- MOFAFLEX.likelihoods#
The likelihoods.
- MOFAFLEX.n_features#
Number of features in each view.
- MOFAFLEX.n_features_total#
Total number of features.
- MOFAFLEX.n_groups#
Number of groups.
- MOFAFLEX.n_samples#
Number of samples in each group.
- MOFAFLEX.n_samples_total#
Total number of samples.
- MOFAFLEX.n_terms#
Number of additive terms.
- MOFAFLEX.n_views#
Number of views.
- MOFAFLEX.sample_names#
Sample names for each group.
- MOFAFLEX.terms#
The additive terms.
- MOFAFLEX.training_loss#
Total loss (negative ELBO) for each training epoch.
- MOFAFLEX.view_names#
View names.
Methods#
- MOFAFLEX.fit(data, *, likelihoods=None, group_by=None, layer=None, use_obs='union', use_var='union', subset_var='highly_variable', plot_data_overview=True, remove_constant_features=True, device='cuda', batch_size=0, max_epochs=10000, lr=0.001, early_stopper_patience=100, save_path=None, seed=None, num_workers=0, pin_memory=False, n_particles=1)#
Fit the model using the provided data.
- Parameters:
  - data (MuData | Mapping[str, Mapping[str, AnnData]] | AnnData) – Can be any of:
    - MuData object
    - Nested dict with group names as keys, view names as subkeys and AnnData objects as values (incompatible with group_by)
  - likelihoods (Union[Mapping[str, Union[Literal['Bernoulli', 'NegativeBinomial', 'Normal'], Likelihood]], Literal['Bernoulli', 'NegativeBinomial', 'Normal'], Likelihood, None] (default: None)) – Data likelihoods for each view (if dict) or for all views (if str or Likelihood). Inferred automatically if None.
  - group_by (str | Sequence[str] | None (default: None)) – Columns of .obs in MuData or AnnData objects to group data by. Ignored if the input data is not a MuData or AnnData object.
  - layer (Mapping[str, str | None] | Mapping[str, Mapping[str, str | None]] | str | None (default: None)) – Which layer to use. If None, the .X element will be used. If str, the same layer will be used for all groups and views. If a dict of strings, the keys must correspond to view names and the values to layers. If a nested dict, different layers can be used for each combination of group and view. The last format is only accepted if the data is a nested dictionary of AnnData objects.
  - use_obs (Literal['union', 'intersection'] (default: 'union')) – How to align observations across views. Ignored if the data is not a nested dict of AnnData objects.
  - use_var (Literal['union', 'intersection'] (default: 'union')) – How to align variables across groups. Ignored if the data is not a nested dict of AnnData objects.
  - subset_var (str | None (default: 'highly_variable')) – .var column with boolean values to select features.
  - plot_data_overview (bool (default: True)) – Plot data overview.
  - remove_constant_features (bool (default: True)) – Remove constant features from the data.
  - device (str | device (default: 'cuda')) – Device to run training on.
  - batch_size (int (default: 0)) – Batch size.
  - max_epochs (int (default: 10000)) – Maximum number of training epochs.
  - lr (float (default: 0.001)) – Learning rate.
  - early_stopper_patience (int (default: 100)) – Number of steps without relevant improvement to stop training.
  - save_path (Path | str | None (default: None)) – Path to save model.
  - seed (int | None (default: None)) – Seed for the pseudorandom number generator.
  - num_workers (int (default: 0)) – Number of data loader workers.
  - pin_memory (bool (default: False)) – Whether to use pinned memory in the data loader.
  - n_particles (int (default: 1)) – Number of particles for ELBO estimation.
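The use_obs and use_var options choose between the union and the intersection of names. The following sketch illustrates that difference with plain Python sets. Here align_names is a hypothetical helper written for this illustration, not part of mofaflex, and the sorted output order is an arbitrary choice of the sketch.

```python
def align_names(names_per_view, how="union"):
    """Combine name lists from several views (or groups).

    Hypothetical helper for illustration only; not mofaflex's implementation.
    """
    name_sets = [set(names) for names in names_per_view.values()]
    if not name_sets:
        return []
    if how == "union":
        # Keep every name that occurs in at least one view.
        combined = set().union(*name_sets)
    elif how == "intersection":
        # Keep only the names that occur in every view.
        combined = set.intersection(*name_sets)
    else:
        raise ValueError(f"how must be 'union' or 'intersection', got {how!r}")
    return sorted(combined)


# Made-up observation names for two views.
obs_names = {
    "rna": ["cell1", "cell2", "cell3"],
    "atac": ["cell2", "cell3", "cell4"],
}
union_names = align_names(obs_names, how="union")
shared_names = align_names(obs_names, how="intersection")
```

With "union", cell1 and cell4 are retained even though each is observed in only one view; with "intersection", only cell2 and cell3 remain.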
- MOFAFLEX.get_dispersion(moment='mean')#
Get the dispersion vectors for each view.
- MOFAFLEX.get_r2(type=None, ordered=False, term=None)#
Get the fraction of explained variance for each view and group.
- Parameters:
  - type (Optional[Literal['total', 'byterm', 'term']] (default: None)) – How fine-grained the fraction of explained variance should be split up.
    - total: Returns the total fraction of explained variance.
    - byterm: Returns the fraction of explained variance for each additive term.
    - term: Returns the fraction of explained variance for each component (e.g. factor) of the given term.
    Defaults to term if the model has only one additive term, byterm otherwise.
  - ordered (bool (default: False)) – Whether to sort the returned dataframes by explained variance (highest to lowest, per group and view). Has no effect for type="total".
  - term (str | None (default: None)) – The name of the additive term for type="term". Can be None if the model has only one term.
- Return type:
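As a rough illustration of the quantities involved, the sketch below computes a fraction of explained variance with the common definition 1 - SS_res / SS_tot and mirrors the documented rule for the default value of type. Both functions are hypothetical helpers written for this page. How mofaflex computes explained variance internally, in particular for non-Gaussian likelihoods, is not specified here and may differ.

```python
def explained_variance(observed, predicted):
    """Fraction of variance in `observed` explained by `predicted` (1 - SS_res / SS_tot).

    Hypothetical helper using the textbook definition; not mofaflex's code.
    """
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


def default_r2_type(n_terms):
    """Documented default for `type`: 'term' with one additive term, else 'byterm'."""
    return "term" if n_terms == 1 else "byterm"


observed = [1.0, 2.0, 3.0, 4.0]
perfect = explained_variance(observed, observed)    # prediction equals the data
baseline = explained_variance(observed, [2.5] * 4)  # prediction is only the mean
```

A perfect reconstruction gives 1 and predicting only the mean gives 0. On a trained model, the corresponding call would be along the lines of model.get_r2(type="byterm", ordered=True), where model is assumed to be an already fitted MOFAFLEX instance.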
- MOFAFLEX.impute_data(data, missing_only=False)#
Impute values in the training data using the trained factorization.
The data will be transformed into a space compatible with model predictions. Usually that involves shifting and/or scaling, e.g. Gaussian data will be mean-centered and scaled to unit variance. This also implies that only dense matrices can be returned. Be aware that this can result in high memory consumption.
- Parameters:
- Return type:
- Returns:
Nested dictionary of AnnData objects with either fully imputed data or with only the missing values filled in.
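The effect of missing_only and of the transformation described above can be sketched on a single feature with plain Python lists, using None for missing values. The helper names, the made-up predictions, and the use of the population standard deviation are assumptions of this sketch rather than details of mofaflex.

```python
import math


def standardize(values):
    """Mean-center and scale to unit variance, skipping missing (None) entries."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    std = math.sqrt(sum((v - mean) ** 2 for v in present) / len(present))
    return [None if v is None else (v - mean) / std for v in values]


def impute(observed, predicted, missing_only=False):
    """Return the predictions, or the observed values with only the gaps filled."""
    if not missing_only:
        return list(predicted)
    return [p if o is None else o for o, p in zip(observed, predicted)]


raw = [1.0, None, 3.0]
scaled = standardize(raw)      # observed data moved into the model's space
predicted = [-0.9, 0.1, 0.8]   # made-up model reconstruction in that space
fully_imputed = impute(scaled, predicted)
gaps_filled = impute(scaled, predicted, missing_only=True)
```

In this sketch both results live in the transformed space, so observed entries appear standardized rather than in their original units.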
- classmethod MOFAFLEX.load(path, map_location=None)#
Load a saved MOFAFLEX model.
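Taken together, the methods on this page might be used as in the following sketch. It assumes that model is an existing MOFAFLEX instance (this page does not show how to construct one), that data is in one of the accepted formats, that fit writes the model to save_path, and that map_location accepts a device string such as "cpu". The function train_and_reload is a hypothetical helper; only parameter names from the signatures above are used.

```python
def train_and_reload(model, data, save_path):
    """Fit a model on the CPU, query explained variance, and reload it from disk.

    Hypothetical helper; `model` is assumed to be a MOFAFLEX instance.
    """
    model.fit(
        data,
        device="cpu",                # the documented default is "cuda"
        max_epochs=1000,
        early_stopper_patience=100,
        plot_data_overview=False,
        save_path=save_path,         # assumed to be written by fit
        seed=42,
    )
    r2_total = model.get_r2(type="total")
    # load is a classmethod, so it is called on the class rather than the instance.
    reloaded = type(model).load(save_path, map_location="cpu")
    return r2_total, reloaded
```

Passing device="cpu" avoids relying on the default 'cuda' device on machines without a GPU, and fixing seed is intended to make repeated runs comparable.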