Decoder

From a mathematical perspective, the decoder approximates the measurement function. Unlike the encoder, the decoder is modality-specific and is implemented through GLMs. It receives a latent state and transforms it into the observation space. As the model is a VAE, the decoder parametrizes a distribution in the observation space. This distribution is chosen to fit the modality's assumed distribution: Poisson or negative binomial for count data, normal for Gaussian data, categorical for categorical data, and so on.

The decoders are set using the decoder parameter. As decoders are modality-specific, there has to be one decoder for each modality of each measurement. Thus the decoder parameter expects a value of type dict[str, dict[str, DecoderOption]]. The keys of the outer dictionary are the measurement identifiers, and the keys of the inner dictionaries are the modality identifiers.
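As an illustration of the nesting, here is a minimal sketch of such a value. The measurement identifiers (session_1, session_2) and modality identifiers (spikes, choice, lfp) are made up for this example, and check_decoder_config is a hypothetical helper, not part of the library. Each innermost value is a DecoderOption with the name and hyperparameters keys described under Decoder options.

```python
# Hypothetical identifiers; only the two-level nesting and the
# "name"/"hyperparameters" keys are taken from the documentation.
decoder = {
    "session_1": {
        "spikes": {"name": "NegativeBinomial", "hyperparameters": {"identity_B": False}},
        "choice": {"name": "BinaryCategorical", "hyperparameters": {"threshold": 0.5}},
    },
    "session_2": {
        "lfp": {"name": "Gaussian", "hyperparameters": {"learn_cov": True}},
    },
}


def check_decoder_config(config):
    """Check that every modality of every measurement has both expected keys.

    Illustrative helper only. It assumes both keys are always present.
    """
    for measurement, modalities in config.items():
        for modality, option in modalities.items():
            missing = {"name", "hyperparameters"} - option.keys()
            if missing:
                raise ValueError(f"{measurement}/{modality} is missing {sorted(missing)}")
    return True
```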

Decoder options

The DecoderOption is a dictionary with two keys, name and hyperparameters. The name key selects the decoder, and hyperparameters holds the decoder-specific parameters. The options are as follows:

Categorical

name should be set to Categorical. A decoder for a modality with one-hot-encoded categorical data. It has the following hyperparameters:

category_weights (Optional[list[float]], default=None): Weights the categories with the given floats when calculating the negative log likelihood.
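The effect of category weights on the negative log likelihood can be sketched as follows. This is a generic weighted categorical NLL written for illustration; the function name and the exact weighting scheme (for example, whether the weights are normalized) are assumptions, not the library's implementation.

```python
import math


def weighted_categorical_nll(probs, one_hot, category_weights=None):
    """Negative log likelihood of one one-hot observation.

    probs: predicted probability of each category.
    one_hot: the observation, with 1 for the observed category and 0 elsewhere.
    category_weights: optional per-category weights (None means all ones).
    """
    if category_weights is None:
        category_weights = [1.0] * len(probs)
    return -sum(
        w * y * math.log(p)
        for w, y, p in zip(category_weights, one_hot, probs)
    )
```

With weights [1.0, 2.0, 1.0], an error on the second category costs twice as much as an error on the others, which is one way to compensate for rare categories.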

Binary categorical

name should be set to BinaryCategorical. A decoder for a modality with a binary category. The category does not have to be one-hot encoded. Note that the Categorical decoder can also be used to the same effect. It has the following hyperparameters:

threshold (float, default=0.5): The threshold to decide between the two categories. Used only for generating new samples.
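A minimal sketch of how such a threshold could turn predicted probabilities into generated categories. It assumes the decoder outputs the probability of the second category and that the comparison is strict; neither detail is stated in the documentation, and threshold_samples is a hypothetical name.

```python
def threshold_samples(probabilities, threshold=0.5):
    """Map each predicted probability to category 1 if it exceeds the threshold, else 0."""
    return [int(p > threshold) for p in probabilities]
```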

Negative binomial

name should be set to NegativeBinomial. A decoder for a modality of count data which is assumed to be overdispersed. It has the following hyperparameters:

identity_B (bool, default=False): Whether to omit the linear layer between the readout units and the observations. If False, the linear layer is learned; if True, it is omitted. True only works if the observations have at most as many dimensions as the readout space (== dim_forcing).
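The dimension constraint can be illustrated with a small sketch. Here readout_to_observation is a hypothetical helper, B stands in for the learned linear layer, and the choice to pass through the first n_obs readout units in the identity case is an assumption made for this example.

```python
def readout_to_observation(readout, n_obs, B=None):
    """Map readout units to n_obs observation dimensions.

    B=None mimics identity_B=True: readout units are passed through unchanged,
    which requires n_obs <= len(readout).
    Otherwise B is an n_obs x len(readout) matrix given as nested lists,
    mimicking the learned linear layer of identity_B=False.
    """
    if B is None:
        if n_obs > len(readout):
            raise ValueError("identity mapping needs n_obs <= number of readout units")
        return list(readout[:n_obs])
    return [sum(b * z for b, z in zip(row, readout)) for row in B]
```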

Zero-inflated Negative Binomial

name should be set to ZeroInflatedNegativeBinomial. A decoder for a modality of count data which is assumed to be overdispersed and to contain excess zeros. It has the following hyperparameters:

identity_B: See Negative Binomial for details.
max_expected_spikes_per_bin (float, default=100000): Clamps the expected count per bin between 0 and the given number. Thus, disregarding noise, the model cannot generate more than the given number in any bin. Can be helpful in case of divergence. TODO rename
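The following sketch shows the clamping step and a standard zero-inflated negative binomial log-probability. The mean/dispersion parametrization and the function names are assumptions chosen for illustration; the library may parametrize the distribution differently.

```python
import math


def clamp_expected_count(rate, max_expected_spikes_per_bin=100000.0):
    """Clamp an expected count to the interval [0, max_expected_spikes_per_bin]."""
    return min(max(rate, 0.0), max_expected_spikes_per_bin)


def zinb_log_prob(k, mean, dispersion, zero_prob):
    """Log-probability of count k under a zero-inflated negative binomial.

    mean > 0 is the negative binomial mean, dispersion > 0 controls the
    overdispersion (variance = mean + mean**2 / dispersion), and zero_prob
    in [0, 1) is the probability of an excess (structural) zero.
    """
    r = dispersion
    log_nb = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mean))
        + k * math.log(mean / (r + mean))
    )
    if k == 0:
        # A zero can come from the inflation component or from the count model.
        return math.log(zero_prob + (1.0 - zero_prob) * math.exp(log_nb))
    return math.log(1.0 - zero_prob) + log_nb
```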

Exponential

name should be set to Exponential. A decoder for a modality of exponentially distributed data. Has no additional hyperparameters.

Gaussian

name should be set to Gaussian. A decoder for a modality of normally distributed data. It has the following hyperparameters:

identity_B: See Negative Binomial for details.
learn_cov (bool, default=False): Whether the covariance should be learned.
initial_cov (float, default=0.01**2): The initial value of every entry on the diagonal of the covariance matrix.
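To make the role of the diagonal covariance concrete, here is a sketch of a Gaussian log likelihood with a diagonal covariance whose entries all start at initial_cov. The function name and the three-dimensional example are assumptions for illustration only.

```python
import math


def diagonal_gaussian_log_prob(x, mean, variances):
    """Log density of x under a Gaussian with the given mean and diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mi) ** 2 / v)
        for xi, mi, v in zip(x, mean, variances)
    )


# Every diagonal entry starts at the same value (the documented default).
initial_cov = 0.01 ** 2
variances = [initial_cov] * 3
```

With such a small initial variance, even modest deviations from the mean are penalized heavily, so the log likelihood of this modality can be large in magnitude early in training.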

Identity

name should be set to Identity. The decoder counterpart of the identity encoder. It does not apply any transformation to the readout values. An identity covariance matrix is assumed, which simplifies the log likelihood calculation to a negative MSE. Has no additional hyperparameters.
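The reduction to a negative MSE follows from the Gaussian log density with identity covariance. In the notation below (an assumption of this note, not of the library), x is the observation, \hat{x} the readout value, and d the number of observation dimensions. Whether the implementation keeps the constant term or the factor d/2 is not stated here.

```latex
\log \mathcal{N}(x \mid \hat{x}, I)
  = -\tfrac{1}{2} \lVert x - \hat{x} \rVert^2 - \tfrac{d}{2} \log(2\pi)
  = -\tfrac{d}{2}\,\mathrm{MSE}(x, \hat{x}) - \tfrac{d}{2} \log(2\pi),
\qquad
\mathrm{MSE}(x, \hat{x}) = \tfrac{1}{d} \sum_{i=1}^{d} (x_i - \hat{x}_i)^2
```

Since the second term does not depend on the model, maximizing this log likelihood is equivalent to minimizing the MSE.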

No Decoder

name should be set to NoDecoder. A placeholder decoder for a modality which should not be considered in the losses.

Decoder scaling

There is no guarantee that the different modalities contribute to the loss at the same magnitude. This can be addressed by scaling the log likelihoods returned by the decoders so that they stay at a comparable magnitude. The scaling is configured using the decoder_scaling and decoder_scaling_adjustment_epochs options.

`decoder_scaling`

This can be set in the following ways:

None: No scaling is applied or scaling is automatic (see decoder_scaling_adjustment_epochs).
dict[str, float]: As with the decoders, the keys of the dictionary are the measurement identifiers. Each float is the scaling applied to all modalities in the given measurement.
dict[str, dict[str, float]]: The keys of the inner dictionaries are the modality identifiers. Each modality is scaled by its corresponding value.
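The three accepted forms can be thought of as shorthand for one scaling factor per modality. The sketch below expands each form into the fully nested one. expand_decoder_scaling is a hypothetical helper, the identifiers are made up, and treating None as a factor of 1.0 covers only the "no scaling" case, not automatic scaling.

```python
def expand_decoder_scaling(decoder_scaling, modalities):
    """Expand decoder_scaling into one float per modality of each measurement.

    modalities maps each measurement identifier to its list of modality
    identifiers. None is treated as "no scaling", i.e. a factor of 1.0.
    """
    expanded = {}
    for measurement, names in modalities.items():
        value = 1.0 if decoder_scaling is None else decoder_scaling[measurement]
        if isinstance(value, dict):
            # Per-modality form: dict[str, dict[str, float]]
            expanded[measurement] = {name: float(value[name]) for name in names}
        else:
            # Per-measurement form: dict[str, float]
            expanded[measurement] = {name: float(value) for name in names}
    return expanded


# Hypothetical measurement and modality identifiers.
modalities = {"session_1": ["spikes", "choice"]}
per_measurement = expand_decoder_scaling({"session_1": 0.5}, modalities)
```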

`decoder_scaling_adjustment_epochs`

This option sets the epochs where the decoder scaling is adjusted automatically. This is done by monitoring the log likelihoods and determining the scaling factors necessary. If an empty list is passed, the decoder scaling won't be adjusted automatically.
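One plausible way such an adjustment could work is sketched below: at the listed epochs, each modality's factor is set to the inverse magnitude of its monitored log likelihood, so that the scaled values have a magnitude of about 1. The rule itself, the function name, and the argument names are assumptions for illustration; the documentation does not specify how the factors are computed.

```python
def adjust_decoder_scaling(epoch, adjustment_epochs, mean_log_likelihoods, current_scaling):
    """Return updated per-modality scaling factors.

    Outside the listed epochs the current scaling is returned unchanged, so an
    empty adjustment_epochs list means the scaling is never adjusted.
    """
    if epoch not in adjustment_epochs:
        return current_scaling
    # Guard against division by zero for a log likelihood of exactly 0.
    return {
        name: 1.0 / max(abs(ll), 1e-12)
        for name, ll in mean_log_likelihoods.items()
    }
```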
