Encoder
From a mathematical perspective, the encoder approximates the inverse of the measurement function. Unlike the decoder, the encoder is modality-agnostic: it receives the concatenated observations and returns their latent representation. Since the model is a VAE, the encoder parametrizes a distribution in latent space. This distribution is chosen to be a normal distribution, so the encoder returns its mean and diagonal covariance matrix. All implemented encoders parametrize the logarithm of the diagonal covariance entries, which guarantees that each variance is positive.
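As a minimal sketch of the log-variance parametrization and the resulting sampling step (plain NumPy, with a hypothetical linear map standing in for the real network):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs, w_mean, w_logvar):
    """Hypothetical Gaussian encoder head: two maps produce the mean and
    the log of the diagonal covariance; exponentiation keeps variances > 0."""
    mean = obs @ w_mean
    logvar = obs @ w_logvar              # unconstrained, may be negative
    std = np.exp(0.5 * logvar)           # always positive
    z = mean + std * rng.standard_normal(mean.shape)  # reparameterized sample
    return z, mean, logvar

obs = rng.standard_normal(6)             # concatenated observations
w_mean = rng.standard_normal((6, 3))
w_logvar = rng.standard_normal((6, 3))
z, mean, logvar = encode(obs, w_mean, w_logvar)
assert (np.exp(logvar) > 0).all()        # every variance is positive
```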
In the MTF framework the encoder's role is twofold. During training it provides the initial condition of the latent DSR model, and it supplies the teacher forcing signal: when using GTF or STF, the encoded trajectory is used to "pull back" the trajectory generated by the DSR model.
As mentioned before, the encoder's bottleneck dimension is dim_forcing, not latent_dim. To initialize the latent model, a linear projection matrix transforms the dim_forcing-dimensional encoder output into a (latent_dim - dim_forcing)-dimensional vector, which is concatenated with the encoder output to form the full initial latent state.
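The initialization step can be sketched as follows; the names dim_forcing and latent_dim come from the text above, while the projection matrix itself is a stand-in for the learned one:

```python
import numpy as np

rng = np.random.default_rng(0)
dim_forcing, latent_dim = 3, 8

enc_out = rng.standard_normal(dim_forcing)  # encoder output at t = 0
proj = rng.standard_normal((dim_forcing, latent_dim - dim_forcing))
excess = enc_out @ proj                  # linear projection fills the remaining dims
z0 = np.concatenate([enc_out, excess])   # full initial latent state
assert z0.shape == (latent_dim,)
```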
The encoder is set through the encoder parameter, which expects a dictionary with two keys: name and hyperparameters. The implemented encoders are listed below.
As the encoder is modality-agnostic, only one encoder type can be set. The model instantiates one encoder per measurement, each of the same type.
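A configuration might look like the following; the two keys come from the text above, while the specific encoder name string and hyperparameter values are assumptions for illustration:

```python
# Hypothetical encoder configuration. The accepted values for "name"
# depend on the library's registry; "StackedConvolutions" is an assumption.
encoder = {
    "name": "StackedConvolutions",
    "hyperparameters": {
        "mean_kernel_sizes": [11, 7, 5, 3],
        "logvar_kernel_sizes": [11],
        "deterministic": False,
        "causal": False,
    },
}
```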
Stacked Convolutions
This encoder is made up of 1D convolutions applied along the time axis, with the feature dimension as convolutional channels. Convolving over time allows for a better estimate of the latent state when there are delays or timing mismatches between observations. It consists of two convolutional networks, one for the mean and one for the covariance.
It has the following hyperparameters:
- mean_kernel_sizes (list[int], default=[11, 7, 5, 3]): The kernel sizes of the convolutions in the network returning the mean. Stride is always 1, and padding is set such that the length of the timeseries does not change.
- logvar_kernel_sizes (list[int], default=[11]): The kernel sizes of the convolutions in the network returning the log covariance. Stride is always 1, and padding is set such that the length of the timeseries does not change.
- deterministic (bool, default=False): A flag whether the distribution is sampled from during training or just the mean is returned. If set to True, the model may not be able to filter out noise effectively.
- causal (bool, default=False): A flag whether the convolutions are able to "look into the future". If True, the convolutions can only use past timesteps when transforming into the latent state. Non-causal networks may be used if only long-term statistics are of interest. For short timeseries they may result in "cheating", wherein the network stores information from the future in the initialization of the trajectory.
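The difference between causal and non-causal padding can be illustrated with a plain-NumPy "same"-length convolution (a sketch, not the library's implementation):

```python
import numpy as np

def conv1d(x, kernel, causal):
    """'Same'-length 1D convolution over the time axis.
    Causal: pad only the past, so output[t] never sees x[t+1:]."""
    k = len(kernel)
    pad = (k - 1, 0) if causal else ((k - 1) // 2, k // 2)
    xp = np.pad(x, pad)
    return np.array([xp[t:t + k] @ kernel for t in range(len(x))])

x = np.zeros(8)
x[5] = 1.0                       # impulse at t = 5
kernel = np.ones(3)
causal_out = conv1d(x, kernel, causal=True)
noncausal_out = conv1d(x, kernel, causal=False)
assert causal_out[4] == 0.0      # before the impulse, the causal output is unaffected
assert noncausal_out[4] == 1.0   # the non-causal output "sees" the future impulse
```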
Linear
This encoder implements two linear transformations, mapping the observation space to the mean and the log covariance of the encoder distribution.
It has the following hyperparameters:
- deterministic (bool, default=False): A flag whether the distribution is sampled from during training or just the mean is returned. If set to True, the model may not be able to filter out noise effectively.
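A minimal sketch of the deterministic flag's effect, with hypothetical weight matrices in place of the learned ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_encode(obs, w_mean, w_logvar, deterministic=False):
    """Hypothetical linear encoder: when deterministic, sampling is
    skipped and the mean of the encoder distribution is returned."""
    mean = obs @ w_mean
    if deterministic:
        return mean
    std = np.exp(0.5 * (obs @ w_logvar))
    return mean + std * rng.standard_normal(mean.shape)

obs = rng.standard_normal(6)
w_mean = rng.standard_normal((6, 3))
w_logvar = rng.standard_normal((6, 3))
z = linear_encode(obs, w_mean, w_logvar, deterministic=True)
assert np.allclose(z, obs @ w_mean)  # deterministic path returns the mean exactly
```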
Identity
This encoder is useful for research purposes and for Gaussian data, where an encoder is not needed. It simply passes the data through to the latent state. As with the other encoders, excess latent states are initialized through a linear transformation of the observations.