ConditionalUnet1D

class ConditionalUnet1D(input_dim, local_cond_dim=None, global_cond_dim=None, diffusion_step_embed_dim=16, down_dims=[32, 64], kernel_size=3, n_groups=1, cond_predict_scale=True)

Bases: Module

A 1D (temporal) U-Net for predicting conditional score/velocity fields.

The model defines:

\[\mathbf{f}_{\mathbf{\theta}}: \mathbb{R}^{d_x \times L} \times \mathbb{R}^{d_{\mathrm{local}}} \times \mathbb{R}^{d_{\mathrm{global}}} \times \mathbb{R} \to \mathbb{R}^{d_x \times L}.\]

It can be used to parameterize score-like or velocity-like targets in conditional diffusion/flow pipelines, e.g. \(\nabla \log \mathbb{P}(Y\mid X)\).

This baseline model is unconstrained with respect to group actions (no explicit equivariance/invariance constraints are imposed).

The influence of the conditioning variable x on the diffusion process is captured via local and global conditioning of the U-Net architecture.

Local conditioning: Given a local conditioning encoder z(x), the encoder output is concatenated to the input of the U-Net.
Global conditioning: Given a global conditioning vector c = b(x), the vector modulates the convolutional layers of the U-Net via Feature-wise Linear Modulation (FiLM).
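As a rough illustration of the two conditioning paths described above, the NumPy sketch below concatenates a local feature to the input along the channel axis and applies a FiLM-style scale-and-shift to a feature map. The function names, the weight layout, and the choice to repeat a per-sample local vector across the time axis are assumptions made for this example; they are not taken from this class's implementation.

```python
import numpy as np


def concat_local_cond(sample, local_feat):
    """Concatenate local conditioning features to the input along channels.

    sample:     (B, d_x, T) input sequence
    local_feat: (B, d_local) encoder output; repeated over T (an assumption)
    """
    T = sample.shape[2]
    tiled = np.repeat(local_feat[:, :, None], T, axis=2)  # (B, d_local, T)
    return np.concatenate([sample, tiled], axis=1)        # (B, d_x + d_local, T)


def film_modulate(features, cond, weight, bias, predict_scale=True):
    """FiLM-style modulation of a (B, C, T) feature map.

    cond:   (B, D) global conditioning vectors
    weight: (D, 2 * C) if predict_scale else (D, C)   (hypothetical layout)
    bias:   (2 * C,)   if predict_scale else (C,)
    """
    C = features.shape[1]
    embed = cond @ weight + bias
    if predict_scale:
        # First C outputs act as per-channel scale, last C as per-channel shift.
        scale, shift = embed[:, :C], embed[:, C:]
        return scale[:, :, None] * features + shift[:, :, None]
    # Shift-only variant: the conditioning adds a per-channel bias.
    return features + embed[:, :, None]


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8))   # (B, C, T)
c = rng.normal(size=(2, 3))      # (B, D)
y = film_modulate(x, c, rng.normal(size=(3, 8)), np.zeros(8))
print(y.shape)
```

The modulation is broadcast over the time axis, so every time step of a channel receives the same scale and shift for a given sample.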

Parameters:

input_dim (int) – The dimension of the input data.
local_cond_dim (int, optional) – The dimension of the local conditioning vector. Defaults to None.
global_cond_dim (int, optional) – The dimension of the global conditioning vector. Defaults to None.
diffusion_step_embed_dim (int, optional) – The dimension of the diffusion step embedding. Defaults to 16.
down_dims (list, optional) – Channel widths for each level of the downsampling path. Defaults to [32, 64].
kernel_size (int, optional) – The size of the convolutional kernel. Defaults to 3.
n_groups (int, optional) – The number of groups for GroupNorm. Defaults to 1.
cond_predict_scale (bool, optional) – Whether the global (FiLM) conditioning predicts a scale term in addition to a shift. Defaults to True.
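When choosing down_dims and n_groups together, one constraint is worth checking up front: PyTorch's GroupNorm requires the number of channels to be divisible by the number of groups. The helper below is a hypothetical pre-flight check, not part of this class. It assumes that normalization is applied to feature maps whose channel counts are the entries of down_dims.

```python
def check_group_norm_compat(down_dims, n_groups):
    """Return a list of problems with a (down_dims, n_groups) pair.

    Hypothetical helper: assumes GroupNorm sees feature maps whose
    channel counts are the entries of down_dims.
    """
    problems = []
    if n_groups < 1:
        problems.append(f"n_groups must be positive, got {n_groups}")
        return problems
    if not down_dims:
        problems.append("down_dims must contain at least one channel width")
    for width in down_dims:
        if width % n_groups != 0:
            problems.append(
                f"channel width {width} is not divisible by n_groups={n_groups}"
            )
    return problems


print(check_group_norm_compat([32, 64], 1))   # no problems expected
print(check_group_norm_compat([30, 64], 8))   # 30 is not divisible by 8
```

An empty list means the pair passes this particular check; it does not validate the rest of the configuration.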

forward(sample, timestep, local_cond=None, film_cond=None, **kwargs)

Forward pass of the Conditional Unet 1D model.

Parameters:

sample (Tensor) – The input tensor of shape (B, input_dim, T).
timestep (Tensor | float | int) – The diffusion timestep.
local_cond (Tensor, optional) – The local conditioning tensor of shape (B, local_cond_dim). Defaults to None.
film_cond (Tensor, optional) – The global conditioning tensor of shape (B, global_cond_dim), applied via FiLM. Defaults to None.
kwargs – Additional keyword arguments reserved for API compatibility.

Returns:

The output tensor of shape (B, input_dim, T).

Return type:

Tensor
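Since timestep may be a tensor, a float, or an int, a scalar has to be expanded to one value per batch element before it can be embedded. The NumPy sketch below shows one plausible way to do that, followed by a sinusoidal embedding of width diffusion_step_embed_dim. Both function names are hypothetical, and the sinusoidal form is an assumption: this page states only that a diffusion step embedding is used, not how it is computed.

```python
import math

import numpy as np


def broadcast_timestep(timestep, batch_size):
    """Turn a scalar or (B,) timestep into a float array of shape (B,)."""
    t = np.asarray(timestep, dtype=np.float32)
    if t.ndim == 0:
        # A single scalar is shared by every sample in the batch.
        return np.full((batch_size,), float(t), dtype=np.float32)
    if t.shape != (batch_size,):
        raise ValueError(f"expected shape ({batch_size},), got {t.shape}")
    return t


def sinusoidal_embedding(t, dim):
    """Map timesteps of shape (B,) to embeddings of shape (B, dim); dim must be even."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to 1/10000.
    freqs = np.exp(-math.log(10000.0) * np.arange(half) / max(half - 1, 1))
    args = t[:, None] * freqs[None, :]                             # (B, half)
    return np.concatenate([np.sin(args), np.cos(args)], axis=-1)  # (B, dim)


t = broadcast_timestep(3, batch_size=4)
emb = sinusoidal_embedding(t, dim=16)
print(t.shape, emb.shape)
```

Under these assumptions, the resulting (B, dim) embedding would be combined with the global conditioning vector before modulating the convolutional blocks; how the actual model combines them is not stated here.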