ConditionalUnet1D
- class ConditionalUnet1D(input_dim, local_cond_dim=None, global_cond_dim=None, diffusion_step_embed_dim=16, down_dims=[32, 64], kernel_size=3, n_groups=1, cond_predict_scale=True)
Bases: Module
A 1D/Time U-Net for predicting conditional score/velocity fields.
The model defines:
\[\mathbf{f}_{\mathbf{\theta}}: \mathbb{R}^{d_x \times L} \times \mathbb{R}^{d_{\mathrm{local}}} \times \mathbb{R}^{d_{\mathrm{global}}} \times \mathbb{R} \to \mathbb{R}^{d_x \times L}.\]
It can be used to parameterize score-like or velocity-like targets in conditional diffusion/flow pipelines, e.g. \(\nabla \log \mathbb{P}(Y\mid X)\).
This baseline model is unconstrained with respect to group actions (no explicit equivariance/invariance constraints are imposed).
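As a rough illustration of the velocity-like use mentioned above, the sketch below integrates a field with forward Euler steps. It is written in NumPy so it runs without the library. `toy_velocity` is a hypothetical stand-in that only copies the documented call signature `(sample, timestep, local_cond, film_cond)`. The unit time interval, the batched `(B,)` timestep, and the Euler scheme are assumptions, not something this page specifies.

```python
import numpy as np


def toy_velocity(sample, timestep, local_cond=None, film_cond=None):
    """Hypothetical stand-in for the model: a field that pulls samples to zero."""
    return -sample


def euler_sample(field, x0, n_steps=10, local_cond=None, film_cond=None):
    """Integrate dx/dt = field(x, t, ...) from t=0 to t=1 with forward Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        # Assumed convention: one time value per batch element, shape (B,).
        t = np.full((x.shape[0],), k * dt)
        x = x + dt * field(x, t, local_cond=local_cond, film_cond=film_cond)
    return x


rng = np.random.default_rng(0)
x0 = rng.normal(size=(2, 3, 8))  # (B, input_dim, T), sizes chosen for illustration
x1 = euler_sample(toy_velocity, x0, n_steps=10)
```

With a trained model in place of `toy_velocity`, the same loop would pass the conditioning tensors through unchanged at every step; only the field evaluation differs.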
The conditioning variable x enters the diffusion process through local and global conditioning of the U-Net architecture.
- Local conditioning: given a local conditioning encoder z(x), the encoder output is
concatenated to the input of the U-Net.
- Global conditioning: given a global conditioning vector c = b(x), this vector is
used to modulate the convolutional layers of the U-Net via Feature-wise Linear Modulation (FiLM).
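The two conditioning paths can be sketched on plain arrays as follows. This is a minimal NumPy illustration of the shapes involved, not the module's actual code. The helper names `concat_local` and `film`, the hidden channel count, and the single linear map `weight` are hypothetical. Repeating the `(B, local_cond_dim)` vector along the time axis before concatenation is an assumption, and the `predict_scale` switch is only a guess at what `cond_predict_scale` controls.

```python
import numpy as np


def concat_local(sample, local_cond):
    """Concatenate a (B, d_local) vector to a (B, d_x, T) input along channels.

    Assumption: the vector is repeated at every time step.
    """
    T = sample.shape[-1]
    tiled = np.repeat(local_cond[:, :, None], T, axis=2)  # (B, d_local, T)
    return np.concatenate([sample, tiled], axis=1)        # (B, d_x + d_local, T)


def film(hidden, global_cond, weight, predict_scale=True):
    """FiLM: per-channel affine modulation of hidden features (B, C, T)."""
    params = global_cond @ weight                         # (B, 2C) or (B, C)
    if predict_scale:
        scale, bias = np.split(params, 2, axis=1)         # each (B, C)
        return scale[:, :, None] * hidden + bias[:, :, None]
    return hidden + params[:, :, None]                    # bias-only modulation


rng = np.random.default_rng(0)
B, d_x, d_local, d_global, T, C = 2, 3, 4, 5, 8, 6        # illustrative sizes
sample = rng.normal(size=(B, d_x, T))
local_cond = rng.normal(size=(B, d_local))
global_cond = rng.normal(size=(B, d_global))
hidden = rng.normal(size=(B, C, T))                       # stand-in for a conv block output
weight = rng.normal(size=(d_global, 2 * C))               # hypothetical linear layer

unet_input = concat_local(sample, local_cond)
modulated = film(hidden, global_cond, weight)
```

The local path changes what the network sees at its input, while FiLM rescales and shifts each channel of an intermediate feature map uniformly over time.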
- Parameters:
input_dim (int) – The dimension of the input data.
local_cond_dim (int, optional) – The dimension of the local conditioning vector. Defaults to None.
global_cond_dim (int, optional) – The dimension of the global conditioning vector. Defaults to None.
diffusion_step_embed_dim (int, optional) – The dimension of the diffusion step embedding. Defaults to 16.
down_dims (list, optional) – A list of channel dimensions for the downsampling path. Defaults to [32, 64].
kernel_size (int, optional) – The size of the convolutional kernel. Defaults to 3.
n_groups (int, optional) – The number of groups for GroupNorm. Defaults to 1.
cond_predict_scale (bool, optional) – Whether to predict the scale for conditioning. Defaults to True.
- forward(sample, timestep, local_cond=None, film_cond=None, **kwargs)
Forward pass of the Conditional Unet 1D model.
- Parameters:
sample (Tensor) – The input tensor of shape (B, input_dim, T).
timestep – The diffusion step (time) at which the model is evaluated.
local_cond (Tensor, optional) – The local conditioning tensor of shape (B, local_cond_dim). Defaults to None.
film_cond (Tensor, optional) – The global conditioning tensor of shape (B, global_cond_dim). Defaults to None.
kwargs – Additional keyword arguments reserved for API compatibility.
- Returns:
The output tensor of shape (B, input_dim, T).
- Return type: