normalization
- class symm_learning.nn.normalization.DataNorm(num_features: int, eps: float = 1e-06, only_centering: bool = False, compute_cov: bool = False, momentum: float = 1.0)[source]
Applies data normalization to a 2D or 3D tensor.
This module standardizes input data by centering (subtracting the mean) and optionally scaling (dividing by the standard deviation). The module supports multiple modes of operation controlled by its configuration parameters.
Mathematical Formulation:
The normalization is applied element-wise as:
\[\begin{split}y = \begin{cases} x - \mu & \text{if } \texttt{only\_centering} = \text{True} \\ \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} & \text{otherwise} \end{cases}\end{split}\]

where \(\mu\) is the mean, \(\sigma^2\) is the variance, and \(\epsilon\) is a small constant for numerical stability.
Mode of Operation:
This layer features a non-standard behavior during training. Unlike typical normalization layers (e.g., torch.nn.BatchNorm1d) that normalize using batch statistics, this layer normalizes the data using the running statistics that have been updated with the current batch's information.

During training:
1. Batch statistics (\(\mu_{\text{batch}}\), \(\sigma^2_{\text{batch}}\)) are computed from the input.
2. Running statistics (\(\mu_{\text{run}}\), \(\sigma^2_{\text{run}}\)) are updated using an exponential moving average: \(\text{running stat} = (1-\alpha) \cdot \text{running stat} + \alpha \cdot \text{batch stat}\), where \(\alpha\) is the momentum.
3. The input is then normalized using these newly updated running statistics. The loss therefore depends on the running statistics, with gradients flowing back through the batch-statistics component of the update, but not into the historical state of the running statistics from previous steps.

During evaluation: uses the final stored running statistics for normalization.

Special case: When momentum=1.0, the layer effectively uses batch statistics for normalization, becoming equivalent to a torch.nn.BatchNorm1d layer with track_running_stats=False.
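A minimal usage sketch on raw tensors (illustrative shapes and values; the properties shown are documented below):

>>> import torch
>>> from symm_learning.nn.normalization import DataNorm
>>>
>>> norm = DataNorm(num_features=8, momentum=0.1, compute_cov=True)
>>>
>>> # Training mode (the default): running statistics are updated with the batch,
>>> # then the freshly updated running statistics are used to normalize
>>> x = torch.randn(32, 8)  # (N, C)
>>> y = norm(x)
>>>
>>> # Tracked statistics are exposed as properties
>>> norm.mean.shape, norm.var.shape, norm.cov.shape
(torch.Size([8]), torch.Size([8]), torch.Size([8, 8]))
>>>
>>> # Evaluation: the stored running statistics are used as-is
>>> _ = norm.eval()
>>> y_eval = norm(x)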
- Parameters:
num_features (int) – Number of features or channels in the input tensor.
eps (float, optional) – Small constant added to the denominator for numerical stability. Only used when only_centering=False. Default: 1e-6.
only_centering (bool, optional) – If True, only centers the data (subtracts the mean) without scaling by the standard deviation. Default: False.
compute_cov (bool, optional) – If True, computes and tracks the full covariance matrix in addition to the mean and variance. Accessible via the cov property. Default: False.
momentum (float, optional) – Momentum factor for the exponential moving average of running statistics. Must be greater than 0. Setting it to 1.0 effectively uses only batch statistics. Default: 1.0.
- Shape:
Input: \((N, C)\) or \((N, C, L)\), where \(N\) is the batch size, \(C\) is the number of features (must equal num_features), and \(L\) is the sequence length (optional, for 3D inputs).
Output: Same shape as input.
- running_mean
Running average of input means. Shape: (num_features,).
- Type:
torch.Tensor
- running_var
Running average of input variances. Shape: (num_features,).
- Type:
torch.Tensor
- running_cov
Running average of the input covariance matrix. Shape: (num_features, num_features). Only available when compute_cov=True.
- Type:
torch.Tensor
- num_batches_tracked
Number of batches processed during training.
- Type:
torch.Tensor
Note
When using 3D inputs \((N, C, L)\), statistics are computed over both the batch dimension \(N\) and sequence dimension \(L\), treating each feature channel independently.
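For illustration, this per-channel reduction is equivalent to the following (a sketch of the statistic computation only, not necessarily the module's exact implementation):

>>> import torch
>>> x = torch.randn(16, 8, 50)  # (N, C, L)
>>> mu = x.mean(dim=(0, 2))  # mean over batch and length, per channel: shape (8,)
>>> var = x.var(dim=(0, 2), unbiased=False)  # per-channel variance (biased estimator assumed here): shape (8,)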
- property cov: Tensor
Return the current covariance matrix estimate.
- extra_repr() → str[source]
Return the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- property mean: Tensor
Return the current mean estimate.
- property std: Tensor
Return the current std estimate (computed from variance).
- property var: Tensor
Return the current variance estimate.
- class symm_learning.nn.normalization.eBatchNorm1d(in_type: FieldType, eps: float = 1e-05, momentum: float = 0.1, affine: bool = True, track_running_stats: bool = True)[source]
Applies Batch Normalization over a 2D or 3D symmetric input escnn.nn.GeometricTensor. The method is described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

The mean and standard deviation are calculated using symmetry-aware estimates (see var_mean()) over the mini-batches, and \(\gamma\) and \(\beta\) are the scale and bias vectors of an eAffine, which ensures that the affine transformation is symmetry-preserving. By default, the elements of \(\gamma\) are initialized to 1 and the elements of \(\beta\) are set to 0.

Also by default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.

If track_running_stats is set to False, this layer does not keep running estimates, and batch statistics are used during evaluation time as well.

Note
If the input tensor is of shape \((N, C, L)\), the implementation of this module computes a unique mean and variance for each feature or channel \(C\) and applies them to all the elements along the sequence length \(L\).
- Parameters:
in_type – the escnn.nn.FieldType of the input geometric tensor. The output type is the same as the input type.
eps – a value added to the denominator for numerical stability. Default: 1e-5
momentum – the value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e., simple average). Default: 0.1
affine – a boolean value that, when set to True, gives this module learnable affine parameters. Default: True
track_running_stats – a boolean value that, when set to True, makes this module track the running mean and variance; when set to False, this module does not track such statistics and initializes the statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics, in both training and eval modes. Default: True
- Shape:
Input: \((N, C)\) or \((N, C, L)\), where \(N\) is the batch size, \(C\) is the number of features or channels, and \(L\) is the sequence length
Output: \((N, C)\) or \((N, C, L)\) (same shape as input)
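A minimal usage sketch (the group and field-type setup mirrors the eDataNorm example later on this page):

>>> import torch
>>> from escnn import gspaces, nn as escnn_nn
>>> from escnn.group import CyclicGroup
>>> from symm_learning.nn.normalization import eBatchNorm1d
>>>
>>> G = CyclicGroup(4)
>>> gspace = gspaces.no_base_space(G)
>>> in_type = escnn_nn.FieldType(gspace, [G.regular_representation] * 2)
>>>
>>> bn = eBatchNorm1d(in_type, momentum=0.1, affine=True)
>>> x = in_type(torch.randn(32, in_type.size))  # wrap a raw tensor as a GeometricTensor
>>> y = bn(x)  # output has the same FieldType and shape as the input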
- check_equivariance(atol=1e-05, rtol=1e-05)[source]
Check the equivariance of the batch normalization layer.
- evaluate_output_shape(input_shape)[source]
Compute the shape of the output tensor that would be generated by this module when a tensor with shape input_shape is provided as input.
- Parameters:
input_shape (tuple) – shape of the input tensor
- Returns:
shape of the output tensor
- extra_repr() → str[source]
Return the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- forward(x: GeometricTensor)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class symm_learning.nn.normalization.eDataNorm(in_type: FieldType, eps: float = 1e-06, only_centering: bool = False, compute_cov: bool = False, momentum: float = 1.0)[source]
Equivariant version of DataNorm using group-theoretic symmetry-aware statistics.
This module extends DataNorm to work with equivariant data by computing statistics that respect the symmetry structure defined by a group representation. It maintains the same API and modes of operation as DataNorm while using symmetry-aware mean, variance, and covariance computations from symm_learning.stats.

Mathematical Formulation:
The equivariant normalization follows the same mathematical form as DataNorm:

\[\begin{split}y = \begin{cases} x - \mu_{\text{equiv}} & \text{if } \texttt{only\_centering} = \text{True} \\ \frac{x - \mu_{\text{equiv}}}{\sqrt{\sigma^2_{\text{equiv}} + \epsilon}} & \text{otherwise} \end{cases}\end{split}\]

However, the statistics \(\mu_{\text{equiv}}\) and \(\sigma^2_{\text{equiv}}\) are computed using symmetry-aware estimators:
Mean: Projected onto the \(G\)-invariant subspace
Variance: Constrained to be constant within each irreducible subspace
Covariance: Respects the block-diagonal structure imposed by the representation
Symmetry Properties:
The computed statistics satisfy the following equivariance and invariance properties (verified empirically in the sketch after this list):
\(\mathbb{E}[g \cdot x] = g \cdot \mathbb{E}[x]\) (mean equivariance)
\(\text{Var}[g \cdot x] = \text{Var}[x]\) (variance invariance)
\(\text{Cov}[g \cdot x, g \cdot y] = g \cdot \text{Cov}[x, y] \cdot g^T\) (covariance equivariance)
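These properties can be checked with check_equivariance() (a sketch; the setup follows the Examples section below):

>>> import torch
>>> from escnn import gspaces, nn as escnn_nn
>>> from escnn.group import CyclicGroup
>>> from symm_learning.nn.normalization import eDataNorm
>>>
>>> G = CyclicGroup(4)
>>> gspace = gspaces.no_base_space(G)
>>> in_type = escnn_nn.FieldType(gspace, [G.regular_representation] * 2)
>>>
>>> norm = eDataNorm(in_type)
>>> _ = norm(in_type(torch.randn(128, in_type.size)))  # update the symmetry-aware running statistics
>>> _ = norm.eval()
>>> _ = norm.check_equivariance(atol=1e-5)  # verifies norm(g . x) matches g . norm(x) over group elements g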
Input/Output Types:
Unlike DataNorm, which operates on raw tensors, eDataNorm processes escnn.nn.GeometricTensor objects that encode the group representation information along with the tensor data.
- Parameters:
in_type (escnn.nn.FieldType) – The field type defining the input’s group representation structure. The output type will be the same as the input type.
eps (float, optional) – Small constant added to the denominator for numerical stability. Only used when only_centering=False. Default: 1e-6.
only_centering (bool, optional) – If True, only centers the data using the equivariant mean, without scaling. Default: False.
compute_cov (bool, optional) – If True, computes and tracks the equivariant covariance matrix. Default: False.
momentum (float, optional) – Momentum factor for the exponential moving average of running statistics. Must be greater than 0. Setting it to 1.0 effectively uses only batch statistics. Default: 1.0.
- Shape:
Input: escnn.nn.GeometricTensor with tensor shape \((N, D)\) or \((N, D, L)\), where \(N\) is the batch size, \(D\) is in_type.size (the total representation dimension), and \(L\) is the sequence length (optional, for 3D inputs).
Output: escnn.nn.GeometricTensor with the same type and shape as the input.
- export() → DataNorm[source]
Exports the current state to a standard DataNorm layer that can operate on raw tensors, transferring all learned statistics.
Examples
>>> import torch
>>> from escnn import gspaces, nn as escnn_nn
>>> from escnn.group import CyclicGroup
>>> from symm_learning.nn.normalization import eDataNorm
>>>
>>> # Define group and representation
>>> G = CyclicGroup(4)
>>> gspace = gspaces.no_base_space(G)
>>> in_type = escnn_nn.FieldType(gspace, [G.regular_representation] * 2)
>>>
>>> # Create equivariant normalization layer
>>> norm = eDataNorm(in_type=in_type, compute_cov=True)
>>>
>>> # Process equivariant data
>>> x_tensor = torch.randn(16, in_type.size)  # Raw tensor data
>>> x_geom = in_type(x_tensor)  # Wrap in GeometricTensor
>>> y_geom = norm(x_geom)  # Normalized GeometricTensor
>>>
>>> # Export to standard DataNorm
>>> standard_norm = norm.export()
>>> y_tensor = standard_norm(x_tensor)  # Same result on raw tensor
Note
This layer inherits all modes of operation from DataNorm (running statistics, fixed statistics, centering-only, covariance computation) while computing all statistics under group-theoretic constraints. The statistics respect the irreducible decomposition of the input representation, ensuring that symmetries are preserved throughout the normalization process.

See also

DataNorm: The base normalization layer for standard (non-equivariant) data.
symm_learning.stats.var_mean(): Equivariant mean and variance computation.
symm_learning.stats.cov(): Equivariant covariance computation.

- check_equivariance(atol=1e-05, rtol=1e-05)[source]
Check the equivariance of the normalization layer.