LayerNorm¶
General¶
LayerNorm performs a layer normalization operation on the \(\src\) tensor.
The LayerNorm operation normalizes from begin_norm_axis
to the last dimension of the data tensor. It is defined by the following formula, which is the same as in Layer Normalization:
\(\dst(t, n, c) = \gamma(c) \cdot \frac{\src(t, n, c) - \mu(t, n)}{\sqrt{\sigma^2(t, n) + \epsilon}} + \beta(c)\),
where
\(\gamma(c), \beta(c)\) are optional scale and shift for a channel,
\(\mu(t, n), \sigma^2(t, n)\) are mean and variance (see the keep_stats attribute), and
\(\epsilon\) is a constant to improve numerical stability.
Mean and variance are computed at runtime or provided by a user. When mean and variance are computed at runtime, the following formulas are used:
\(\mu(t, n) = \frac{1}{C} \sum\limits_{c} \src(t, n, c)\),
\(\sigma^2(t, n) = \frac{1}{C} \sum\limits_{c} (\src(t, n, c) - \mu(t, n))^2\).
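The runtime statistics above, combined with the usual layer normalization step (subtract the mean, divide by \(\sqrt{\sigma^2 + \epsilon}\), then apply the optional scale and shift), can be sketched in plain Python. This is an illustrative reference only, not the library's implementation: the function name `layer_norm_rows` is hypothetical, each row stands for one \((t, n)\) position holding its \(C\) channel values, and the `epsilon=1e-5` default is an assumption made for the sketch.

```python
import math


def layer_norm_rows(src, gamma=None, beta=None, epsilon=1e-5):
    """Normalize every row of `src` over its last axis (the channel axis).

    `src` is a list of rows, one row per (t, n) position, each of length C.
    `gamma` and `beta` are optional length-C lists (scale and shift).
    Returns (dst, mean, variance) with one mean and variance per row.
    The epsilon default is an assumption for this sketch.
    """
    dst, means, variances = [], [], []
    for row in src:
        c = len(row)
        mu = sum(row) / c
        # Biased variance: divide by C, matching the formula above.
        var = sum((x - mu) ** 2 for x in row) / c
        inv_std = 1.0 / math.sqrt(var + epsilon)
        out = [(x - mu) * inv_std for x in row]
        if gamma is not None:
            out = [g * y for g, y in zip(gamma, out)]
        if beta is not None:
            out = [y + b for y, b in zip(out, beta)]
        dst.append(out)
        means.append(mu)
        variances.append(var)
    return dst, means, variances


# Two (t, n) positions with C = 3 channels each.
dst, mean, variance = layer_norm_rows([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
print(mean)  # [2.0, 4.0]
```

Returning the mean and variance alongside the result mirrors the case where the statistics are kept for a later backward pass; a caller that does not need them can ignore the last two values.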
Operation attributes¶
Attribute Name | Description | Value Type | Supported Values | Required or Optional |
---|---|---|---|---|
keep_stats | Indicates whether to output mean and variance, which can later be passed to the backward op. | bool | | Optional |
begin_norm_axis | The axis at which normalization starts; normalization runs from begin_norm_axis to the last dimension. | s64 | [-r, r-1], where r = rank(src). -1 is the default. | Optional |
use_affine | When set to True, this module has learnable per-element affine parameters. | bool | | Optional |
epsilon | The constant to improve numerical stability. | f32 | Arbitrary positive f32 value | Optional |
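The supported range [-r, r-1] for begin_norm_axis can be illustrated with a short Python sketch. It assumes that negative values count from the last dimension, so that the default of -1 selects the last axis; the helper names `resolve_begin_norm_axis` and `normalized_dims` are hypothetical and not part of any library API.

```python
def resolve_begin_norm_axis(begin_norm_axis, rank):
    """Map an axis in [-rank, rank - 1] to a non-negative axis index.

    Assumption: negative values index from the right, so -1 is the last axis.
    """
    if not -rank <= begin_norm_axis <= rank - 1:
        raise ValueError(
            f"begin_norm_axis {begin_norm_axis} is outside [-{rank}, {rank - 1}]"
        )
    return begin_norm_axis % rank


def normalized_dims(shape, begin_norm_axis=-1):
    """Return the trailing dimensions that normalization runs over."""
    axis = resolve_begin_norm_axis(begin_norm_axis, len(shape))
    return tuple(shape[axis:])


# For a (T, N, C) = (2, 3, 4) tensor:
print(normalized_dims((2, 3, 4)))      # (4,)    default: last dimension only
print(normalized_dims((2, 3, 4), 1))   # (3, 4)  from axis 1 to the last dimension
```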
Execution arguments¶
The inputs and outputs must be provided in the index order below when constructing an operation.
Inputs¶
Index | Argument Name | Required or Optional |
---|---|---|
0 | src | Required |
1 | gamma | Optional |
2 | beta | Optional |
Note
gamma is the scaling applied to the normalized value, and beta is the bias added to the scaled normalized value. Both are 1D tensors with the same span as the channel axis of src, and both are required if the use_affine attribute is set to True.
Outputs¶
Index | Argument Name | Required or Optional |
---|---|---|
0 | dst | Required |
1 | mean | Optional |
2 | variance | Optional |
Note
Both mean and variance are required if the keep_stats attribute is set to True.
Supported data types¶
The LayerNorm operation supports the following data type combinations.
Src / Dst | Gamma / Beta / Mean / Variance |
---|---|
f32 | f32 |
bf16 | f32, bf16 |
f16 | f32 |
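As a quick way to read the table, the combinations can be encoded as a lookup keyed by the Src / Dst type. This is a minimal sketch for checking a configuration before building an operation; the names `SUPPORTED_COMBINATIONS` and `is_supported` are hypothetical, and the data type strings simply mirror the table entries.

```python
# Src / Dst data type -> allowed Gamma / Beta / Mean / Variance data types,
# copied from the table above.
SUPPORTED_COMBINATIONS = {
    "f32": {"f32"},
    "bf16": {"f32", "bf16"},
    "f16": {"f32"},
}


def is_supported(src_dst_type, param_type):
    """Return True if the pair of data types appears in the table."""
    return param_type in SUPPORTED_COMBINATIONS.get(src_dst_type, set())


print(is_supported("bf16", "bf16"))  # True
print(is_supported("f16", "f16"))    # False
```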