Layer Normalization
Overview
A primitive to perform layer normalization.
// structs

struct dnnl_layer_normalization_desc_t;
struct dnnl::layer_normalization_backward;
struct dnnl::layer_normalization_forward;

// global functions

dnnl_status_t DNNL_API dnnl_layer_normalization_forward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
);

dnnl_status_t DNNL_API dnnl_layer_normalization_backward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* diff_data_desc,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
);
Detailed Documentation
A primitive to perform layer normalization.
Normalization is performed within the last logical dimension of the data tensor.
Both forward and backward propagation primitives support in-place operation; that is, src and dst can refer to the same memory for forward propagation, and diff_dst and diff_src can refer to the same memory for backward propagation.
The layer normalization primitive's computations can be controlled by specifying different dnnl::normalization_flags values. For example, layer normalization forward propagation can be configured either to compute the mean and variance or to take them as arguments. It can either perform scaling and shifting using gamma and beta parameters or not. Optionally, it can also perform a fused ReLU, which in the case of training also requires a workspace. A minimal sketch of these flag combinations follows at the end of this section.
See also:
Layer Normalization in the developer guide
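The flag behavior described above can be sketched in C. This is a minimal illustration, not code from this page: the f32 data type, tnc layout, tensor sizes, and the init_lnorm_descs helper name are assumptions, and status checks are elided for brevity.

#include "dnnl.h"

// Hypothetical helper contrasting two flag configurations.
static void init_lnorm_descs(void) {
    // {T, N, C} tensor; layer normalization is over the last dimension C.
    dnnl_memory_desc_t data_md;
    dnnl_dims_t dims = {16, 4, 256};
    dnnl_memory_desc_init_by_tag(&data_md, 3, dims, dnnl_f32, dnnl_tnc);

    dnnl_layer_normalization_desc_t ln_d;

    // Training: the primitive computes mean and variance itself;
    // no gamma/beta scaling or shifting is applied.
    dnnl_layer_normalization_forward_desc_init(&ln_d, dnnl_forward_training,
            &data_md, NULL, 1e-5f, dnnl_normalization_flags_none);

    // Inference: mean and variance are taken as arguments
    // (dnnl_use_global_stats) and gamma/beta are applied.
    dnnl_layer_normalization_forward_desc_init(&ln_d, dnnl_forward_inference,
            &data_md, NULL, 1e-5f,
            dnnl_use_global_stats | dnnl_use_scaleshift);
}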
Global Functions
dnnl_status_t DNNL_API dnnl_layer_normalization_forward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
)
Initializes a descriptor for a layer normalization forward propagation primitive.
Note
In-place operation is supported: the dst can refer to the same memory as the src.
Parameters:

lnrm_desc
    Output descriptor for a layer normalization primitive.

prop_kind
    Propagation kind. Possible values are dnnl_forward_training and dnnl_forward_inference.

data_desc
    Source and destination memory descriptor.

stat_desc
    Memory descriptor for mean and variance. If this parameter is NULL, a zero memory descriptor, or a memory descriptor with format_kind set to dnnl_format_kind_undef, then the memory descriptor for stats is derived from data_desc by removing the last dimension.

epsilon
    Layer normalization epsilon parameter.

flags
    Layer normalization flags (dnnl_normalization_flags_t).
Returns:
dnnl_success on success and a status describing the error otherwise.
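As a hedged end-to-end sketch of using this function (the engine kind, data type, tnc layout, tensor sizes, and the CHECK macro are illustrative assumptions, not part of this page), a forward training primitive could be created roughly as follows:

#include <stdio.h>
#include "dnnl.h"

// Collapse error handling into a single check for brevity.
#define CHECK(f) do { \
    dnnl_status_t s_ = (f); \
    if (s_ != dnnl_success) { printf("error: %d\n", (int)s_); return 1; } \
} while (0)

int main(void) {
    dnnl_engine_t engine;
    CHECK(dnnl_engine_create(&engine, dnnl_cpu, 0));

    // {T, N, C} tensor; layer normalization is over the last dimension C.
    dnnl_memory_desc_t data_md;
    dnnl_dims_t dims = {16, 4, 256};
    CHECK(dnnl_memory_desc_init_by_tag(
            &data_md, 3, dims, dnnl_f32, dnnl_tnc));

    // Pass NULL for stat_desc so the stats descriptor is derived
    // from data_desc; apply gamma/beta scaling and shifting.
    dnnl_layer_normalization_desc_t ln_d;
    CHECK(dnnl_layer_normalization_forward_desc_init(&ln_d,
            dnnl_forward_training, &data_md, NULL, 1e-5f,
            dnnl_use_scaleshift));

    dnnl_primitive_desc_t ln_pd;
    CHECK(dnnl_primitive_desc_create(&ln_pd, &ln_d, NULL, engine, NULL));

    dnnl_primitive_t ln;
    CHECK(dnnl_primitive_create(&ln, ln_pd));

    // ... execute with dnnl_primitive_execute(), then clean up:
    dnnl_primitive_destroy(ln);
    dnnl_primitive_desc_destroy(ln_pd);
    dnnl_engine_destroy(engine);
    return 0;
}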
dnnl_status_t DNNL_API dnnl_layer_normalization_backward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* diff_data_desc,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
)
Initializes a descriptor for a layer normalization backward propagation primitive.
Note
In-place operation is supported: the diff_dst can refer to the same memory as the diff_src.
Parameters:

lnrm_desc
    Output descriptor for a layer normalization primitive.

prop_kind
    Propagation kind. Possible values are dnnl_backward_data and dnnl_backward (diffs for all parameters are computed in this case).

diff_data_desc
    Diff source and diff destination memory descriptor.

data_desc
    Source memory descriptor.

stat_desc
    Memory descriptor for mean and variance. If this parameter is NULL, a zero memory descriptor, or a memory descriptor with format_kind set to dnnl_format_kind_undef, then the memory descriptor for stats is derived from data_desc by removing the last dimension.

epsilon
    Layer normalization epsilon parameter.

flags
    Layer normalization flags (dnnl_normalization_flags_t).
Returns:
dnnl_success on success and a status describing the error otherwise.
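A corresponding backward sketch, assuming an engine, a data memory descriptor, and a forward primitive descriptor created with dnnl_forward_training as in the forward example above; the make_lnorm_bwd_pd helper name is hypothetical. Backward primitive descriptors take the forward primitive descriptor as a creation hint:

#include "dnnl.h"

// Hypothetical helper: builds the backward primitive descriptor that
// matches a forward training setup. Returns NULL on error.
static dnnl_primitive_desc_t make_lnorm_bwd_pd(dnnl_engine_t engine,
        const dnnl_memory_desc_t *data_md, dnnl_primitive_desc_t fwd_pd) {
    // dnnl_backward computes diff_src plus diff_gamma / diff_beta.
    dnnl_layer_normalization_desc_t bwd_d;
    if (dnnl_layer_normalization_backward_desc_init(&bwd_d, dnnl_backward,
                data_md /* diff_data_desc */, data_md /* data_desc */,
                NULL /* derive the stats descriptor from data_desc */,
                1e-5f, dnnl_use_scaleshift)
            != dnnl_success)
        return NULL;

    // The forward primitive descriptor is passed as the hint.
    dnnl_primitive_desc_t bwd_pd;
    if (dnnl_primitive_desc_create(&bwd_pd, &bwd_d, NULL, engine, fwd_pd)
            != dnnl_success)
        return NULL;
    return bwd_pd;
}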