Layer Normalization

Overview

A primitive to perform layer normalization.

// structs

struct dnnl_layer_normalization_desc_t;
struct dnnl::layer_normalization_backward;
struct dnnl::layer_normalization_forward;

// global functions

dnnl_status_t DNNL_API dnnl_layer_normalization_forward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
    );

dnnl_status_t DNNL_API dnnl_layer_normalization_backward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* diff_data_desc,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
    );

Detailed Documentation

A primitive to perform layer normalization.

Normalization is performed within the last logical dimension of the data tensor.

Both forward and backward propagation primitives support in-place operation; that is, src and dst can refer to the same memory for forward propagation, and diff_dst and diff_src can refer to the same memory for backward propagation.
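The two paragraphs above can be illustrated with a minimal reference sketch in plain C++. It is not the oneDNN implementation and does not call the library; the function name layer_norm_inplace and the row-major buffer layout are assumptions made for illustration. It normalizes each row over the last dimension using the mean and the biased variance, and writes the result over the input, as in the in-place case where src and dst share memory.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Reference sketch (not oneDNN): layer normalization over the last logical
// dimension. `data` holds rows * channels floats in row-major order, where
// `channels` is the size of the last dimension and `rows` is the product of
// all other dimensions. The output overwrites the input (src == dst).
void layer_norm_inplace(float *data, std::size_t rows, std::size_t channels,
        float epsilon) {
    assert(data != nullptr && channels > 0);
    const float n = static_cast<float>(channels);
    for (std::size_t r = 0; r < rows; ++r) {
        float *row = data + r * channels;

        // Mean over the last dimension.
        float mean = 0.f;
        for (std::size_t c = 0; c < channels; ++c)
            mean += row[c];
        mean /= n;

        // Biased variance over the last dimension.
        float variance = 0.f;
        for (std::size_t c = 0; c < channels; ++c) {
            const float d = row[c] - mean;
            variance += d * d;
        }
        variance /= n;

        // Each element is read before it is overwritten, so sharing the
        // buffer between input and output is safe.
        const float inv_std = 1.f / std::sqrt(variance + epsilon);
        for (std::size_t c = 0; c < channels; ++c)
            row[c] = (row[c] - mean) * inv_std;
    }
}
```

For a tensor with dimensions {T, N, C}, this sketch would be called with rows = T * N and channels = C.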

The computations performed by the layer normalization primitives can be controlled by specifying different dnnl::normalization_flags values. For example, layer normalization forward propagation can be configured to either compute the mean and variance or take them as arguments. It can also optionally scale and shift the normalized result using the gamma and beta parameters.
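The effect of these two choices can be sketched in plain C++ as follows. The enum sketch_flags and the function layer_norm_row are hypothetical names used only for this illustration; they are not the values of dnnl::normalization_flags and the code does not call oneDNN.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical flags for this sketch only. They mirror the two choices
// described above and are NOT the values of dnnl::normalization_flags.
enum sketch_flags : unsigned {
    sketch_none = 0u,
    sketch_use_given_stats = 1u << 0, // take mean and variance as inputs
    sketch_use_scale_shift = 1u << 1, // apply gamma and beta
};

// Normalizes one row of `channels` values. Without sketch_use_given_stats,
// *mean and *variance are computed from src and written back; with it, they
// are only read. gamma and beta are used only with sketch_use_scale_shift.
void layer_norm_row(const float *src, float *dst, std::size_t channels,
        float epsilon, unsigned flags, float *mean, float *variance,
        const float *gamma, const float *beta) {
    assert(src && dst && mean && variance && channels > 0);
    if (!(flags & sketch_use_given_stats)) {
        const float n = static_cast<float>(channels);
        float m = 0.f;
        for (std::size_t c = 0; c < channels; ++c)
            m += src[c];
        m /= n;
        float v = 0.f;
        for (std::size_t c = 0; c < channels; ++c)
            v += (src[c] - m) * (src[c] - m);
        v /= n;
        *mean = m;
        *variance = v;
    }

    const bool scale_shift = (flags & sketch_use_scale_shift) != 0;
    assert(!scale_shift || (gamma && beta));

    const float inv_std = 1.f / std::sqrt(*variance + epsilon);
    for (std::size_t c = 0; c < channels; ++c) {
        const float normalized = (src[c] - *mean) * inv_std;
        dst[c] = scale_shift ? gamma[c] * normalized + beta[c] : normalized;
    }
}
```

Passing sketch_none makes the function compute the statistics and return them through mean and variance, which roughly corresponds to a training-style forward pass; passing sketch_use_given_stats reuses statistics supplied by the caller.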

See also:

Layer Normalization in developer guide

Global Functions

dnnl_status_t DNNL_API dnnl_layer_normalization_forward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
    )

Initializes a descriptor for a layer normalization forward propagation primitive.

Note

In-place operation is supported: the dst can refer to the same memory as the src.

Parameters:

lnrm_desc

Output descriptor for the layer normalization primitive.

prop_kind

Propagation kind. Possible values are dnnl_forward_training and dnnl_forward_inference.

data_desc

Source and destination memory descriptor.

stat_desc

Memory descriptor for mean and variance. If this parameter is NULL, a zero memory descriptor, or a memory descriptor with format_kind set to dnnl_format_kind_undef, then the memory descriptor for stats is derived from data_desc by removing the last dimension.

epsilon

Layer normalization epsilon parameter.

flags

Layer normalization flags (dnnl_normalization_flags_t).

Returns:

dnnl_success on success and a status describing the error otherwise.
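The default described for stat_desc, where the statistics shape is derived from data_desc by removing the last dimension, can be sketched on plain dimension vectors. The helper derive_stat_dims is a hypothetical name used for illustration; it operates on std::vector rather than on dnnl_memory_desc_t and does not call oneDNN.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: derives the dimensions of the mean and variance
// tensors from the data dimensions by dropping the last one,
// for example {T, N, C} -> {T, N}.
std::vector<std::int64_t> derive_stat_dims(
        const std::vector<std::int64_t> &data_dims) {
    assert(!data_dims.empty());
    return std::vector<std::int64_t>(data_dims.begin(), data_dims.end() - 1);
}
```

Under this rule, one mean and one variance value exist for every position in the dimensions that precede the normalized one.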

dnnl_status_t DNNL_API dnnl_layer_normalization_backward_desc_init(
    dnnl_layer_normalization_desc_t* lnrm_desc,
    dnnl_prop_kind_t prop_kind,
    const dnnl_memory_desc_t* diff_data_desc,
    const dnnl_memory_desc_t* data_desc,
    const dnnl_memory_desc_t* stat_desc,
    float epsilon,
    unsigned flags
    )

Initializes a descriptor for a layer normalization backward propagation primitive.

Note

In-place operation is supported: the diff_dst can refer to the same memory as the diff_src.

Parameters:

lnrm_desc

Output descriptor for the layer normalization primitive.

prop_kind

Propagation kind. Possible values are dnnl_backward_data and dnnl_backward (diffs for all parameters are computed in this case).

diff_data_desc

Diff source and diff destination memory descriptor.

data_desc

Source memory descriptor.

stat_desc

Memory descriptor for mean and variance. If this parameter is NULL, a zero memory descriptor, or a memory descriptor with format_kind set to dnnl_format_kind_undef, then the memory descriptor for stats is derived from data_desc by removing the last dimension.

epsilon

Layer normalization epsilon parameter.

flags

Layer normalization flags (dnnl_normalization_flags_t).

Returns:

dnnl_success on success and a status describing the error otherwise.
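The difference between the two propagation kinds can be illustrated with a reference sketch of the standard layer normalization gradient in plain C++. It is not the oneDNN implementation: the function name layer_norm_backward, the row-major layout, and the convention that null diff_gamma and diff_beta pointers mean "diff_src only" are assumptions made for illustration. The sketch also assumes that the mean and variance are computed from src rather than supplied by the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Reference sketch (not oneDNN). src, diff_dst and diff_src hold
// rows * channels floats in row-major order. gamma may be null (no scaling).
// If diff_gamma and diff_beta are both non-null, parameter diffs are also
// accumulated over rows; otherwise only diff_src is computed.
void layer_norm_backward(const float *src, const float *diff_dst,
        const float *gamma, std::size_t rows, std::size_t channels,
        float epsilon, float *diff_src, float *diff_gamma, float *diff_beta) {
    assert(src && diff_dst && diff_src && channels > 0);
    const bool param_diffs = diff_gamma && diff_beta;
    const float n = static_cast<float>(channels);
    if (param_diffs)
        for (std::size_t c = 0; c < channels; ++c)
            diff_gamma[c] = diff_beta[c] = 0.f;

    for (std::size_t r = 0; r < rows; ++r) {
        const float *x = src + r * channels;
        const float *dd = diff_dst + r * channels;
        float *dx = diff_src + r * channels;

        // Recompute the row statistics from src.
        float mean = 0.f, variance = 0.f;
        for (std::size_t c = 0; c < channels; ++c)
            mean += x[c];
        mean /= n;
        for (std::size_t c = 0; c < channels; ++c)
            variance += (x[c] - mean) * (x[c] - mean);
        variance /= n;
        const float inv_std = 1.f / std::sqrt(variance + epsilon);

        // Row-wise reductions of the scaled gradient g = diff_dst * gamma.
        float sum_g = 0.f, sum_g_xhat = 0.f;
        for (std::size_t c = 0; c < channels; ++c) {
            const float xhat = (x[c] - mean) * inv_std;
            const float g = dd[c] * (gamma ? gamma[c] : 1.f);
            sum_g += g;
            sum_g_xhat += g * xhat;
            if (param_diffs) {
                diff_gamma[c] += dd[c] * xhat;
                diff_beta[c] += dd[c];
            }
        }

        // dd[c] is read before dx[c] is written, so diff_dst and diff_src
        // may share memory in this sketch.
        for (std::size_t c = 0; c < channels; ++c) {
            const float xhat = (x[c] - mean) * inv_std;
            const float g = dd[c] * (gamma ? gamma[c] : 1.f);
            dx[c] = inv_std * (g - sum_g / n - xhat * sum_g_xhat / n);
        }
    }
}
```

Calling the sketch with null diff_gamma and diff_beta loosely corresponds to dnnl_backward_data, while passing both buffers loosely corresponds to dnnl_backward.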