oneAPI Deep Neural Network Library (oneDNN)
Performance library for Deep Learning
1.96.0
Attributes

A container for parameters that extend the behavior of primitives.

Classes

struct  dnnl::post_ops
 Post-ops.
 
struct  dnnl::primitive_attr
 Primitive attributes.
 
struct  dnnl_primitive_attr
 An opaque structure for primitive descriptor attributes.
 
struct  dnnl_post_ops
 An opaque structure for a chain of post operations.
 

Typedefs

typedef struct dnnl_primitive_attr * dnnl_primitive_attr_t
 A primitive descriptor attributes handle that controls primitive behavior.
 
typedef const struct dnnl_primitive_attr * const_dnnl_primitive_attr_t
 A constant primitive descriptor attributes handle.
 
typedef struct dnnl_post_ops * dnnl_post_ops_t
 A post operation chain handle.
 
typedef const struct dnnl_post_ops * const_dnnl_post_ops_t
 A constant post operation chain handle.
 

Enumerations

enum  dnnl::scratchpad_mode
 Scratchpad mode.
 
enum  dnnl::prop_kind
 Propagation kind.
 
enum  dnnl::algorithm
 Kinds of algorithms.
 
enum  dnnl_scratchpad_mode_t
 Scratchpad mode.
 

Functions

dnnl_status_t DNNL_API dnnl_primitive_attr_create (dnnl_primitive_attr_t *attr)
 Creates empty (default) primitive attributes with all parameters set to their default values.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_clone (dnnl_primitive_attr_t *attr, const_dnnl_primitive_attr_t existing_attr)
 Clones primitive attributes.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_destroy (dnnl_primitive_attr_t attr)
 Destroys primitive attributes.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_scratchpad_mode (const_dnnl_primitive_attr_t attr, dnnl_scratchpad_mode_t *mode)
 Returns the primitive attributes scratchpad mode.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_scratchpad_mode (dnnl_primitive_attr_t attr, dnnl_scratchpad_mode_t mode)
 Sets primitive attributes scratchpad mode.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_output_scales (const_dnnl_primitive_attr_t attr, dnnl_dim_t *count, int *mask, const float **scales)
 Returns primitive attributes output scaling factors correspondence mask and values.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_output_scales (dnnl_primitive_attr_t attr, dnnl_dim_t count, int mask, const float *scales)
 Sets output scaling factors correspondence mask and values.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_scales (dnnl_primitive_attr_t attr, int arg, dnnl_dim_t *count, int *mask, const float **scales)
 Returns primitive attributes scaling factors correspondence mask and values for a given memory argument.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_scales (dnnl_primitive_attr_t attr, int arg, dnnl_dim_t count, int mask, const float *scales)
 Sets primitive attributes scaling factors for primitive operations for a given memory argument.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_zero_points (const_dnnl_primitive_attr_t attr, int arg, dnnl_dim_t *count, int *mask, const int32_t **zero_points)
 Returns count, correspondence zero point mask, and a pointer to a constant int32_t array of zero_points for given attr and memory argument (index), previously set by dnnl_primitive_attr_set_zero_points.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_zero_points (dnnl_primitive_attr_t attr, int arg, dnnl_dim_t count, int mask, const int32_t *zero_points)
 Sets primitive attributes zero points for primitive operations for a given memory argument.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_post_ops (const_dnnl_primitive_attr_t attr, const_dnnl_post_ops_t *post_ops)
 Returns primitive attributes post-ops.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_post_ops (dnnl_primitive_attr_t attr, const_dnnl_post_ops_t post_ops)
 Sets primitive attributes post-ops.
 
dnnl_status_t DNNL_API dnnl_post_ops_create (dnnl_post_ops_t *post_ops)
 Creates an empty post-ops sequence.
 
dnnl_status_t DNNL_API dnnl_post_ops_destroy (dnnl_post_ops_t post_ops)
 Destroys post-ops.
 
int DNNL_API dnnl_post_ops_len (const_dnnl_post_ops_t post_ops)
 Returns the length of post-ops.
 
dnnl_primitive_kind_t DNNL_API dnnl_post_ops_get_kind (const_dnnl_post_ops_t post_ops, int index)
 Returns the kind of a post-op entry.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_sum (dnnl_post_ops_t post_ops, float scale)
 Appends an accumulation (sum) to post-ops.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_sum_v2 (dnnl_post_ops_t post_ops, float scale, dnnl_data_type_t data_type)
 Appends an accumulation v2 (sum) to post-ops.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_sum (const_dnnl_post_ops_t post_ops, int index, float *scale)
 Returns the parameters of an accumulation (sum) post-op.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_sum_v2 (const_dnnl_post_ops_t post_ops, int index, float *scale, dnnl_data_type_t *data_type)
 Returns the parameters of an accumulation (sum) post-op with a data type parameter.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_eltwise (dnnl_post_ops_t post_ops, float scale, dnnl_alg_kind_t alg_kind, float alpha, float beta)
 Appends an elementwise post-op.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_eltwise (const_dnnl_post_ops_t post_ops, int index, float *scale, dnnl_alg_kind_t *alg_kind, float *alpha, float *beta)
 Returns the parameters of an elementwise post-op.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_dw_k3s1p1 (dnnl_post_ops_t post_ops, dnnl_data_type_t weights_data_type, dnnl_data_type_t bias_data_type, dnnl_data_type_t dst_data_type, dnnl_dim_t count, int mask, const float *scales)
 Appends a depthwise post-op convolution with stride 1.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_dw_k3s1p1 (const_dnnl_post_ops_t post_ops, int index, dnnl_data_type_t *weights_data_type, dnnl_data_type_t *bias_data_type, dnnl_data_type_t *dst_data_type, dnnl_dim_t *count, int *mask, const float **scales)
 Returns the parameters of a depthwise post-op with stride 1.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_dw_k3s2p1 (dnnl_post_ops_t post_ops, dnnl_data_type_t weights_data_type, dnnl_data_type_t bias_data_type, dnnl_data_type_t dst_data_type, dnnl_dim_t count, int mask, const float *scales)
 Appends a depthwise post-op convolution with stride 2.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_dw_k3s2p1 (const_dnnl_post_ops_t post_ops, int index, dnnl_data_type_t *weights_data_type, dnnl_data_type_t *bias_data_type, dnnl_data_type_t *dst_data_type, dnnl_dim_t *count, int *mask, const float **scales)
 Returns the parameters of a depthwise post-op with stride 2.
 
dnnl_status_t DNNL_API dnnl_post_ops_append_binary (dnnl_post_ops_t post_ops, dnnl_alg_kind_t alg_kind, const dnnl_memory_desc_t *src1_desc)
 Appends a binary post-op.
 
dnnl_status_t DNNL_API dnnl_post_ops_get_params_binary (const_dnnl_post_ops_t post_ops, int index, dnnl_alg_kind_t *alg_kind, const dnnl_memory_desc_t **src1_desc)
 Returns the parameters of a binary post-op.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_data_qparams (dnnl_primitive_attr_t attr, const float scale, const float shift)
 Sets quantization scale and shift parameters for RNN data tensors.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_data_qparams (const_dnnl_primitive_attr_t attr, float *scale, float *shift)
 Returns the quantization scale and shift parameters for RNN data tensors.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_weights_qparams (dnnl_primitive_attr_t attr, dnnl_dim_t count, int mask, const float *scales)
 Sets quantization scaling factors for RNN weights tensors.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_weights_qparams (const_dnnl_primitive_attr_t attr, dnnl_dim_t *count, int *mask, const float **scales)
 Returns the quantization scaling factors for RNN weights tensors.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_weights_projection_qparams (dnnl_primitive_attr_t attr, dnnl_dim_t count, int mask, const float *scales)
 Sets quantization scaling factors for RNN projection weights tensors.
 
dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_weights_projection_qparams (const_dnnl_primitive_attr_t attr, dnnl_dim_t *count, int *mask, const float **scales)
 Returns the quantization scaling factors for RNN projection weights tensors.
 
dnnl_scratchpad_mode_t dnnl::convert_to_c (scratchpad_mode mode)
 Converts a scratchpad mode enum value from C++ API to C API type.
 
dnnl_prop_kind_t dnnl::convert_to_c (prop_kind akind)
 Converts propagation kind enum value from C++ API to C API type.
 
dnnl_alg_kind_t dnnl::convert_to_c (algorithm aalgorithm)
 Converts algorithm kind enum value from C++ API to C API type.
 

Detailed Description

A container for parameters that extend the behavior of primitives.

Attributes can also contain Post-ops, which are computations executed after the primitive.

See also
Primitive Attributes
Primitive Attributes: Post-ops

Typedef Documentation

◆ dnnl_primitive_attr_t

A primitive descriptor attributes handle that controls primitive behavior.

Enumeration Type Documentation

◆ scratchpad_mode

enum class dnnl::scratchpad_mode

Scratchpad mode.

Enumerator
library 

The library manages the scratchpad allocation according to the policy specified by the DNNL_ENABLE_CONCURRENT_EXEC build option. This is the default scratchpad mode.

When DNNL_ENABLE_CONCURRENT_EXEC=OFF (default), the library scratchpad is common to all primitives to reduce the memory footprint. This configuration comes with limited thread-safety properties, namely primitives can be created and executed in parallel but cannot migrate between threads (in other words, each primitive should be executed in the same thread it was created in).

When DNNL_ENABLE_CONCURRENT_EXEC=ON, the library scratchpad is private to each primitive. The memory footprint is larger than when using DNNL_ENABLE_CONCURRENT_EXEC=OFF but different primitives can be created and run concurrently (the same primitive cannot be run concurrently from two different threads though).

user 

The user manages the scratchpad allocation by querying and providing the scratchpad memory to primitives.

This mode is thread-safe as long as the scratchpad buffers are not used concurrently by two primitive executions.

◆ prop_kind

enum class dnnl::prop_kind

Propagation kind.

Enumerator
undef 

Undefined propagation kind.

forward_training 

Forward data propagation (training mode).

In this mode, primitives perform computations necessary for subsequent backward propagation.

forward_inference 

Forward data propagation (inference mode).

In this mode, primitives perform only computations that are necessary for inference and omit computations that are necessary only for backward propagation.

forward_scoring 

Forward data propagation, alias for dnnl::prop_kind::forward_inference.

forward 

Forward data propagation, alias for dnnl::prop_kind::forward_training.

backward 

Backward propagation (with respect to all parameters).

backward_data 

Backward data propagation.

backward_weights 

Backward weights propagation.

backward_bias 

Backward bias propagation.

◆ algorithm

enum class dnnl::algorithm

Kinds of algorithms.

Enumerator
undef 

Undefined algorithm.

convolution_auto 

Convolution algorithm that is chosen to be either direct or Winograd automatically.

convolution_direct 

Direct convolution.

convolution_winograd 

Winograd convolution.

deconvolution_direct 

Direct deconvolution.

deconvolution_winograd 

Winograd deconvolution.

eltwise_relu 

Elementwise: rectified linear unit (ReLU)

eltwise_tanh 

Elementwise: hyperbolic tangent non-linearity (tanh)

eltwise_elu 

Elementwise: exponential linear unit (ELU)

eltwise_square 

Elementwise: square.

eltwise_abs 

Elementwise: abs.

eltwise_sqrt 

Elementwise: square root.

eltwise_swish 

Elementwise: swish ( \(x \cdot sigmoid(a \cdot x)\))

eltwise_linear 

Elementwise: linear.

eltwise_bounded_relu 

Elementwise: bounded_relu.

eltwise_soft_relu 

Elementwise: soft_relu.

eltwise_logistic 

Elementwise: logistic.

eltwise_exp 

Elementwise: exponent.

eltwise_gelu 

Elementwise: gelu, alias for dnnl::algorithm::eltwise_gelu_tanh.

eltwise_gelu_tanh 

Elementwise: tanh-based gelu.

eltwise_gelu_erf 

Elementwise: erf-based gelu.

eltwise_log 

Elementwise: natural logarithm.

eltwise_clip 

Elementwise: clip.

eltwise_pow 

Elementwise: pow.

eltwise_round 

Elementwise: round.

eltwise_relu_use_dst_for_bwd 

Elementwise: rectified linear unit (ReLU) (dst for backward)

eltwise_tanh_use_dst_for_bwd 

Elementwise: hyperbolic tangent non-linearity (tanh) (dst for backward)

eltwise_elu_use_dst_for_bwd 

Elementwise: exponential linear unit (ELU) (dst for backward)

eltwise_sqrt_use_dst_for_bwd 

Elementwise: square root (dst for backward)

eltwise_logistic_use_dst_for_bwd 

Elementwise: logistic (dst for backward)

eltwise_exp_use_dst_for_bwd 

Elementwise: exponent (dst for backward)

lrn_across_channels 

Local response normalization (LRN) across multiple channels.

lrn_within_channel 

LRN within a single channel.

pooling_max 

Max pooling.

pooling_avg 

Average pooling exclude padding, alias for dnnl::algorithm::pooling_avg_exclude_padding.

pooling_avg_include_padding 

Average pooling include padding.

pooling_avg_exclude_padding 

Average pooling exclude padding.

vanilla_rnn 

RNN cell.

vanilla_lstm 

LSTM cell.

vanilla_gru 

GRU cell.

lbr_gru 

GRU cell with linear before reset.

Differs from the vanilla GRU in how the new memory gate is calculated: \(c_t = tanh(W_c*x_t + b_{c_x} + r_t*(U_c*h_{t-1}+b_{c_h})) \). LBR GRU expects 4 bias tensors on input: \([b_{u}, b_{r}, b_{c_x}, b_{c_h}]\).

binary_add 

Binary add.

binary_mul 

Binary mul.

binary_max 

Binary max.

binary_min 

Binary min.

binary_div 

Binary div.

binary_sub 

Binary sub.

resampling_nearest 

Nearest Neighbor resampling method.

resampling_linear 

Linear (Bilinear, Trilinear) resampling method.

reduction_max 

Reduction using max operation.

reduction_min 

Reduction using min operation.

reduction_sum 

Reduction using sum operation.

reduction_mul 

Reduction using mul operation.

reduction_mean 

Reduction using mean operation.

reduction_norm_lp_max 

Reduction using norm_lp_max operation.

reduction_norm_lp_sum 

Reduction using norm_lp_sum operation.

reduction_norm_lp_power_p_max 

Reduction using norm_lp_power_p_max operation.

reduction_norm_lp_power_p_sum 

Reduction using norm_lp_power_p_sum operation.

◆ dnnl_scratchpad_mode_t

Scratchpad mode.

Enumerator
dnnl_scratchpad_mode_library 

The library manages the scratchpad allocation according to the policy specified by the DNNL_ENABLE_CONCURRENT_EXEC build option. This is the default scratchpad mode.

When DNNL_ENABLE_CONCURRENT_EXEC=OFF (default), the library scratchpad is common to all primitives to reduce the memory footprint. This configuration comes with limited thread-safety properties, namely primitives can be created and executed in parallel but cannot migrate between threads (in other words, each primitive should be executed in the same thread it was created in).

When DNNL_ENABLE_CONCURRENT_EXEC=ON, the library scratchpad is private to each primitive. The memory footprint is larger than when using DNNL_ENABLE_CONCURRENT_EXEC=OFF but different primitives can be created and run concurrently (the same primitive cannot be run concurrently from two different threads though).

dnnl_scratchpad_mode_user 

The user manages the scratchpad allocation by querying and providing the scratchpad memory to primitives.

This mode is thread-safe as long as the scratchpad buffers are not used concurrently by two primitive executions.

Function Documentation

◆ dnnl_primitive_attr_create()

dnnl_status_t DNNL_API dnnl_primitive_attr_create ( dnnl_primitive_attr_t *  attr)

Creates empty (default) primitive attributes with all parameters set to their default values.

Empty attributes are implied whenever the respective argument is NULL.

Parameters
attr  Output primitive attributes.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_clone()

dnnl_status_t DNNL_API dnnl_primitive_attr_clone ( dnnl_primitive_attr_t *  attr,
const_dnnl_primitive_attr_t  existing_attr 
)

Clones primitive attributes.

Parameters
attr  Output primitive attributes.
existing_attr  Primitive attributes to clone.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_destroy()

dnnl_status_t DNNL_API dnnl_primitive_attr_destroy ( dnnl_primitive_attr_t  attr)

Destroys primitive attributes.

Parameters
attr  Primitive attributes to destroy.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_get_scratchpad_mode()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_scratchpad_mode ( const_dnnl_primitive_attr_t  attr,
dnnl_scratchpad_mode_t *  mode 
)

Returns the primitive attributes scratchpad mode.

Parameters
attr  Primitive attributes.
mode  Output scratchpad mode.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_scratchpad_mode()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_scratchpad_mode ( dnnl_primitive_attr_t  attr,
dnnl_scratchpad_mode_t  mode 
)

Sets primitive attributes scratchpad mode.

Parameters
attr  Primitive attributes.
mode  Scratchpad mode. The possible values are: dnnl_scratchpad_mode_library (default) and dnnl_scratchpad_mode_user.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_get_output_scales()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_output_scales ( const_dnnl_primitive_attr_t  attr,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns primitive attributes output scaling factors correspondence mask and values.

Warning
The scales array is an internal part of the primitive attributes attr, so it is an error to modify or destroy the scales array.
The lifetime of scales array is the same as that of the primitive attributes attr to which it belongs, so it is an error to use scales after attr is destroyed.
Parameters
attr  Primitive attributes.
count  Output length of the array of scaling factors scales.
mask  Output scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common output scaling factor for the whole output tensor.
scales  Output pointer to a constant array of scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.
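
The correspondence mask can be read as a rule for picking one entry of the scales array per tensor element. The following is a minimal plain-C sketch of that rule, not a oneDNN call: the helper name scale_offset_for_index is hypothetical, and it assumes the scaling factors are stored densely in row-major order over the dimensions selected by the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of oneDNN): returns the position in the
 * scales array that applies to the tensor element at logical index idx.
 * Assumption: scales are stored densely, row-major over the masked dims. */
int64_t scale_offset_for_index(const int64_t *dims, const int64_t *idx,
        int ndims, int mask) {
    assert(ndims > 0 && ndims < 31);
    int64_t offset = 0; /* mask == 0: every element uses scales[0] */
    for (int d = 0; d < ndims; ++d) {
        if (mask & (1 << d)) {
            assert(idx[d] >= 0 && idx[d] < dims[d]);
            /* dimension d has a dedicated factor per index */
            offset = offset * dims[d] + idx[d];
        }
    }
    return offset;
}
```

With dims (n, c, h, w) = (32, 32, 14, 14) and mask 1 << 1, the element at (3, 5, 7, 9) uses the factor at position 5, that is, the factor for its channel.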

◆ dnnl_primitive_attr_set_output_scales()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_output_scales ( dnnl_primitive_attr_t  attr,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Sets output scaling factors correspondence mask and values.

Note
The order of dimensions does not depend on how elements are laid out in memory. For example:
  • for a 2D CNN activations tensor the order is always (n, c)
  • for a 4D CNN activations tensor the order is always (n, c, h, w)
  • for a 5D CNN weights tensor the order is always (g, oc, ic, kh, kw)

Example usage:

int mb = 32, oc = 32, oh = 14, ow = 14; // convolution output params
float scales[32] = { ... }; // oc unique output scales, one per output channel
int oc_dim = 1; // mb_dim = 0, channel_dim = 1, height_dim = 2, ...
dnnl_convolution_desc_t conv_d; // create a convolution descriptor
dnnl_primitive_attr_t attr;
dnnl_primitive_attr_create(&attr); // create primitive attributes
dnnl_primitive_attr_set_output_scales(attr, oc, 1 << oc_dim, scales);
dnnl_primitive_desc_t conv_pd;
// engine is assumed to be a previously created dnnl_engine_t
dnnl_primitive_desc_create(&conv_pd, &conv_d, attr, engine, NULL);
Parameters
attr  Primitive attributes.
count  Length of the array of scaling factors scales.
mask  Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common output scaling factor for the whole output tensor.
scales  Array of output scaling factors. If the output scaling factors are known at the time of this call, this array must contain count values and the following equality must hold:

\[count = \prod\limits_{d \in mask} output.dims[d].\]

Violations can only be detected when the attributes are used to create a primitive descriptor. If the output scaling factors are not known at the time of the call, this array must contain a single DNNL_RUNTIME_F32_VAL value and the output scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_OUTPUT_SCALES.
Returns
dnnl_success on success and a status describing the error otherwise.
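
The equality between count and the masked output dimensions can be checked before the attributes are used to create a primitive descriptor. The following plain-C sketch computes the expected count; the helper name scales_count_for_mask is hypothetical and is not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of oneDNN): product of dims[d] over every
 * dimension d whose bit is set in mask. */
int64_t scales_count_for_mask(const int64_t *dims, int ndims, int mask) {
    assert(ndims > 0 && ndims < 31);
    int64_t count = 1; /* mask == 0: a single common scaling factor */
    for (int d = 0; d < ndims; ++d) {
        if (mask & (1 << d)) count *= dims[d];
    }
    return count;
}
```

For the (n, c, h, w) = (32, 32, 14, 14) output used in the example above, mask 1 << 1 gives a count of 32, which matches the oc value passed to dnnl_primitive_attr_set_output_scales.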

◆ dnnl_primitive_attr_get_scales()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_scales ( dnnl_primitive_attr_t  attr,
int  arg,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns primitive attributes scaling factors correspondence mask and values for a given memory argument.

Warning
The output scales array is an internal part of the primitive attributes attr, so it is an error to modify or destroy the scales array.
The lifetime of the scales array is the same as that of the primitive attributes attr to which it belongs, so it is an error to use scales after attr is destroyed.
Parameters
attr  Primitive attributes.
arg  Parameter argument index as passed to the dnnl_primitive_execute() call.
count  Output length of the array of scaling factors scales.
mask  Output scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common scaling factor for the whole output tensor.
scales  Output pointer to a constant array of float scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_scales()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_scales ( dnnl_primitive_attr_t  attr,
int  arg,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Sets primitive attributes scaling factors for primitive operations for a given memory argument.

See also
dnnl_primitive_attr_set_output_scales
Parameters
attr  Primitive attributes.
arg  Parameter argument index as passed to the dnnl_primitive_execute() call.
count  Length of the array of scaling factors scales.
mask  Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales array. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales  Constant array of float scaling factors. This array must contain count scales and the following equality must hold:

\[count = \prod\limits_{d \in mask} output.dims[d].\]

Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_get_zero_points()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_zero_points ( const_dnnl_primitive_attr_t  attr,
int  arg,
dnnl_dim_t *  count,
int *  mask,
const int32_t **  zero_points 
)

Returns count, correspondence zero point mask, and a pointer to a constant int32_t array of zero_points for given attr and memory argument (index), previously set by dnnl_primitive_attr_set_zero_points.

Warning
The output zero_points array is an internal part of the primitive attributes attr, so it is an error to modify or destroy the zero_points array.
The lifetime of zero_points array is the same as that of the primitive attributes attr to which it belongs, so it is an error to use zero_points after attr is destroyed.
Parameters
attr  Primitive attributes.
arg  Parameter argument index as passed to the dnnl_primitive_execute() call.
count  Output length of the array of zero points zero_points.
mask  Output zero points correspondence mask that defines the correspondence between the output tensor dimensions and the zero_points array. The set i-th bit indicates that a dedicated output zero point is used for each index along that dimension. The mask value of 0 implies a common zero point for the whole output tensor.
zero_points  Output pointer to a constant array of int32_t zero points.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_zero_points()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_zero_points ( dnnl_primitive_attr_t  attr,
int  arg,
dnnl_dim_t  count,
int  mask,
const int32_t *  zero_points 
)

Sets primitive attributes zero points for primitive operations for a given memory argument.

See also
dnnl_primitive_attr_set_output_scales
Parameters
attr  Primitive attributes.
arg  Parameter argument index as passed to the dnnl_primitive_execute() call.
count  Length of the array of zero points zero_points.
mask  Zero point correspondence mask that defines the correspondence between the tensor dimensions and the zero_points array. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor.
zero_points  Constant array of int32_t zero points. If the zero points are known at the time of this call, this array must contain count zero points and the following equality must hold:

\[count = \prod\limits_{d \in mask} output.dims[d].\]

If the zero points are not known at the time of the call, this array must contain a single DNNL_RUNTIME_S32_VAL and the zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS.
Returns
dnnl_success on success and a status describing the error otherwise.
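
The difference between a common zero point (mask 0) and a dedicated zero point per channel can be modeled in plain C. This is only a sketch under stated assumptions, not the library's implementation: the function name subtract_zero_points is hypothetical, the buffer is assumed to hold n_pixels * channels values with the channel index varying fastest, and the shift q - zero_point follows the common affine quantization convention, which this page does not itself spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain-C model, not a oneDNN call. Assumes the affine convention
 * shifted = q - zero_point and a buffer of n_pixels * channels values
 * with the channel index varying fastest. */
void subtract_zero_points(const int8_t *q, int32_t *out, size_t n_pixels,
        size_t channels, const int32_t *zero_points, int per_channel) {
    assert(q != NULL && out != NULL && zero_points != NULL && channels > 0);
    for (size_t p = 0; p < n_pixels; ++p) {
        for (size_t c = 0; c < channels; ++c) {
            /* per_channel != 0 models a mask with the channel bit set;
             * per_channel == 0 models mask == 0 (one common zero point). */
            int32_t zp = per_channel ? zero_points[c] : zero_points[0];
            out[p * channels + c] = (int32_t)q[p * channels + c] - zp;
        }
    }
}
```

In the per-channel case the zero_points array must hold one value per channel, which is the count required by the equality above for a mask that selects only the channel dimension.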

◆ dnnl_primitive_attr_get_post_ops()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_post_ops ( const_dnnl_primitive_attr_t  attr,
const_dnnl_post_ops_t *  post_ops 
)

Returns primitive attributes post-ops.

Warning
The output post_ops points to the internal attr field, so it is an error to modify or destroy it. The lifetime of post_ops is the same as that of the attr it belongs to, so it is an error to use post_ops after attr has been destroyed.
Parameters
attr  Primitive attributes.
post_ops  Output post-ops.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_post_ops()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_post_ops ( dnnl_primitive_attr_t  attr,
const_dnnl_post_ops_t  post_ops 
)

Sets primitive attributes post-ops.

Note
There is no way to check whether the post-ops would be supported by the target primitive. Any error will be reported by the dnnl_primitive_desc_create() function call.
Parameters
attr  Primitive attributes.
post_ops  Post-ops to set.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_create()

dnnl_status_t DNNL_API dnnl_post_ops_create ( dnnl_post_ops_t *  post_ops)

Creates an empty post-ops sequence.

Parameters
post_ops  Output post-ops.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_destroy()

dnnl_status_t DNNL_API dnnl_post_ops_destroy ( dnnl_post_ops_t  post_ops)

Destroys post-ops.

Parameters
post_ops  Post-ops to destroy.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_len()

int DNNL_API dnnl_post_ops_len ( const_dnnl_post_ops_t  post_ops)

Returns the length of post-ops.

Parameters
post_ops  Post-ops.
Returns
The number of post-ops entries.

◆ dnnl_post_ops_get_kind()

dnnl_primitive_kind_t DNNL_API dnnl_post_ops_get_kind ( const_dnnl_post_ops_t  post_ops,
int  index 
)

Returns the kind of a post-op entry.

Parameters
post_ops  Post-ops.
index  Post-op entry index.
Returns
The kind of the post-op with the specified index.
dnnl_undefined_primitive if there is no post-op at the specified index.

◆ dnnl_post_ops_append_sum()

dnnl_status_t DNNL_API dnnl_post_ops_append_sum ( dnnl_post_ops_t  post_ops,
float  scale 
)

Appends an accumulation (sum) to post-ops.

Prior to accumulating the result, the previous value is multiplied by a scale.

The kind of this post-op is dnnl_sum.

This feature may improve performance for cases like residual learning blocks, where the result of convolution is accumulated to the previously computed activations. The parameter scale may be used for the integer-based computations when the result and previous activations have different logical scaling factors.

In the simplest case when the accumulation is the only post-op, the computations would be:

dst[:] <- scale * dst[:] + op(...) // instead of dst[:] <- op(...)
Note
This post-op executes in-place and does not change the destination layout.
Parameters
post_ops  Post-ops.
scale  Accumulation scaling factor.
Returns
dnnl_success on success and a status describing the error otherwise.
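
The accumulation formula above can be modeled in plain C as follows. This is a sketch of the arithmetic only, not a oneDNN call; the function name apply_sum_post_op is hypothetical, and op_result stands for the values that op(...) would produce.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model of dst[:] <- scale * dst[:] + op(...).
 * op_result holds the values produced by the primitive itself. */
void apply_sum_post_op(float *dst, const float *op_result, size_t n,
        float scale) {
    assert(dst != NULL && op_result != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* the previous destination value is scaled, then accumulated */
        dst[i] = scale * dst[i] + op_result[i];
    }
}
```

With scale equal to 1 this is a plain residual addition; other values rescale the previously computed activations before the new result is added.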

◆ dnnl_post_ops_append_sum_v2()

dnnl_status_t DNNL_API dnnl_post_ops_append_sum_v2 ( dnnl_post_ops_t  post_ops,
float  scale,
dnnl_data_type_t  data_type 
)

Appends an accumulation v2 (sum) to post-ops.

Prior to accumulating the result, the previous value is multiplied by a scale.

The kind of this post-op is dnnl_sum.

This feature may improve performance for cases like residual learning blocks, where the result of convolution is accumulated to the previously computed activations. The parameter scale may be used for the integer-based computations when the result and previous activations have different logical scaling factors.

In the simplest case when the accumulation is the only post-op, the computations would be:

dst[:] <- scale * dst[:] + op(...) // instead of dst[:] <- op(...)

If data_type is specified, the original dst tensor is reinterpreted as a tensor with the provided data type. Because this is a reinterpretation, data_type and the dst data type must have the same size. As a result, the computations would be:

dst[:] <- scale * as_data_type(dst[:]) + op(...)
                                   // instead of dst[:] <- op(...)
Note
This post-op executes in-place and does not change the destination layout.
Parameters
post_ops  Post-ops.
scale  Accumulation scaling factor.
data_type  Accumulation data_type.
Returns
dnnl_success on success and a status describing the error otherwise.
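
The reinterpretation step can be illustrated in plain C for one same-size pair of types. The sketch below is an assumption-laden model, not the library's implementation: the function name sum_post_op_as_s8 is hypothetical, the destination bytes are assumed to be stored as unsigned 8-bit values and accumulated as signed 8-bit values, and the result is written to a separate float array purely to keep the arithmetic visible (the actual post-op executes in-place).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain-C model of dst[:] <- scale * as_data_type(dst[:]) + op(...),
 * with bytes stored as u8 and reinterpreted as s8 (both 1 byte wide). */
void sum_post_op_as_s8(const uint8_t *dst_bytes, const float *op_result,
        float *out, size_t n, float scale) {
    assert(sizeof(int8_t) == sizeof(uint8_t)); /* same-size requirement */
    for (size_t i = 0; i < n; ++i) {
        int8_t as_s8;
        /* reinterpret the stored bits; no value conversion takes place */
        memcpy(&as_s8, &dst_bytes[i], sizeof as_s8);
        out[i] = scale * (float)as_s8 + op_result[i];
    }
}
```

A stored byte 0xFF reads as 255 when treated as unsigned but as -1 after reinterpretation, so the accumulated value differs depending on the data type chosen.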

◆ dnnl_post_ops_get_params_sum()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_sum ( const_dnnl_post_ops_t  post_ops,
int  index,
float *  scale 
)

Returns the parameters of an accumulation (sum) post-op.

Parameters
post_ops  Post-ops.
index  Index of the sum post-op.
scale  Output accumulation scaling factor.
Returns
dnnl_success on success and a status describing the error otherwise.
dnnl_invalid_arguments if index does not refer to a sum post-op.

◆ dnnl_post_ops_get_params_sum_v2()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_sum_v2 ( const_dnnl_post_ops_t  post_ops,
int  index,
float *  scale,
dnnl_data_type_t *  data_type 
)

Returns the parameters of an accumulation (sum) post-op with a data type parameter.

Parameters
post_ops  Post-ops.
index  Index of the sum post-op.
scale  Output accumulation scaling factor.
data_type  Output data type for accumulation.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_append_eltwise()

dnnl_status_t DNNL_API dnnl_post_ops_append_eltwise ( dnnl_post_ops_t  post_ops,
float  scale,
dnnl_alg_kind_t  alg_kind,
float  alpha,
float  beta 
)

Appends an elementwise post-op.

The kind of this post operation is dnnl_eltwise.

In the simplest case when the elementwise is the only post operation, the computations would be:

dst[:] <- scale * eltwise_op (op(...)) // instead of dst[:] <- op(...)

where eltwise_op is configured with the given parameters.

Parameters
post_opsPost-ops.
scaleScaling factor.
alg_kindElementwise algorithm for the post-op.
alphaAlpha parameter for the elementwise algorithm.
betaBeta parameter for the elementwise algorithm.
Returns
dnnl_success on success and a status describing the error otherwise.
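
As a rough illustration of the formula above, the following plain C sketch applies a scaled elementwise operation to a buffer that stands in for op(...). It assumes the ReLU algorithm with alpha used as the negative slope; the function names are hypothetical and the library is not called.

```c
#include <assert.h>
#include <stddef.h>

/* ReLU with a negative slope: x if x > 0, alpha * x otherwise. */
float relu_with_slope(float x, float alpha) {
    return x > 0.0f ? x : alpha * x;
}

/* Hypothetical reference helper (not part of the library):
 *     dst[i] = scale * eltwise_op(op_result[i])
 * beta is not used by this particular algorithm, so it is omitted. */
void eltwise_relu_post_op(float *dst, const float *op_result, size_t n,
        float scale, float alpha) {
    assert(dst != NULL && op_result != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = scale * relu_with_slope(op_result[i], alpha);
}
```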

◆ dnnl_post_ops_get_params_eltwise()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_eltwise ( const_dnnl_post_ops_t  post_ops,
int  index,
float *  scale,
dnnl_alg_kind_t *  alg_kind,
float *  alpha,
float *  beta 
)

Returns the parameters of an elementwise post-op.

Parameters
post_opsPost-ops.
indexIndex of the elementwise post-op.
scaleOutput scaling factor.
alg_kindOutput elementwise algorithm kind.
alphaOutput alpha parameter for the elementwise algorithm.
betaOutput beta parameter for the elementwise algorithm.
Returns
dnnl_success on success and a status describing the error otherwise.
dnnl_invalid_arguments if index does not refer to an elementwise post-op.

◆ dnnl_post_ops_append_dw_k3s1p1()

dnnl_status_t DNNL_API dnnl_post_ops_append_dw_k3s1p1 ( dnnl_post_ops_t  post_ops,
dnnl_data_type_t  weights_data_type,
dnnl_data_type_t  bias_data_type,
dnnl_data_type_t  dst_data_type,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Appends a depthwise post-op convolution with stride 1.

This post-op can only be fused with a 2D 1x1 convolution (convolution with weights spatial dimension equal to 1 i.e., kh=kw=1).

The kind of this post-op is dnnl_convolution.

The number of outputs of the primitive remains the same as before. The output size remains the same as that of the original primitive because stride=1.

The post-op can be defined as:

 dst[:] <- scales * (conv_dw(conv_1x1))

See dev_guide_attributes_post_ops_depthwise and dev_guide_attributes_post_ops_depthwise_fusion for more info.

Parameters
post_opsPost-ops.
weights_data_typeWeights data type of depthwise post-op
bias_data_typeBias data type of depthwise post-op
dst_data_typeOutput data type of depthwise post-op
countLength of the array of scaling factors scales.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common scaling factor for the whole output tensor.
scalesPointer to a constant array of float scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.
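
To make the conv_dw step concrete, here is a plain C reference sketch of a 3x3 depthwise convolution with stride 1 and padding 1 over a single channel. The input buffer stands in for the output of the 1x1 convolution. The function name, the dense row-major layout, and the single common scale are assumptions for illustration; the library is not called.

```c
#include <assert.h>

/* Hypothetical reference helper (not part of the library).
 * src and dst are h x w, row-major, for ONE channel; wei is 3 x 3.
 *     dst = scale * conv_dw(src), kernel 3, stride 1, padding 1      */
void dw_k3s1p1_channel(const float *src, const float *wei, float *dst,
        int h, int w, float scale) {
    assert(h > 0 && w > 0);
    for (int oy = 0; oy < h; ++oy) {
        for (int ox = 0; ox < w; ++ox) {
            float acc = 0.0f;
            for (int ky = 0; ky < 3; ++ky) {
                for (int kx = 0; kx < 3; ++kx) {
                    int iy = oy + ky - 1; /* shift by the padding of 1 */
                    int ix = ox + kx - 1;
                    /* Positions outside the input act as zero padding. */
                    if (iy < 0 || iy >= h || ix < 0 || ix >= w) continue;
                    acc += src[iy * w + ix] * wei[ky * 3 + kx];
                }
            }
            dst[oy * w + ox] = scale * acc;
        }
    }
}
```

Because the stride is 1 and the padding is 1, the loop writes exactly h x w outputs, matching the statement that the output size does not change.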

◆ dnnl_post_ops_get_params_dw_k3s1p1()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_dw_k3s1p1 ( const_dnnl_post_ops_t  post_ops,
int  index,
dnnl_data_type_t *  weights_data_type,
dnnl_data_type_t *  bias_data_type,
dnnl_data_type_t *  dst_data_type,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns the parameters of a depthwise post-op with stride 1.

Parameters
post_opsPost-ops.
indexIndex of the depthwise post-op.
weights_data_typeWeights data type of depthwise post-op
bias_data_typeBias data type of depthwise post-op
dst_data_typeOutput data type of depthwise post-op
countOutput length of the array of scaling factors scales.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common scaling factor for the whole output tensor.
scalesOutput pointer to a constant array of float scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_append_dw_k3s2p1()

dnnl_status_t DNNL_API dnnl_post_ops_append_dw_k3s2p1 ( dnnl_post_ops_t  post_ops,
dnnl_data_type_t  weights_data_type,
dnnl_data_type_t  bias_data_type,
dnnl_data_type_t  dst_data_type,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Appends a depthwise post-op convolution with stride 2.

This post-op can only be fused with a 2D 1x1 convolution (convolution with weights spatial dimension equal to 1 i.e., kh=kw=1).

The kind of this post-op is dnnl_convolution.

The number of outputs of the primitive remains the same as before. The output spatial size can be derived as follows:

output_height = ceil(output_height_1x1_convolution / stride)
output_width = ceil(output_width_1x1_convolution / stride)

The post-op can be defined as:

 dst[:] <- scales * (conv_dw(conv_1x1))

See dev_guide_attributes_post_ops_depthwise and dev_guide_attributes_post_ops_depthwise_fusion for more info.

Parameters
post_opsPost-ops.
weights_data_typeWeights data type of depthwise post-op
bias_data_typeBias data type of depthwise post-op
dst_data_typeOutput data type of depthwise post-op
countLength of the array of scaling factors scales.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common scaling factor for the whole output tensor.
scalesPointer to a constant array of float scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.
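
The output size rule for the stride-2 case can be checked with a few lines of integer arithmetic. The sketch below is plain C with hypothetical helper names; it compares the ceiling form against the usual convolution output-size formula for kernel 3, stride 2, padding 1.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(in / 2) computed with integer arithmetic. */
int64_t dw_k3s2p1_out_dim(int64_t in) {
    assert(in > 0);
    return (in + 1) / 2;
}

/* General convolution output size: floor((in + 2*pad - kernel) / stride) + 1.
 * Assumes in + 2*pad >= kernel so the division operates on a non-negative value. */
int64_t conv_out_dim(int64_t in, int64_t kernel, int64_t stride, int64_t pad) {
    assert(in > 0 && stride > 0 && in + 2 * pad >= kernel);
    return (in + 2 * pad - kernel) / stride + 1;
}
```

For example, a 7x7 output of the 1x1 convolution becomes 4x4 after the fused depthwise step, and an 8x8 output also becomes 4x4.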

◆ dnnl_post_ops_get_params_dw_k3s2p1()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_dw_k3s2p1 ( const_dnnl_post_ops_t  post_ops,
int  index,
dnnl_data_type_t *  weights_data_type,
dnnl_data_type_t *  bias_data_type,
dnnl_data_type_t *  dst_data_type,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns the parameters of a depthwise post-op with stride 2.

Parameters
post_opsPost-ops.
indexIndex of the depthwise post-op.
weights_data_typeWeights data type of depthwise post-op
bias_data_typeBias data type of depthwise post-op
dst_data_typeOutput data type of depthwise post-op
countOutput length of the array of scaling factors scales.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales array. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common scaling factor for the whole output tensor.
scalesOutput pointer to a constant array of float scaling factors.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_post_ops_append_binary()

dnnl_status_t DNNL_API dnnl_post_ops_append_binary ( dnnl_post_ops_t  post_ops,
dnnl_alg_kind_t  alg_kind,
const dnnl_memory_desc_t *  src1_desc 
)

Appends a binary post-op.

The kind of this post operation is dnnl_binary.

In the simplest case when the binary is the only post operation, the computations would be:

dst[:] <- binary_op (dst[:], another_input[:])

where binary_op is configured with the given parameters. binary_op supports broadcast semantics for a second operand.

Parameters
post_opsPost-ops.
alg_kindBinary algorithm for the post-op.
src1_descMemory descriptor of a second operand.
Returns
dnnl_success on success and a status describing the error otherwise.
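
The broadcast behavior of the second operand can be sketched in plain C as follows. The example assumes an addition algorithm and a 2D destination where the second operand is either the same shape or a single row that is reused for every destination row. The function name and layout are hypothetical and the library is not called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not part of the library).
 * dst is rows x cols, row-major. src1 is either rows x cols
 * (src1_rows == rows) or 1 x cols, broadcast over rows (src1_rows == 1).
 *     dst[r][c] = dst[r][c] + src1[r or 0][c]                           */
void binary_add_post_op(float *dst, const float *src1,
        size_t rows, size_t cols, size_t src1_rows) {
    assert(src1_rows == rows || src1_rows == 1);
    for (size_t r = 0; r < rows; ++r) {
        size_t sr = (src1_rows == 1) ? 0 : r; /* broadcast along rows */
        for (size_t c = 0; c < cols; ++c)
            dst[r * cols + c] += src1[sr * cols + c];
    }
}
```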

◆ dnnl_post_ops_get_params_binary()

dnnl_status_t DNNL_API dnnl_post_ops_get_params_binary ( const_dnnl_post_ops_t  post_ops,
int  index,
dnnl_alg_kind_t *  alg_kind,
const dnnl_memory_desc_t **  src1_desc 
)

Returns the parameters of a binary post-op.

Parameters
post_opsPost-ops.
indexIndex of the binary post-op.
alg_kindOutput binary algorithm kind.
src1_descOutput memory descriptor of a second operand.
Returns
dnnl_success on success and a status describing the error otherwise.
dnnl_invalid_arguments if index does not refer to a binary post-op.

◆ dnnl_primitive_attr_set_rnn_data_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_data_qparams ( dnnl_primitive_attr_t  attr,
const float  scale,
const float  shift 
)

Sets quantization scale and shift parameters for RNN data tensors.

For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.

The quantization formula is scale * data + shift.

Note
Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.

Example usage:

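// RNN parameters
int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
// Activations quantization parameters
float scale = 63.f, shift = 64.f;
// Create default attributes
dnnl_primitive_attr_t rnn_attr;
dnnl_primitive_attr_create(&rnn_attr);
// Set scale and shift for int8 quantization of activation
dnnl_primitive_attr_set_rnn_data_qparams(rnn_attr, scale, shift);
// Create and configure rnn op_desc (rnn_d setup and engine creation are omitted here)
dnnl_rnn_desc_t rnn_d;
dnnl_primitive_desc_t rnn_pd;
dnnl_primitive_desc_create(&rnn_pd, &rnn_d, rnn_attr, engine, NULL);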
Parameters
attrPrimitive attributes.
scaleThe value to scale the data by.
shiftThe value to shift the data by.
Returns
dnnl_success on success and a status describing the error otherwise.
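
The quantization formula scale * data + shift can be tried out with a small plain C sketch. The helper names are hypothetical, the library is not called, and the rounding mode (round half up after saturating to the u8 range) is an assumption of this example rather than a statement about what the library does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper (not part of the library).
 * Quantizes a float activation to unsigned 8-bit: q = scale * data + shift. */
uint8_t rnn_quantize_u8(float data, float scale, float shift) {
    float q = scale * data + shift;
    if (q < 0.0f) q = 0.0f;     /* saturate to the u8 range */
    if (q > 255.0f) q = 255.0f;
    return (uint8_t)(q + 0.5f); /* round half up (assumed rounding mode) */
}

/* Inverse mapping: data = (q - shift) / scale. */
float rnn_dequantize_u8(uint8_t q, float scale, float shift) {
    assert(scale != 0.0f);
    return ((float)q - shift) / scale;
}
```

With scale = 63 and shift = 64, as in the example usage, activations in [-1, 1] map to the range [1, 127].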

◆ dnnl_primitive_attr_get_rnn_data_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_data_qparams ( const_dnnl_primitive_attr_t  attr,
float *  scale,
float *  shift 
)

Returns the quantization scale and shift parameters for RNN data tensors.

Note
Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.
Parameters
attrPrimitive attributes.
scaleOutput value that the data is scaled by.
shiftOutput value that the data is shifted by.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_rnn_weights_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_weights_qparams ( dnnl_primitive_attr_t  attr,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Sets quantization scaling factors for RNN weights tensors.

The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.

Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Quantization scales are common for weights_layer and weights_iteration.
Parameters
attrPrimitive attributes.
countNumber of elements in the scales array.
maskScaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scalesArray of output scaling factors that must contain count values and the following equality must hold:

\[count = \prod\limits_{d \in mask} weights.dims[d].\]

Violations can only be detected when the attributes are used to create a primitive descriptor.
Returns
dnnl_success on success and a status describing the error otherwise.
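
The relationship between mask and count can be sketched in plain C. The helper below multiplies the sizes of every dimension whose bit is set in the mask; its name and the sample dimension values are hypothetical, and the library is not called.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper (not part of the library).
 * Returns the product of dims[d] over every d whose bit is set in mask.
 * A mask of 0 yields 1, that is, a single common scaling factor. */
int64_t scales_count_for_mask(const int64_t *dims, int ndims, int mask) {
    assert(dims != 0 && ndims > 0 && ndims < 31);
    int64_t count = 1;
    for (int d = 0; d < ndims; ++d) {
        if (mask & (1 << d)) count *= dims[d];
    }
    return count;
}
```

For weights with logical dimensions (l, d, i, g, o) = (2, 1, 32, 4, 32), a mask with bits 3 and 4 set selects the g and o dimensions, so the scales array would need 4 * 32 = 128 values.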

◆ dnnl_primitive_attr_get_rnn_weights_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_weights_qparams ( const_dnnl_primitive_attr_t  attr,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns the quantization scaling factors for RNN weights tensors.

Parameters
attrPrimitive attributes.
countOutput number of elements in the scales array.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor.
scalesOutput pointer to the array of scaling factors, which contains count values. The following equality holds:

\[count = \prod\limits_{d \in mask} weights.dims[d].\]

Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_set_rnn_weights_projection_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_set_rnn_weights_projection_qparams ( dnnl_primitive_attr_t  attr,
dnnl_dim_t  count,
int  mask,
const float *  scales 
)

Sets quantization scaling factors for RNN projection weights tensors.

The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.

Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Parameters
attrPrimitive attributes.
countNumber of elements in the scales array.
maskScaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scalesArray of output scaling factors that must contain count values and the following equality must hold:

\[count = \prod\limits_{d \in mask} weights.dims[d].\]

Violations can only be detected when the attributes are used to create a primitive descriptor.
Returns
dnnl_success on success and a status describing the error otherwise.

◆ dnnl_primitive_attr_get_rnn_weights_projection_qparams()

dnnl_status_t DNNL_API dnnl_primitive_attr_get_rnn_weights_projection_qparams ( const_dnnl_primitive_attr_t  attr,
dnnl_dim_t *  count,
int *  mask,
const float **  scales 
)

Returns the quantization scaling factors for RNN projection weights tensors.

Parameters
attrPrimitive attributes.
countOutput number of elements in the scales array.
maskOutput scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor.
scalesOutput pointer to the array of scaling factors, which contains count values. The following equality holds:

\[count = \prod\limits_{d \in mask} weights.dims[d].\]

Returns
dnnl_success on success and a status describing the error otherwise.

◆ convert_to_c() [1/3]

dnnl_scratchpad_mode_t dnnl::convert_to_c ( scratchpad_mode  mode)
inline

Converts a scratchpad mode enum value from C++ API to C API type.

Parameters
modeC++ API scratchpad mode enum value.
Returns
Corresponding C API scratchpad mode enum value.

◆ convert_to_c() [2/3]

dnnl_prop_kind_t dnnl::convert_to_c ( prop_kind  akind)
inline

Converts propagation kind enum value from C++ API to C API type.

Parameters
akindC++ API propagation kind enum value.
Returns
Corresponding C API propagation kind enum value.

◆ convert_to_c() [3/3]

dnnl_alg_kind_t dnnl::convert_to_c ( algorithm  aalgorithm)
inline

Converts algorithm kind enum value from C++ API to C API type.

Parameters
aalgorithmC++ API algorithm kind enum value.
Returns
Corresponding C API algorithm kind enum value.