Primitive attributes. More...
#include <dnnl.hpp>
Public Member Functions | |
primitive_attr () | |
Constructs default (empty) primitive attributes. | |
primitive_attr (dnnl_primitive_attr_t attr) | |
Creates primitive attributes from a C API dnnl_primitive_attr_t handle. More... | |
scratchpad_mode | get_scratchpad_mode () const |
Returns the scratchpad mode. | |
void | set_scratchpad_mode (scratchpad_mode mode) |
Sets scratchpad mode. More... | |
void | get_output_scales (int &mask, std::vector< float > &scales) const |
Returns output scaling factors correspondence mask and values. More... | |
void | set_output_scales (int mask, const std::vector< float > &scales) |
Sets output scaling factors correspondence mask and values. More... | |
void | get_scales (int arg, int &mask, std::vector< float > &scales) const |
Returns scaling factors correspondence mask and values for a given memory argument. More... | |
void | set_scales (int arg, int mask, const std::vector< float > &scales) |
Sets scaling factors for primitive operations for a given memory argument. More... | |
void | get_zero_points (int arg, int &mask, std::vector< int32_t > &zero_points) const |
Returns zero points correspondence mask and values. More... | |
void | set_zero_points (int arg, int mask, const std::vector< int32_t > &zero_points) |
Sets zero points for primitive operations for a given memory argument. More... | |
const post_ops | get_post_ops () const |
Returns post-ops previously set via set_post_ops(). More... | |
void | set_post_ops (const post_ops ops) |
Sets post-ops. More... | |
void | set_rnn_data_qparams (float scale, float shift) |
Sets quantization scale and shift parameters for RNN data tensors. More... | |
void | get_rnn_data_qparams (float &scale, float &shift) |
Returns the quantization scale and shift parameters for RNN data tensors. More... | |
void | set_rnn_weights_qparams (int mask, const std::vector< float > &scales) |
Sets quantization scaling factors for RNN weights tensors. More... | |
void | get_rnn_weights_qparams (int &mask, std::vector< float > &scales) |
Returns the quantization scaling factors for RNN weights tensors. More... | |
void | set_rnn_weights_projection_qparams (int mask, const std::vector< float > &scales) |
Sets quantization scaling factors for RNN projection weights tensors. More... | |
void | get_rnn_weights_projection_qparams (int &mask, std::vector< float > &scales) |
Returns the quantization scaling factors for RNN projection weights tensors. More... | |
Public Member Functions inherited from dnnl::handle< dnnl_primitive_attr_t > | |
bool | operator== (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &other) const |
Equality operator. More... | |
bool | operator!= (const handle &other) const |
Inequality operator. More... | |
handle ()=default | |
Constructs an empty handle object. More... | |
handle (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &)=default | |
Copy constructor. | |
handle (handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &&)=default | |
Move constructor. | |
handle (dnnl_primitive_attr_t t, bool weak=false) | |
Constructs a handle wrapper object from a C API handle. More... | |
handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > & | operator= (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &)=default |
Assignment operator. | |
handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > & | operator= (handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &&)=default |
Move assignment operator. | |
void | reset (dnnl_primitive_attr_t t, bool weak=false) |
Resets the handle wrapper object to wrap a new C API handle. More... | |
dnnl_primitive_attr_t | get (bool allow_empty=false) const |
Returns the underlying C API handle. More... | |
operator dnnl_primitive_attr_t () const | |
Converts a handle to the underlying C API handle type. More... | |
operator bool () const | |
Checks whether the object is not empty. More... | |
Primitive attributes.
primitive_attr (dnnl_primitive_attr_t attr) | inline |
Creates primitive attributes from a C API dnnl_primitive_attr_t handle.
The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.
attr | The C API primitive attributes. |
void | set_scratchpad_mode (scratchpad_mode mode) | inline |
Sets scratchpad mode.
mode | Specified scratchpad mode. |
void | get_output_scales (int &mask, std::vector< float > &scales) const | inline |
Returns output scaling factors correspondence mask and values.
mask | Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common output scaling factor for the whole output tensor. |
scales | Vector of output scaling factors. |
void | set_output_scales (int mask, const std::vector< float > &scales) | inline |
Sets output scaling factors correspondence mask and values.
Example usage:
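A minimal sketch of the setup, assuming a convolution output of shape {mb, oc, oh, ow} with one scaling factor per output channel. The helpers `scales_count` and `make_per_channel_scales`, the shape, and the placeholder scale value are illustrative and not part of the library. The actual `set_output_scales()` call is shown as a comment so that the sketch compiles without oneDNN.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the library): number of scaling factors
// implied by a correspondence mask, i.e. the product of dims[d] over every
// dimension d whose bit is set in the mask. A mask of 0 yields 1.
inline std::size_t scales_count(int mask, const std::vector<std::int64_t> &dims) {
    std::size_t count = 1;
    for (std::size_t d = 0; d < dims.size(); ++d)
        if (mask & (1 << d)) count *= static_cast<std::size_t>(dims[d]);
    return count;
}

// Builds one scale per output channel for an assumed {mb, oc, oh, ow} output
// and reports the mask that goes with it.
inline std::vector<float> make_per_channel_scales(
        const std::vector<std::int64_t> &out_dims, int &mask) {
    const int oc_dim = 1; // mb_dim = 0, channel_dim = 1, height_dim = 2, ...
    mask = 1 << oc_dim;
    // Placeholder value 0.5f; real scales would come from calibration.
    std::vector<float> scales(scales_count(mask, out_dims), 0.5f);

    // With oneDNN available, the attributes would then be configured as:
    //   dnnl::primitive_attr attr;
    //   attr.set_output_scales(mask, scales);
    return scales;
}
```

For an output of shape {32, 16, 14, 14}, the sketch produces a mask of 2 and a vector of 16 scales, which satisfies the size equality required of the scales parameter.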
mask | Defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common output scaling factor for the whole output tensor. |
scales | Constant vector of output scaling factors. If the scaling factors are known at the time of this call, the following equality must hold: \(scales.size() = \prod\limits_{d \in mask} output.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. If the scaling factors are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_F32_VAL value and the output scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_OUTPUT_SCALES. |
void | get_scales (int arg, int &mask, std::vector< float > &scales) const | inline |
Returns scaling factors correspondence mask and values for a given memory argument.
arg | Parameter argument index as passed to the primitive::execute() call. |
mask | Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole tensor. |
scales | Output vector of scaling factors. |
void | set_scales (int arg, int mask, const std::vector< float > &scales) | inline |
Sets scaling factors for primitive operations for a given memory argument.
arg | Parameter argument index as passed to the primitive::execute() call. |
mask | Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole tensor. |
scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} argument.dims[d].\) |
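To illustrate how a mask ties tensor elements to entries of the scales vector, the sketch below maps an element's coordinates to a scale index. It assumes that the scales are stored in row-major order over the masked dimensions, which the description above does not state. `scale_index` is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: index into the scales vector for the element at `pos`.
// Assumption: scales are laid out row-major over the masked dimensions only;
// dimensions whose bit is not set in the mask do not affect the index.
inline std::size_t scale_index(int mask, const std::vector<std::int64_t> &dims,
        const std::vector<std::int64_t> &pos) {
    std::size_t idx = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if (mask & (1 << d)) {
            idx = idx * static_cast<std::size_t>(dims[d])
                    + static_cast<std::size_t>(pos[d]);
        }
    }
    return idx;
}
```

With a mask of 0 every element maps to index 0, the single common scaling factor. With `mask = 1 << 1` only the coordinate along dimension 1 selects the scale.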
void | get_zero_points (int arg, int &mask, std::vector< int32_t > &zero_points) const | inline |
Returns zero points correspondence mask and values.
arg | Parameter argument index as passed to the primitive::execute() call. |
mask | Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. A mask value of 0 implies a common zero point for the whole tensor. |
zero_points | Output vector of zero points. |
void | set_zero_points (int arg, int mask, const std::vector< int32_t > &zero_points) | inline |
Sets zero points for primitive operations for a given memory argument.
arg | Parameter argument index as passed to the primitive::execute() call. |
mask | Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole tensor. |
zero_points | Constant vector of zero points. If the zero points are known at the time of this call, the following equality must hold: \(zero\_points.size() = \prod\limits_{d \in mask} argument.dims[d].\) If the zero points are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_S32_VAL value and the zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS. |
const post_ops | get_post_ops () const | inline |
Returns post-ops previously set via set_post_ops().
void | set_post_ops (const post_ops ops) | inline |
Sets post-ops.
ops | Post-ops object to copy post-ops from. |
void | set_rnn_data_qparams (float scale, float shift) | inline |
Sets quantization scale and shift parameters for RNN data tensors.
For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.
The quantization formula is scale * data + shift.
Example usage:
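A minimal sketch of the quantization formula, applied to a single activation value. Rounding to nearest and saturating to the range [0, 255] are assumptions of this sketch, as are the helper name `quantize_u8` and the values `scale = 63` and `shift = 64` used in the note below. The attribute call itself is shown as a comment so that the sketch compiles without oneDNN.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantizes one floating-point activation to unsigned 8-bit using
// scale * data + shift. Rounding to nearest and saturating to [0, 255]
// are assumptions of this sketch, not statements about the library.
inline std::uint8_t quantize_u8(float data, float scale, float shift) {
    const float q = std::round(scale * data + shift);
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}

// With oneDNN available, the same parameters would be passed as:
//   dnnl::primitive_attr attr;
//   attr.set_rnn_data_qparams(scale, shift);
```

With `scale = 63` and `shift = 64`, activations in [-1, 1] map to the range [1, 127]: `quantize_u8(-1.0f, 63.0f, 64.0f)` gives 1 and `quantize_u8(1.0f, 63.0f, 64.0f)` gives 127.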
scale | The value to scale the data by. |
shift | The value to shift the data by. |
void | get_rnn_data_qparams (float &scale, float &shift) | inline |
Returns the quantization scale and shift parameters for RNN data tensors.
scale | The value to scale the data by. |
shift | The value to shift the data by. |
void | set_rnn_weights_qparams (int mask, const std::vector< float > &scales) | inline |
Sets quantization scaling factors for RNN weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole weights tensor. |
scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
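As an illustration of per-dimension scaling factors for weights, the sketch below quantizes a small row-major matrix to signed 8-bit with one scale per column. The two-dimensional layout, the multiplication of each weight by its scale, rounding to nearest, and saturation to [-128, 127] are all assumptions made for the example. `quantize_weights_s8` is a hypothetical helper, not a library function.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantizes a row-major [rows x cols] weights matrix to signed 8-bit with one
// scale per column, as a mask selecting only the column dimension would imply.
// Expects scales.size() == cols, mirroring the size equality above.
// Rounding to nearest and saturating to [-128, 127] are assumptions.
inline std::vector<std::int8_t> quantize_weights_s8(const std::vector<float> &w,
        std::size_t cols, const std::vector<float> &scales) {
    std::vector<std::int8_t> q(w.size());
    for (std::size_t i = 0; i < w.size(); ++i) {
        const float v = std::round(w[i] * scales[i % cols]);
        q[i] = static_cast<std::int8_t>(std::min(127.0f, std::max(-128.0f, v)));
    }
    return q;
}
```

For the 2 x 2 matrix {0.5, -1.0, 2.0, 0.25} with column scales {10, 100}, the result is {5, -100, 20, 25}.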
void | get_rnn_weights_qparams (int &mask, std::vector< float > &scales) | inline |
Returns the quantization scaling factors for RNN weights tensors.
mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole weights tensor. |
scales | Output vector of scaling factors. |
void | set_rnn_weights_projection_qparams (int mask, const std::vector< float > &scales) | inline |
Sets quantization scaling factors for RNN projection weights tensors.
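The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.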
mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole weights tensor. |
scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
void | get_rnn_weights_projection_qparams (int &mask, std::vector< float > &scales) | inline |
Returns the quantization scaling factors for RNN projection weights tensors.
mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole weights tensor. |
scales | Output vector of scaling factors. |