struct dnnl::primitive_attr
Overview
Primitive attributes.
#include <dnnl.hpp>

struct primitive_attr: public dnnl::handle
{
    // methods

    primitive_attr();
    primitive_attr(dnnl_primitive_attr_t attr);
    fpmath_mode get_fpmath_mode() const;
    void set_fpmath_mode(fpmath_mode mode);
    scratchpad_mode get_scratchpad_mode() const;
    void set_scratchpad_mode(scratchpad_mode mode);
    void get_output_scales(int& mask, std::vector<float>& scales) const;
    void set_output_scales(int mask, const std::vector<float>& scales);
    void get_scales(int arg, int& mask, std::vector<float>& scales) const;
    void set_scales(int arg, int mask, const std::vector<float>& scales);
    void get_zero_points(int arg, int& mask, std::vector<int32_t>& zero_points) const;
    void set_zero_points(int arg, int mask, const std::vector<int32_t>& zero_points);
    const post_ops get_post_ops() const;
    void set_post_ops(const post_ops ops);
    void set_rnn_data_qparams(float scale, float shift);
    void get_rnn_data_qparams(float& scale, float& shift);
    void set_rnn_weights_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_qparams(int& mask, std::vector<float>& scales);
    void set_rnn_weights_projection_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales);
};
Inherited Members
public:
    // methods

    handle();
    handle(const handle<T, traits>&);
    handle<T, traits>& operator = (const handle<T, traits>&);
    handle(handle<T, traits>&&);
    handle<T, traits>& operator = (handle<T, traits>&&);
    handle(T t, bool weak = false);
    void reset(T t, bool weak = false);
    T get(bool allow_empty = false) const;
    operator T () const;
    operator bool () const;
    bool operator == (const handle<T, traits>& other) const;
    bool operator != (const handle& other) const;
Detailed Documentation
Primitive attributes.
Methods
primitive_attr()
Constructs default (empty) primitive attributes.
primitive_attr(dnnl_primitive_attr_t attr)
Creates primitive attributes from a C API dnnl_primitive_attr_t handle.
The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.
Parameters:
attr |
The C API primitive attributes. |
fpmath_mode get_fpmath_mode() const
Returns the fpmath mode.
void set_fpmath_mode(fpmath_mode mode)
Sets fpmath mode.
Parameters:
mode |
Specified fpmath mode. |
scratchpad_mode get_scratchpad_mode() const
Returns the scratchpad mode.
void set_scratchpad_mode(scratchpad_mode mode)
Sets scratchpad mode.
Parameters:
mode |
Specified scratchpad mode. |
void get_output_scales(int& mask, std::vector<float>& scales) const
Returns output scaling factors correspondence mask and values.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. |
scales |
Vector of output scaling factors. |
void set_output_scales(int mask, const std::vector<float>& scales)
Sets output scaling factors correspondence mask and values.
Example usage:
    int mb = 32, oc = 32,
        oh = 14, ow = 14; // convolution output params
    // unique output scales per output channel
    std::vector<float> scales = { ... };
    int oc_dim = 1; // mb_dim = 0, channel_dim = 1, height_dim = 2, ...

    // construct a convolution descriptor
    dnnl::convolution_forward::desc conv_d(/* arguments */);

    dnnl::primitive_attr attr;
    // the C++ method takes only the mask and the vector of scales
    attr.set_output_scales(1 << oc_dim, scales);

    dnnl::convolution_forward::primitive_desc conv_pd(conv_d, attr, engine);
Note
The order of dimensions does not depend on how elements are laid out in memory. For example:
for a 2D CNN activations tensor the order is always (n, c)
for a 4D CNN activations tensor the order is always (n, c, h, w)
for a 5D CNN weights tensor the order is always (g, oc, ic, kh, kw)
Parameters:
mask |
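Defines the correspondence between the output tensor dimensions and the scales vector. Bit d set in the mask means that a dedicated scaling factor is used for each index along dimension d; a mask of 0 means that a single scaling factor is shared by the whole output tensor. |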
scales |
Constant vector of output scaling factors. If the scaling factors are known at the time of this call, the following equality must hold: \(scales.size() = \prod\limits_{d \in mask} output.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. If the scaling factors are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_F32_VAL value and the output scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_OUTPUT_SCALES. |
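The size requirement above can be checked before the attributes are passed to a primitive descriptor. The following is a minimal sketch in plain C++, not part of the oneDNN API: expected_scales_count is a hypothetical helper that multiplies the sizes of the dimensions whose bits are set in the mask, and the dimension sizes are assumed to be given in logical order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneDNN function): number of scaling factors
// implied by a correspondence mask for a tensor with the given logical dims.
std::size_t expected_scales_count(const std::vector<std::int64_t>& dims, int mask) {
    assert(mask >= 0 && dims.size() < 31); // keep the bit tests well defined
    std::size_t count = 1; // empty product: mask 0 means one common factor
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if ((mask >> d) & 1) {
            count *= static_cast<std::size_t>(dims[d]);
        }
    }
    return count;
}
```

For the convolution output in the example (dimensions 32, 32, 14, 14 in (n, c, h, w) order), a mask of 1 << 1 gives 32 factors and a mask of 0 gives 1, so a caller could compare scales.size() against this value before calling set_output_scales().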
void get_scales(int arg, int& mask, std::vector<float>& scales) const
Returns scaling factors correspondence mask and values for a given memory argument.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. |
scales |
Output vector of scaling factors. |
void set_scales(int arg, int mask, const std::vector<float>& scales)
Sets scaling factors for primitive operations for a given memory argument.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. |
scales |
Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} argument.dims[d].\) |
See also:
dnnl_primitive_attr_set_scales
dnnl::primitive_attr::set_output_scales
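To make the correspondence between tensor dimensions and the scales vector concrete, here is a small sketch in plain C++. It is not part of the oneDNN API: scale_index is a hypothetical helper, and it assumes that the factors are stored in row-major order over the dimensions selected by the mask, which should be confirmed against the library documentation before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneDNN function): position in the scales vector
// of the factor that applies to the element at logical coordinate `pos`.
// Assumption: factors are laid out row-major over the masked dimensions.
std::size_t scale_index(const std::vector<std::int64_t>& dims,
                        const std::vector<std::int64_t>& pos, int mask) {
    assert(dims.size() == pos.size());
    assert(mask >= 0 && dims.size() < 31);
    std::size_t index = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if ((mask >> d) & 1) {
            assert(pos[d] >= 0 && pos[d] < dims[d]);
            // Dimensions outside the mask do not affect which factor is used.
            index = index * static_cast<std::size_t>(dims[d])
                    + static_cast<std::size_t>(pos[d]);
        }
    }
    return index;
}
```

With a per-channel mask (1 << 1) the index is simply the channel coordinate, and with a mask of 0 every element maps to index 0.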
void get_zero_points(int arg, int& mask, std::vector<int32_t>& zero_points) const
Returns zero points correspondence mask and values.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. |
zero_points |
Output vector of zero points. |
void set_zero_points(int arg, int mask, const std::vector<int32_t>& zero_points)
Sets zero points for primitive operations for a given memory argument.
Parameters:
arg |
Parameter argument index as passed to the primitive::execute() call. |
mask |
Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. |
zero_points |
Constant vector of zero points. If the zero points are known at the time of this call, the following equality must hold: \(zero\_points.size() = \prod\limits_{d \in mask} argument.dims[d].\) If the zero points are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_S32_VAL value and the zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS. |
See also:
dnnl_primitive_attr_set_zero_points
dnnl::primitive_attr::set_output_scales
const post_ops get_post_ops() const
Returns post-ops previously set via set_post_ops().
Returns:
Post-ops.
void set_post_ops(const post_ops ops)
Sets post-ops.
Note
There is no way to check whether the post-ops would be supported by the target primitive. Any error will be reported by the respective primitive descriptor constructor.
Parameters:
ops |
Post-ops object to copy post-ops from. |
void set_rnn_data_qparams(float scale, float shift)
Sets quantization scale and shift parameters for RNN data tensors.
For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.
The quantization formula is scale * data + shift.
Example usage:
    // RNN parameters
    int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
    // Activations quantization parameters
    float scale = 63.f, shift = 64.f;

    primitive_attr attr;

    // Set scale and shift for int8 quantization of activation
    attr.set_rnn_data_qparams(scale, shift);

    // Create and configure rnn op_desc
    vanilla_rnn_forward::desc rnn_d(/* arguments */);
    // the primitive descriptor needs its own name, distinct from rnn_d
    vanilla_rnn_forward::primitive_desc rnn_pd(rnn_d, attr, engine);
Note
Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.
Parameters:
scale |
The value to scale the data by. |
shift |
The value to shift the data by. |
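As an illustration of the formula scale * data + shift, the sketch below quantizes a single floating-point activation to an unsigned 8-bit value in plain C++. The function name quantize_rnn_data is hypothetical, and the rounding to nearest and the saturation to [0, 255] are assumptions made for this example; the text above only specifies the linear formula.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a oneDNN function): applies scale * data + shift,
// then rounds to nearest and saturates to the u8 range (both assumptions).
std::uint8_t quantize_rnn_data(float data, float scale, float shift) {
    const float q = std::round(scale * data + shift);
    const float clamped = std::min(255.0f, std::max(0.0f, q));
    return static_cast<std::uint8_t>(clamped);
}
```

With the values from the example above (scale = 63, shift = 64), activations in [-1, 1] map to [1, 127], which fits within the unsigned 8-bit range.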
void get_rnn_data_qparams(float& scale, float& shift)
Returns the quantization scale and shift parameters for RNN data tensors.
Note
Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.
Parameters:
scale |
The value to scale the data by. |
shift |
The value to shift the data by. |
void set_rnn_weights_qparams(int mask, const std::vector<float>& scales)
Sets quantization scaling factors for RNN weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Quantization scales are common for weights_layer and weights_iteration.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. |
scales |
Constant vector of output scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
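The weights case differs from the data case in that there is no shift and the target type is signed. A minimal sketch in plain C++ follows; quantize_rnn_weight is a hypothetical helper, and the rounding to nearest and the saturation to [-128, 127] are assumptions for illustration rather than documented library behavior.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a oneDNN function): scales one weight value and
// converts it to s8, rounding to nearest and saturating (both assumptions).
std::int8_t quantize_rnn_weight(float weight, float scale) {
    const float q = std::round(scale * weight);
    const float clamped = std::min(127.0f, std::max(-128.0f, q));
    return static_cast<std::int8_t>(clamped);
}
```

When the mask selects one or more dimensions, the scale argument would be the entry of the scales vector that corresponds to the weight's position along those dimensions.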
void get_rnn_weights_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN weights tensors.
Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. |
scales |
Constant vector of output scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
void set_rnn_weights_projection_qparams( int mask, const std::vector<float>& scales )
Sets quantization scaling factors for RNN projection weights tensors.
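The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.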
Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Quantization scales are common for weights_layer and weights_iteration.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. |
scales |
Constant vector of output scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales)
Returns the quantization scaling factors for RNN projection weights tensors.
Note
The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.
Parameters:
mask |
Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. |
scales |
Constant vector of output scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |