DynamicQuantize
General
The DynamicQuantize operation converts an f32 tensor to a quantized (s8 or u8) tensor. It supports both per-tensor and per-channel asymmetric linear quantization. The target quantized data type is specified by the data type of the dst logical tensor. The rounding mode is defined by the library implementation.
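For per-tensor quantization:

$$ dst_i = round(src_i / scale + zp) $$

For per-channel quantization, taking channel axis = 1 as an example:

$$ dst_{\ldots,i,\ldots,\ldots} = round(src_{\ldots,i,\ldots,\ldots} / scale_i + zp_i), \quad i \in [0, channelNum - 1] $$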
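The two cases can be sketched in plain Python, assuming the usual asymmetric linear form `round(src / scale + zp)`. This is a reference sketch, not the library's implementation: the helper names are hypothetical, the rounding shown is Python's round-half-to-even (the actual rounding mode is implementation defined), and saturation to the destination range is an assumption.

```python
# Reference sketch only; not the oneDNN implementation.
# Assumptions: Python's round() (half-to-even) stands in for the
# implementation-defined rounding mode, and results are saturated
# to the destination data type's range.

DST_RANGE = {"s8": (-128, 127), "u8": (0, 255)}


def quantize_value(x, scale, zp, dst_dtype):
    """Quantize one f32 value: round(x / scale + zp), then saturate."""
    lo, hi = DST_RANGE[dst_dtype]
    q = round(x / scale + zp)
    return max(lo, min(hi, q))


def quantize_per_tensor(src, scale, zp, dst_dtype):
    """src is a flat list of floats; one scale and one zero point for all."""
    return [quantize_value(x, scale, zp, dst_dtype) for x in src]


def quantize_per_channel(src, scales, zps, dst_dtype):
    """src is a 2D list shaped [N][C] with channel axis = 1, so the
    element in column c uses scales[c] and zps[c]."""
    return [
        [quantize_value(x, scales[c], zps[c], dst_dtype)
         for c, x in enumerate(row)]
        for row in src
    ]


# Example: scale 0.5 and zero point 10 with a u8 destination.
print(quantize_per_tensor([0.0, 1.0, -1.0], 0.5, 10, "u8"))  # [10, 12, 8]
```

In the per-channel case each channel carries its own scale and zero point, which is why the two parameter lists must have one entry per channel.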
Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional |
---|---|---|---|---|
`qtype` | Specifies which quantization type is used. | string | `per_tensor`, `per_channel` | Optional |
`axis` | Specifies the dimension on which per-channel quantization is applied. | s64 | An s64 value in the range [-r, r-1], where r = rank(src) | Optional |
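The range [-r, r-1] suggests that negative axis values count back from the last dimension, as in most tensor libraries. A minimal sketch of that normalization follows; the function name is hypothetical and the negative-index interpretation is an assumption, not something this page states.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent.

    Assumes negative values index from the last dimension.
    """
    if not -rank <= axis <= rank - 1:
        raise ValueError(f"axis {axis} is outside [{-rank}, {rank - 1}]")
    return axis + rank if axis < 0 else axis


# For a rank-4 src, axis -1 and axis 3 name the same dimension.
print(normalize_axis(-1, 4))  # 3
```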
Execution arguments
The inputs and outputs must be provided in the index order below when constructing an operation.
Inputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | `src` | Required |
1 | `scales` | Required |
2 | `zps` | Optional |
Note
`scales` is an f32 1D tensor that is applied in the quantization formula. For `qtype` = `per_tensor`, the scales tensor must contain exactly one element. For `qtype` = `per_channel`, the number of elements must equal the size of the src tensor along the dimension `axis`.
Note
`zps` is a 1D tensor of offset values that map to zero. For `qtype` = `per_tensor`, the zps tensor must contain exactly one element. For `qtype` = `per_channel`, the number of elements must equal the size of the src tensor along the dimension `axis`. If omitted, the zps values are assumed to be zero.
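The size rules for `scales` and `zps` can be expressed as a small validation helper. This is a sketch under stated assumptions: the function names are hypothetical, tensors are modeled as plain Python lists, and the qtype values are spelled `per_tensor` and `per_channel`.

```python
def expected_count(src_shape, qtype, axis):
    """Number of elements that scales (and zps, if given) must contain."""
    if qtype == "per_tensor":
        return 1
    if qtype == "per_channel":
        rank = len(src_shape)
        if not -rank <= axis < rank:
            raise ValueError(f"axis {axis} is outside [{-rank}, {rank - 1}]")
        # Python's negative indexing covers negative axis values.
        return src_shape[axis]
    raise ValueError(f"unknown qtype: {qtype!r}")


def check_quant_params(src_shape, scales, zps, qtype, axis):
    """Validate sizes and return (scales, zps), filling omitted zps with zeros."""
    n = expected_count(src_shape, qtype, axis)
    if len(scales) != n:
        raise ValueError(f"scales has {len(scales)} elements, expected {n}")
    if zps is None:
        zps = [0] * n  # omitted zero points are assumed to be zero
    elif len(zps) != n:
        raise ValueError(f"zps has {len(zps)} elements, expected {n}")
    return scales, zps
```

For a src of shape (2, 3, 4, 5) quantized per channel along axis 1, both parameter tensors need three elements.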
Outputs
Index | Argument Name | Required or Optional |
---|---|---|
0 | `dst` | Required |
Supported data types
The DynamicQuantize operation supports the following data type combinations.
Src | Scales | Zps | Dst |
---|---|---|---|
f32 | f32 | s8, u8, s32 | s8 |
f32 | f32 | s8, u8, s32 | u8 |