Dequantize¶
General¶
The Dequantize operation converts a quantized (u8/s8/f8_e4m3/f8_e5m2) tensor to an f32 tensor. It supports both per-tensor and per-channel asymmetric linear de-quantization. The rounding mode is library-implementation defined. Zero points (`zps` in the attribute table) are not supported for f8_e4m3 and f8_e5m2 dequantization.
For per-tensor de-quantization:

\[dst_{f32} = (src - zps) \times scales\]

For per-channel de-quantization, taking channel axis = 1 as an example:

\[dst_{f32}[:, i, \ldots] = (src[:, i, \ldots] - zps[i]) \times scales[i], \quad i \in [0, ic-1],\]

where \(ic\) is the number of channels.
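The formulas above can be sketched with NumPy. This is a reference sketch only, not the library's kernel; the `dequantize` helper name is illustrative:

```python
import numpy as np

def dequantize(src, scales, zps=None, qtype="per_tensor", axis=1):
    """Reference de-quantization: dst = (src - zps) * scales, computed in f32."""
    src = src.astype(np.float32)
    if zps is None:
        zps = [0] * len(scales)  # no zero points (e.g. the f8 types): symmetric case
    if qtype == "per_tensor":
        # A single scale/zero-point pair applies to the whole tensor.
        return (src - zps[0]) * scales[0]
    # per_channel: one (scale, zp) pair per slice along `axis`.
    shape = [1] * src.ndim
    shape[axis] = -1
    s = np.asarray(scales, dtype=np.float32).reshape(shape)
    z = np.asarray(zps, dtype=np.float32).reshape(shape)
    return (src - z) * s

q = np.array([[128, 130], [126, 132]], dtype=np.uint8)
# Per-tensor with scale 0.5 and zero point 128 yields values [[0, 1], [-1, 2]].
dst = dequantize(q, scales=[0.5], zps=[128])
# Per-channel along axis=1 applies scales[i]/zps[i] to channel i.
dst_pc = dequantize(q, scales=[0.5, 0.25], zps=[128, 128], qtype="per_channel", axis=1)
```

Omitting `zps` (treating the offset as zero) gives the symmetric case required for f8_e4m3 and f8_e5m2 inputs.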
Operation attributes¶
| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| `qtype` | Specifies which de-quantization type is used. | string | `per_tensor` (default), `per_channel` | Optional |
| `axis` | Specifies the dimension on which per-channel de-quantization is applied. | s64 | An s64 value in the range [-r, r-1] where r = rank(src); `1` by default | Optional |
| `scales` | Scaling factors applied to the src data. | f32 | An f32 list (containing only one element if `qtype` is `per_tensor`) | Required |
| `zps` | Offset values that map to float zero. | s64 | An s64 list (containing only one element if `qtype` is `per_tensor`) | Optional |
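The shape constraints in the table can be expressed as a small validation sketch. Assumption: the `check_dequant_attrs` helper and its defaults are illustrative, not part of the library API:

```python
def check_dequant_attrs(src_shape, qtype="per_tensor", axis=1, scales=(1.0,), zps=None):
    """Check Dequantize attribute shapes against the rules in the attribute table.

    Assumption: names and defaults mirror the table above; this is not a library call.
    """
    rank = len(src_shape)
    if not (-rank <= axis <= rank - 1):
        raise ValueError("axis must lie in [-r, r-1] where r = rank(src)")
    # per_tensor needs exactly one element; per_channel needs one per channel.
    expected = 1 if qtype == "per_tensor" else src_shape[axis]
    if len(scales) != expected:
        raise ValueError(f"scales must have {expected} element(s)")
    if zps is not None and len(zps) != expected:
        raise ValueError(f"zps must have {expected} element(s)")

# Per-channel over axis 1 of a (8, 4) tensor needs 4 scales.
check_dequant_attrs((8, 4), qtype="per_channel", axis=1, scales=[0.5] * 4)
```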
Execution arguments¶
The inputs and outputs must be provided according to the index order below when constructing an operation.
Inputs¶
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | `src` | Required |
Outputs¶
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | `dst` | Required |
Supported data types¶
Dequantize operation supports the following data type combinations.
| Src | Dst |
|---|---|
| s8, u8, f8_e4m3, f8_e5m2 | f32 |
Note
This operation exists to support int8 quantization models.