oneAPI Deep Neural Network Library (oneDNN)
Performance library for Deep Learning
1.96.0
BLAS functions

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.

Functions

dnnl_status_t DNNL_API dnnl_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc)
 Performs single-precision matrix-matrix multiply.
 
dnnl_status_t DNNL_API dnnl_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
 
dnnl_status_t DNNL_API dnnl_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
 
status dnnl::sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc)
 Performs single-precision matrix-matrix multiply.
 
status dnnl::gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
 
status dnnl::gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
 

Detailed Description

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.

Function Documentation

◆ dnnl_sgemm()

dnnl_status_t DNNL_API dnnl_sgemm ( char  transa,
char  transb,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const float *  A,
dnnl_dim_t  lda,
const float *  B,
dnnl_dim_t  ldb,
float  beta,
float *  C,
dnnl_dim_t  ldc 
)

Performs single-precision matrix-matrix multiply.

The operation is defined as:

C := alpha * op( A ) * op( B ) + beta * C

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.
Examples:
cpu_rnn_inference_f32.cpp, cpu_rnn_inference_int8.cpp, and cpu_sgemm_and_matmul.cpp.
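The leading-dimension parameters are easiest to see on a sub-block. The sketch below assumes the usual BLAS meaning for row-major data: the leading dimension is the number of elements between the starts of consecutive rows, so it may exceed the logical row length. ref_gemm_nn and multiply_sub_block are hypothetical helpers written for this illustration; they do not call oneDNN.

```cpp
#include <cstddef>
#include <vector>

// Non-transposed reference product C := A * B for row-major operands
// (hypothetical helper; alpha = 1 and beta = 0 are hard-coded for brevity).
void ref_gemm_nn(std::ptrdiff_t M, std::ptrdiff_t N, std::ptrdiff_t K,
                 const float *A, std::ptrdiff_t lda, const float *B,
                 std::ptrdiff_t ldb, float *C, std::ptrdiff_t ldc) {
    for (std::ptrdiff_t i = 0; i < M; ++i)
        for (std::ptrdiff_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::ptrdiff_t k = 0; k < K; ++k)
                acc += A[i * lda + k] * B[k * ldb + j];
            C[i * ldc + j] = acc;
        }
}

// Multiplies the 2x2 block that starts at row 1, column 1 of a 3x4 buffer
// by a dense 2x2 matrix and returns the dense 2x2 result.
std::vector<float> multiply_sub_block() {
    const std::vector<float> big = {0, 1, 2,  3,
                                    4, 5, 6,  7,
                                    8, 9, 10, 11};
    const std::ptrdiff_t lda = 4;              // row stride of the 3x4 buffer
    const float *A = big.data() + 1 * lda + 1; // block {5, 6; 9, 10}
    const std::vector<float> B = {1, 2, 3, 4}; // dense 2x2, so ldb = 2
    std::vector<float> C(4, 0.0f);             // dense 2x2, so ldc = 2
    ref_gemm_nn(2, 2, 2, A, lda, B.data(), 2, C.data(), 2);
    return C;
}
```

Passing lda = 4 with K = 2 lets the block be used in place, without first copying it into a dense 2x2 array.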

◆ dnnl_gemm_u8s8s32()

dnnl_status_t DNNL_API dnnl_gemm_u8s8s32 ( char  transa,
char  transb,
char  offsetc,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const uint8_t *  A,
dnnl_dim_t  lda,
uint8_t  ao,
const int8_t *  B,
dnnl_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
dnnl_dim_t  ldc,
const int32_t *  co 
)

Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix,
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = 'F': len must be at least 1,
    • if offsetc = 'C': len must be at least max(1, M),
    • if offsetc = 'R': len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
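The formula above can be sketched as a reference loop. Several details here are assumptions rather than documented behavior: ref_gemm_u8s8s32 is a hypothetical name, std::ptrdiff_t stands in for dnnl_dim_t, the result is rounded to the nearest integer without saturation handling, and the co indexing (co[i] for 'C', co[j] for 'R') is inferred from the required array lengths.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical reference for
//   C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
// on row-major data. Not the library's implementation; no argument checks.
void ref_gemm_u8s8s32(char transa, char transb, char offsetc,
                      std::ptrdiff_t M, std::ptrdiff_t N, std::ptrdiff_t K,
                      float alpha, const std::uint8_t *A, std::ptrdiff_t lda,
                      std::uint8_t ao, const std::int8_t *B,
                      std::ptrdiff_t ldb, std::int8_t bo, float beta,
                      std::int32_t *C, std::ptrdiff_t ldc,
                      const std::int32_t *co) {
    const bool ta = (transa == 'T' || transa == 't');
    const bool tb = (transb == 'T' || transb == 't');
    for (std::ptrdiff_t i = 0; i < M; ++i) {
        for (std::ptrdiff_t j = 0; j < N; ++j) {
            std::int32_t acc = 0; // exact 32-bit accumulation
            for (std::ptrdiff_t k = 0; k < K; ++k) {
                const std::int32_t a = ta ? A[k * lda + i] : A[i * lda + k];
                const std::int32_t b = tb ? B[j * ldb + k] : B[k * ldb + j];
                acc += (a - ao) * (b - bo);
            }
            // Assumed mapping: 'F' uses co[0], 'C' uses co[i], 'R' uses co[j].
            const std::int32_t c_off = (offsetc == 'C') ? co[i]
                                     : (offsetc == 'R') ? co[j]
                                                        : co[0];
            const double val = static_cast<double>(alpha) * acc
                    + static_cast<double>(beta) * C[i * ldc + j] + c_off;
            // Assumption: round to nearest; overflow is not handled here.
            C[i * ldc + j] = static_cast<std::int32_t>(std::nearbyint(val));
        }
    }
}
```

For example, with A = {3, 4; 5, 6}, ao = 1, B = {1, -2; 3, 0}, bo = -1, alpha = beta = 1, C = {10, 20; 30, 40}, offsetc = 'R', and co = {100, 200}, this sketch yields C = {126, 221; 158, 241}.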

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
offsetc: Flag specifying how offsets should be applied to matrix C:
  • 'F' means that the same offset will be applied to each element of the matrix C,
  • 'C' means that an individual offset will be applied to each element within each column,
  • 'R' means that an individual offset will be applied to each element within each row.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
ao: The offset value for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
bo: The offset value for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
co: An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.
Examples:
cpu_rnn_inference_int8.cpp.
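The saturation warning for this function can be illustrated with a small model. The page does not say where saturation occurs; the sketch assumes one commonly described mechanism on x86 processors without dedicated int8 dot-product instructions, where two adjacent u8*s8 products are summed into a saturating 16-bit intermediate. Both function names are hypothetical and the code only models the arithmetic.

```cpp
#include <algorithm>
#include <cstdint>

// Exact 32-bit sum of two u8*s8 products.
std::int32_t pair_dot_exact(std::uint8_t a0, std::int8_t b0,
                            std::uint8_t a1, std::int8_t b1) {
    return std::int32_t(a0) * b0 + std::int32_t(a1) * b1;
}

// Same sum, clamped to the int16 range to model a saturating 16-bit
// intermediate (an assumption about the hardware, see the text above).
std::int16_t pair_dot_saturated(std::uint8_t a0, std::int8_t b0,
                                std::uint8_t a1, std::int8_t b1) {
    const std::int32_t exact = pair_dot_exact(a0, b0, a1, b1);
    const std::int32_t clamped = std::max<std::int32_t>(
            -32768, std::min<std::int32_t>(32767, exact));
    return static_cast<std::int16_t>(clamped);
}
```

With a0 = a1 = 255 and b0 = b1 = 127, the exact sum is 64770, while the clamped intermediate is 32767. Under this model, keeping one operand in a narrower range (for example, u8 values no larger than 127) avoids the clamp.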

◆ dnnl_gemm_s8s8s32()

dnnl_status_t DNNL_API dnnl_gemm_s8s8s32 ( char  transa,
char  transb,
char  offsetc,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const int8_t *  A,
dnnl_dim_t  lda,
int8_t  ao,
const int8_t *  B,
dnnl_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
dnnl_dim_t  ldc,
const int32_t *  co 
)

Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix,
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = 'F': len must be at least 1,
    • if offsetc = 'C': len must be at least max(1, M),
    • if offsetc = 'R': len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
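The way the co array defines C_offset can be sketched by expanding it into a full MxN row-major matrix. The indexing is an assumption inferred from the required lengths (M entries for 'C', N entries for 'R'), and expand_c_offset is a hypothetical helper, not a oneDNN function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands co into a dense row-major MxN C_offset matrix.
// Assumed mapping: 'F' -> co[0] everywhere, 'C' -> co[i] for every element
// of row i, 'R' -> co[j] for every element of column j.
std::vector<std::int32_t> expand_c_offset(char offsetc, std::ptrdiff_t M,
                                          std::ptrdiff_t N,
                                          const std::vector<std::int32_t> &co) {
    std::vector<std::int32_t> out(static_cast<std::size_t>(M * N));
    for (std::ptrdiff_t i = 0; i < M; ++i)
        for (std::ptrdiff_t j = 0; j < N; ++j) {
            const std::ptrdiff_t idx
                    = (offsetc == 'C') ? i : (offsetc == 'R') ? j : 0;
            out[static_cast<std::size_t>(i * N + j)]
                    = co[static_cast<std::size_t>(idx)];
        }
    return out;
}
```

For M = 2 and N = 3, 'C' with co = {1, 2} expands to {1, 1, 1; 2, 2, 2}, and 'R' with co = {1, 2, 3} expands to {1, 2, 3; 1, 2, 3}.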

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
offsetc: Flag specifying how offsets should be applied to matrix C:
  • 'F' means that the same offset will be applied to each element of the matrix C,
  • 'C' means that an individual offset will be applied to each element within each column,
  • 'R' means that an individual offset will be applied to each element within each row.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
ao: The offset value for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
bo: The offset value for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
co: An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.

◆ sgemm()

status dnnl::sgemm ( char  transa,
char  transb,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const float *  A,
dnnl_dim_t  lda,
const float *  B,
dnnl_dim_t  ldb,
float  beta,
float *  C,
dnnl_dim_t  ldc 
)
inline

Performs single-precision matrix-matrix multiply.

The operation is defined as:

C := alpha * op( A ) * op( B ) + beta * C

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix.

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl::status value to allow error handling.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.

◆ gemm_u8s8s32()

status dnnl::gemm_u8s8s32 ( char  transa,
char  transb,
char  offsetc,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const uint8_t *  A,
dnnl_dim_t  lda,
uint8_t  ao,
const int8_t *  B,
dnnl_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
dnnl_dim_t  ldc,
const int32_t *  co 
)
inline

Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix,
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = 'F': len must be at least 1,
    • if offsetc = 'C': len must be at least max(1, M),
    • if offsetc = 'R': len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
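As a worked expansion of the offset product in the formula above, write a_{ik} and b_{kj} for the elements of op( A ) and op( B ), and a_o and b_o for the ao and bo values (symbols introduced here for illustration). Element (i, j) of (op(A) - A_offset) * (op(B) - B_offset) then splits into four terms. This is plain algebra and says nothing about how the library evaluates the product.

```latex
\sum_{k=1}^{K} (a_{ik} - a_o)(b_{kj} - b_o)
  = \sum_{k=1}^{K} a_{ik} b_{kj}
  - b_o \sum_{k=1}^{K} a_{ik}
  - a_o \sum_{k=1}^{K} b_{kj}
  + K \, a_o \, b_o
```

The first term is the plain product of op( A ) and op( B ); the offsets contribute only a row sum of op( A ), a column sum of op( B ), and a constant.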

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl::status value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
offsetc: Flag specifying how offsets should be applied to matrix C:
  • 'F' means that the same offset will be applied to each element of the matrix C,
  • 'C' means that an individual offset will be applied to each element within each column,
  • 'R' means that an individual offset will be applied to each element within each row.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
ao: The offset value for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
bo: The offset value for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
co: An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.

◆ gemm_s8s8s32()

status dnnl::gemm_s8s8s32 ( char  transa,
char  transb,
char  offsetc,
dnnl_dim_t  M,
dnnl_dim_t  N,
dnnl_dim_t  K,
float  alpha,
const int8_t *  A,
dnnl_dim_t  lda,
int8_t  ao,
const int8_t *  B,
dnnl_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
dnnl_dim_t  ldc,
const int32_t *  co 
)
inline

Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars, and
  • A, B, and C are matrices:
    • op( A ) is an MxK matrix,
    • op( B ) is a KxN matrix,
    • C is an MxN matrix,
  • A_offset is an MxK matrix with every element equal to the ao value,
  • B_offset is a KxN matrix with every element equal to the bo value,
  • C_offset is an MxN matrix defined by the co array of size len:
    • if offsetc = 'F': len must be at least 1,
    • if offsetc = 'C': len must be at least max(1, M),
    • if offsetc = 'R': len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).

Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl::status value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters
transa: Transposition flag for matrix A: 'N' or 'n' means that A is not transposed, and 'T' or 't' means that A is transposed.
transb: Transposition flag for matrix B: 'N' or 'n' means that B is not transposed, and 'T' or 't' means that B is transposed.
offsetc: Flag specifying how offsets should be applied to matrix C:
  • 'F' means that the same offset will be applied to each element of the matrix C,
  • 'C' means that an individual offset will be applied to each element within each column,
  • 'R' means that an individual offset will be applied to each element within each row.
M: The M dimension (the number of rows of op( A ) and C).
N: The N dimension (the number of columns of op( B ) and C).
K: The K dimension (the number of columns of op( A ) and rows of op( B )).
alpha: The alpha parameter that is used to scale the product of matrices A and B.
A: A pointer to the A matrix data.
lda: The leading dimension for the matrix A.
ao: The offset value for the matrix A.
B: A pointer to the B matrix data.
ldb: The leading dimension for the matrix B.
bo: The offset value for the matrix B.
beta: The beta parameter that is used to scale the matrix C.
C: A pointer to the C matrix data.
ldc: The leading dimension for the matrix C.
co: An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns
dnnl_success/dnnl::status::success on success and a status describing the error otherwise.