Deep Neural Network Library (DNNL)  1.1.3
Performance library for Deep Learning
BLAS functions

A subset of Basic Linear Algebra (BLAS) functions to perform matrix-matrix multiplication. More...

Functions

dnnl_status_t DNNL_API dnnl_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc)
 dnnl_sgemm() performs a single-precision matrix-matrix multiplication. More...
 
dnnl_status_t DNNL_API dnnl_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product. More...
 
dnnl_status_t DNNL_API dnnl_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
 dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product. More...
 

Detailed Description

A subset of Basic Linear Algebra (BLAS) functions to perform matrix-matrix multiplication.

Function Documentation

◆ dnnl_sgemm()

dnnl_status_t DNNL_API dnnl_sgemm(char transa, char transb,
        dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
        float alpha, const float *A, dnnl_dim_t lda,
        const float *B, dnnl_dim_t ldb,
        float beta, float *C, dnnl_dim_t ldc)

dnnl_sgemm() performs a single-precision matrix-matrix multiplication operation defined as:

C := alpha*op( A )*op( B ) + beta*C

where

  • op( X ) is one of op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars,
  • A, B and C are matrices, with op( A ) an m-by-k matrix, op( B ) a k-by-n matrix and C an m-by-n matrix.

The matrices are assumed to be stored in row-major order (the elements within each matrix row are contiguous in memory).

Note
This API differs from the standard BLAS routine in that it returns dnnl_status_t for error handling. XERBLA is not supported: no error message is printed in case of incorrect parameters.
Examples:
cpu_rnn_inference_f32.cpp and cpu_rnn_inference_int8.cpp.
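
To make the calling convention concrete, here is a minimal sketch that multiplies a 2x4 matrix by a 4x3 matrix in row-major layout. It is illustrative rather than part of this reference; it assumes the library's dnnl.h header and the dnnl_success status value from the C API.

    #include <stdio.h>
    #include "dnnl.h"

    int main(void) {
        /* Computes C := 1.0 * A * B + 0.0 * C in row-major storage,
         * so lda = K, ldb = N, and ldc = N for non-transposed inputs. */
        const dnnl_dim_t M = 2, N = 3, K = 4;
        float A[] = {1, 2, 3, 4,
                     5, 6, 7, 8};           /* 2 x 4 */
        float B[] = {1, 0, 0,
                     0, 1, 0,
                     0, 0, 1,
                     1, 1, 1};              /* 4 x 3 */
        float C[2 * 3] = {0};               /* 2 x 3 */

        /* 'N' selects op(X) = X; pass 'T' for op(X) = X**T. */
        dnnl_status_t st = dnnl_sgemm('N', 'N', M, N, K,
                1.0f, A, K, B, N, 0.0f, C, N);
        if (st != dnnl_success) { /* dnnl_success assumed from dnnl_types.h */
            fprintf(stderr, "dnnl_sgemm failed: %d\n", (int)st);
            return 1;
        }
        for (int i = 0; i < M; ++i) {
            for (int j = 0; j < N; ++j)
                printf("%6.1f", C[i * N + j]);
            printf("\n");
        }
        return 0;
    }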

◆ dnnl_gemm_u8s8s32()

dnnl_status_t DNNL_API dnnl_gemm_u8s8s32(char transa, char transb,
        char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
        float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao,
        const int8_t *B, dnnl_dim_t ldb, int8_t bo,
        float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)

dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix (or, with offsetc = F, a single value is added to every element).

The operation is defined as:

  • C := alpha*(op(A) - A_offset) * (op(B) - B_offset) + beta*C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • A_offset is an m-by-k matrix with every element equal to the value ao,
  • B_offset is a k-by-n matrix with every element equal to the value bo,
  • C_offset is an m-by-n matrix defined by the co array of size len:
    • if offsetc = F: len must be at least 1,
    • if offsetc = C: len must be at least max(1, m),
    • if offsetc = R: len must be at least max(1, n),
  • alpha and beta are scalars, and
  • A, B and C are matrices, with op( A ) an m-by-k matrix, op( B ) a k-by-n matrix and C an m-by-n matrix.

The matrices are assumed to be stored in row-major order (the elements within each matrix row are contiguous in memory).

Note
This API differs from the standard BLAS routines. In particular, the function returns dnnl_status_t for error handling. XERBLA is not supported: no error message is printed in case of incorrect parameters.
Warning
On some architectures, saturation of intermediate results may occur, leading to unexpected values. For more details, refer to Int8 Computation Aspects.
Examples:
cpu_rnn_inference_int8.cpp.
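
The sketch below, under the same assumptions as the dnnl_sgemm example (dnnl.h header, dnnl_success status value), shows one plausible call with offsetc = 'F' and all offsets set to zero, which reduces the operation to plain C := alpha*op(A)*op(B) + beta*C.

    #include <stdint.h>
    #include <stdio.h>
    #include "dnnl.h"

    int main(void) {
        const dnnl_dim_t M = 2, N = 2, K = 3;
        uint8_t A[] = {1, 2, 3,
                       4, 5, 6};            /* 2 x 3, unsigned 8-bit */
        int8_t  B[] = {1, -1,
                       2, -2,
                       3, -3};              /* 3 x 2, signed 8-bit */
        int32_t C[2 * 2] = {0};             /* 2 x 2, 32-bit result */
        const int32_t co = 0;               /* offsetc = 'F': one value for all of C */

        /* ao and bo (both 0 here) are subtracted from every element of
         * op(A) and op(B) before the multiplication. */
        dnnl_status_t st = dnnl_gemm_u8s8s32('N', 'N', 'F', M, N, K,
                1.0f, A, K, 0, B, N, 0, 0.0f, C, N, &co);
        if (st != dnnl_success) { /* dnnl_success assumed from dnnl_types.h */
            fprintf(stderr, "dnnl_gemm_u8s8s32 failed: %d\n", (int)st);
            return 1;
        }
        for (int i = 0; i < M; ++i) {
            for (int j = 0; j < N; ++j)
                printf("%6d", C[i * N + j]);
            printf("\n");
        }
        return 0;
    }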

◆ dnnl_gemm_s8s8s32()

dnnl_status_t DNNL_API dnnl_gemm_s8s8s32(char transa, char transb,
        char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
        float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao,
        const int8_t *B, dnnl_dim_t ldb, int8_t bo,
        float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)

dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix (or, with offsetc = F, a single value is added to every element).

For a full description, see dnnl_gemm_u8s8s32().

Warning
On some architectures, saturation of intermediate results may occur, leading to unexpected values. For more details, refer to Int8 Computation Aspects.
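
Because the semantics match dnnl_gemm_u8s8s32() except that A and ao are signed, a call mirrors the sketch above. The wrapper below is a hypothetical convenience helper (gemm_s8s8_fixed is not a library function), again assuming the dnnl.h header:

    #include <stdint.h>
    #include "dnnl.h"

    /* Hypothetical helper, not part of DNNL: row-major int8 x int8 -> int32
     * GEMM with no transposition, zero ao/bo offsets, and offsetc = 'F',
     * so the single zero in co leaves C unchanged by the offset step. */
    dnnl_status_t gemm_s8s8_fixed(dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
            const int8_t *A, const int8_t *B, int32_t *C) {
        const int32_t co = 0;
        return dnnl_gemm_s8s8s32('N', 'N', 'F', M, N, K,
                1.0f, A, K, 0, B, N, 0, 0.0f, C, N, &co);
    }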