Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN)  1.0.4
Performance library for Deep Learning
Functions
BLAS functions

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication. More...

Functions

mkldnn_status_t MKLDNN_API mkldnn_sgemm (char transa, char transb, mkldnn_dim_t M, mkldnn_dim_t N, mkldnn_dim_t K, float alpha, const float *A, mkldnn_dim_t lda, const float *B, mkldnn_dim_t ldb, float beta, float *C, mkldnn_dim_t ldc)
 SGEMM performs a matrix-matrix multiplication operation. More...
 
mkldnn_status_t MKLDNN_API mkldnn_gemm_u8s8s32 (char transa, char transb, char offsetc, mkldnn_dim_t M, mkldnn_dim_t N, mkldnn_dim_t K, float alpha, const uint8_t *A, mkldnn_dim_t lda, uint8_t ao, const int8_t *B, mkldnn_dim_t ldb, int8_t bo, float beta, int32_t *C, mkldnn_dim_t ldc, const int32_t *co)
 mkldnn_gemm_u8s8s32() and mkldnn_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product. More...
 
mkldnn_status_t MKLDNN_API mkldnn_gemm_s8s8s32 (char transa, char transb, char offsetc, mkldnn_dim_t M, mkldnn_dim_t N, mkldnn_dim_t K, float alpha, const int8_t *A, mkldnn_dim_t lda, int8_t ao, const int8_t *B, mkldnn_dim_t ldb, int8_t bo, float beta, int32_t *C, mkldnn_dim_t ldc, const int32_t *co)
 mkldnn_gemm_u8s8s32() and mkldnn_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product. More...
 

Detailed Description

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.

Function Documentation

◆ mkldnn_sgemm()

mkldnn_status_t MKLDNN_API mkldnn_sgemm ( char  transa,
char  transb,
mkldnn_dim_t  M,
mkldnn_dim_t  N,
mkldnn_dim_t  K,
float  alpha,
const float *  A,
mkldnn_dim_t  lda,
const float *  B,
mkldnn_dim_t  ldb,
float  beta,
float *  C,
mkldnn_dim_t  ldc 
)

SGEMM performs a matrix-matrix multiplication operation defined as:

C := alpha*op( A )*op( B ) + beta*C

where

  • op( X ) is either op( X ) = X or op( X ) = X**T,
  • alpha and beta are scalars,
  • A, B and C are matrices, with op( A ) an m-by-k matrix, op( B ) a k-by-n matrix and C an m-by-n matrix.

The matrices are assumed to be stored in row-major order (the elements within a matrix row are contiguous in memory).

Note
The API differs from the standard BLAS routine in that it returns mkldnn_status_t for error handling. XERBLA is not supported: no error message will be printed in case of incorrect parameters.
Examples:
cpu_rnn_inference_f32.cpp, and cpu_rnn_inference_int8.cpp.

◆ mkldnn_gemm_u8s8s32()

mkldnn_status_t MKLDNN_API mkldnn_gemm_u8s8s32 ( char  transa,
char  transb,
char  offsetc,
mkldnn_dim_t  M,
mkldnn_dim_t  N,
mkldnn_dim_t  K,
float  alpha,
const uint8_t *  A,
mkldnn_dim_t  lda,
uint8_t  ao,
const int8_t *  B,
mkldnn_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
mkldnn_dim_t  ldc,
const int32_t *  co 
)

mkldnn_gemm_u8s8s32() and mkldnn_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix.

The operation is defined as:

  • C := alpha*(op(A) - A_offset) * (op(B) - B_offset) + beta*C + C_offset

where

  • op( X ) = X or op( X ) = X**T,
  • A_offset is an m-by-k matrix with every element equal to the value ao,
  • B_offset is a k-by-n matrix with every element equal to the value bo,
  • C_offset is an m-by-n matrix defined by the co array of size len:
    • if offsetc = F (fixed): len must be at least 1,
    • if offsetc = C (column): len must be at least max(1, m),
    • if offsetc = R (row): len must be at least max(1, n),
  • alpha and beta are scalars, and
  • A, B and C are matrices, with op( A ) an m-by-k matrix, op( B ) a k-by-n matrix and C an m-by-n matrix.

The matrices are assumed to be stored in row-major order (the elements within a matrix row are contiguous in memory).

Note
The API differs from the standard BLAS routines. In particular, the function returns mkldnn_status_t for error handling. XERBLA is not supported: no error message will be printed in case of incorrect parameters.
Warning
On some architectures, intermediate saturation might occur, leading to unexpected results. For more details, refer to Int8 Computation Aspects.
Examples:
cpu_rnn_inference_int8.cpp.

◆ mkldnn_gemm_s8s8s32()

mkldnn_status_t MKLDNN_API mkldnn_gemm_s8s8s32 ( char  transa,
char  transb,
char  offsetc,
mkldnn_dim_t  M,
mkldnn_dim_t  N,
mkldnn_dim_t  K,
float  alpha,
const int8_t *  A,
mkldnn_dim_t  lda,
int8_t  ao,
const int8_t *  B,
mkldnn_dim_t  ldb,
int8_t  bo,
float  beta,
int32_t *  C,
mkldnn_dim_t  ldc,
const int32_t *  co 
)

mkldnn_gemm_u8s8s32() and mkldnn_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix.

For full description, see mkldnn_gemm_u8s8s32().

Warning
On some architectures, intermediate saturation might occur, leading to unexpected results. For more details, refer to Int8 Computation Aspects.