A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.

Functions
dnnl_status_t DNNL_API	dnnl_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc)
	SGEMM performs a matrix-matrix multiplication operation defined as C := alpha*op( A )*op( B ) + beta*C.

dnnl_status_t DNNL_API	dnnl_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
	dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

dnnl_status_t DNNL_API	dnnl_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co)
	dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

Detailed Description

A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.

Function Documentation

dnnl_status_t DNNL_API dnnl_sgemm	(	char	transa,
		char	transb,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const float *	A,
		dnnl_dim_t	lda,
		const float *	B,
		dnnl_dim_t	ldb,
		float	beta,
		float *	C,
		dnnl_dim_t	ldc
	)

SGEMM performs a matrix-matrix multiplication operation defined as:

C := alpha*op( A )*op( B ) + beta*C

where

op( X ) is one of op( X ) = X or op( X ) = X**T,
alpha and beta are scalars,
A, B and C are matrices, with op( A ) an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.

The matrices are assumed to be stored in row-major order (the elements of each matrix row are contiguous in memory).

Note: The API is different from the standard BLAS routine because it returns dnnl_status_t for error handling. XERBLA is not supported: no error message will be printed in case of incorrect parameters.
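The formula and storage convention above can be illustrated with a minimal reference sketch. The routine ref_sgemm below is hypothetical and is not part of the library; it only mirrors the argument order of dnnl_sgemm(). It assumes the usual BLAS convention that 'N' or 'n' selects op( X ) = X and 'T' or 't' selects op( X ) = X**T, and that each leading dimension is at least the number of columns of the matrix as stored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (NOT the oneDNN implementation):
 * computes C := alpha*op(A)*op(B) + beta*C for row-major matrices.
 * Assumption: 'T' or 't' means transposed, anything else means not transposed. */
void ref_sgemm(char transa, char transb, size_t M, size_t N, size_t K,
        float alpha, const float *A, size_t lda, const float *B, size_t ldb,
        float beta, float *C, size_t ldc) {
    const int ta = (transa == 'T' || transa == 't');
    const int tb = (transb == 'T' || transb == 't');

    /* Assumed constraints: a leading dimension is the row stride of the
     * matrix as stored, so it must cover the stored row length. */
    assert(lda >= (ta ? M : K));
    assert(ldb >= (tb ? K : N));
    assert(ldc >= N);

    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < K; ++k) {
                /* Row-major: element (r, c) of X is stored at X[r * ldx + c]. */
                const float a = ta ? A[k * lda + i] : A[i * lda + k];
                const float b = tb ? B[j * ldb + k] : B[k * ldb + j];
                acc += a * b;
            }
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

For example, with the 2 by 3 matrix A = {1, 2, 3, 4, 5, 6}, the 3 by 2 matrix B = {7, 8, 9, 10, 11, 12}, C initially filled with ones, and alpha = beta = 1, the call ref_sgemm('N', 'N', 2, 2, 3, 1.0f, A, 3, B, 2, 1.0f, C, 2) leaves C = {59, 65, 140, 155}. A call to dnnl_sgemm() with the same arguments would be expected to produce the same values and to return a dnnl_status_t instead of void.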

dnnl_status_t DNNL_API dnnl_gemm_u8s8s32	(	char	transa,
		char	transb,
		char	offsetc,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const uint8_t *	A,
		dnnl_dim_t	lda,
		uint8_t	ao,
		const int8_t *	B,
		dnnl_dim_t	ldb,
		int8_t	bo,
		float	beta,
		int32_t *	C,
		dnnl_dim_t	ldc,
		const int32_t *	co
	)

dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix.

The operation is defined as:

C := alpha * (op( A ) - A_offset) * (op( B ) - B_offset) + beta * C + C_offset

where

op( X ) = X or op( X ) = X**T,
A_offset is an m-by-k matrix with every element equal to the value ao,
B_offset is a k-by-n matrix with every element equal to the value bo,
C_offset is an m-by-n matrix defined by the co array of size len:
- if offsetc = F: len must be at least 1,
- if offsetc = C: len must be at least max(1, m),
- if offsetc = R: len must be at least max(1, n),
alpha and beta are scalars, and
A, B and C are matrices, with op( A ) an m-by-k matrix, op( B ) a k-by-n matrix and C an m-by-n matrix.

The matrices are assumed to be stored in row-major order (the elements of each matrix row are contiguous in memory).

Note: The API differs from the standard BLAS routine. In particular, the function returns dnnl_status_t for error handling. XERBLA is not supported: no error message will be printed in case of incorrect parameters.

Warning: On some architectures, intermediate saturation may occur, which can lead to unexpected results. For more details, refer to Int8 Computation Aspects.
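The role of the offsets described above can be sketched with a small reference routine. ref_gemm_u8s8s32 below is hypothetical and is not the library implementation. It assumes that the offsets ao and bo are subtracted from op( A ) and op( B ), that offsetc = F applies co[0] to every element, offsetc = C applies co[i] to every element of row i (co acts as a column vector of length m), and offsetc = R applies co[j] to every element of column j (co acts as a row vector of length n). It also assumes rounding to the nearest integer and does not model saturation of any kind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (NOT the oneDNN implementation):
 * C := alpha*(op(A) - A_offset)*(op(B) - B_offset) + beta*C + C_offset
 * for row-major matrices. Assumptions: 'T'/'t' means transposed; offsetc is
 * 'F', 'C' or 'R' as described in the lead-in; no saturation is modeled. */
void ref_gemm_u8s8s32(char transa, char transb, char offsetc,
        size_t M, size_t N, size_t K, float alpha,
        const uint8_t *A, size_t lda, uint8_t ao,
        const int8_t *B, size_t ldb, int8_t bo,
        float beta, int32_t *C, size_t ldc, const int32_t *co) {
    const int ta = (transa == 'T' || transa == 't');
    const int tb = (transb == 'T' || transb == 't');
    assert(offsetc == 'F' || offsetc == 'C' || offsetc == 'R');

    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            /* Accumulate the offset-corrected product in 32-bit integers. */
            int32_t acc = 0;
            for (size_t k = 0; k < K; ++k) {
                const int32_t a = (int32_t)(ta ? A[k * lda + i] : A[i * lda + k]) - (int32_t)ao;
                const int32_t b = (int32_t)(tb ? B[j * ldb + k] : B[k * ldb + j]) - (int32_t)bo;
                acc += a * b;
            }
            /* Pick the C_offset element according to offsetc. */
            const int32_t off = (offsetc == 'F') ? co[0]
                    : (offsetc == 'C')           ? co[i]
                                                 : co[j];
            const double r = (double)alpha * acc
                    + (double)beta * C[i * ldc + j] + (double)off;
            /* Assumed rounding: nearest integer, halves away from zero. */
            C[i * ldc + j] = (int32_t)(r < 0 ? r - 0.5 : r + 0.5);
        }
    }
}
```

As a worked case, take the 2-by-2 matrices A = {3, 4, 5, 6} with ao = 1 and B = {1, -1, 2, 0} with bo = -1, alpha = 1, beta = 0, and C initially zero. The offset-corrected product is {13, 3, 23, 5}. With offsetc = R and co = {10, 20} the sketch yields C = {23, 23, 33, 25}; with offsetc = C and the same co it yields C = {23, 13, 43, 25}.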

dnnl_status_t DNNL_API dnnl_gemm_s8s8s32	(	char	transa,
		char	transb,
		char	offsetc,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const int8_t *	A,
		dnnl_dim_t	lda,
		int8_t	ao,
		const int8_t *	B,
		dnnl_dim_t	ldb,
		int8_t	bo,
		float	beta,
		int32_t *	C,
		dnnl_dim_t	ldc,
		const int32_t *	co
	)

dnnl_gemm_u8s8s32() and dnnl_gemm_s8s8s32() perform a matrix-matrix multiplication operation and add the result to a scalar-matrix product.

For the final result, a vector is added to each row or column of the output matrix.

For full description, see dnnl_gemm_u8s8s32().

Warning: On some architectures, intermediate saturation may occur, which can lead to unexpected results. For more details, refer to Int8 Computation Aspects.