API extensions to interact with the underlying Threadpool run-time.

Namespaces
	dnnl::threadpool_interop
	Threadpool interoperability namespace.

Functions
dnnl_status_t DNNL_API	dnnl_threadpool_interop_stream_create (dnnl_stream_t *stream, dnnl_engine_t engine, void *threadpool)
	Creates an execution stream with specified threadpool.

dnnl_status_t DNNL_API	dnnl_threadpool_interop_stream_get_threadpool (dnnl_stream_t astream, void **threadpool)
	Returns a threadpool to be used by the execution stream.

dnnl_status_t DNNL_API	dnnl_threadpool_interop_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc, void *threadpool)
	Performs single-precision matrix-matrix multiply.

dnnl_status_t DNNL_API	dnnl_threadpool_interop_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool)
	Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

dnnl_status_t DNNL_API	dnnl_threadpool_interop_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool)
	Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

Detailed Description

API extensions to interact with the underlying Threadpool run-time.

Function Documentation

◆ dnnl_threadpool_interop_stream_create()

dnnl_status_t DNNL_API dnnl_threadpool_interop_stream_create	(	dnnl_stream_t *	stream,
		dnnl_engine_t	engine,
		void *	threadpool
	)

Creates an execution stream with specified threadpool.

See also: Using oneDNN with Threadpool-based Threading

Parameters

stream	Output execution stream.
engine	Engine to create the execution stream on.
threadpool	Pointer to an instance of a C++ class that implements the dnnl::threadpool_interop::threadpool_iface interface.

Returns: dnnl_success on success and a status describing the error otherwise.

◆ dnnl_threadpool_interop_stream_get_threadpool()

dnnl_status_t DNNL_API dnnl_threadpool_interop_stream_get_threadpool	(	dnnl_stream_t	astream,
		void **	threadpool
	)

Returns a threadpool to be used by the execution stream.

See also: Using oneDNN with Threadpool-based Threading

Parameters

astream	Execution stream.
threadpool	Output pointer to an instance of a C++ class that implements the dnnl::threadpool_interop::threadpool_iface interface. Set to NULL if the stream was created without a threadpool.

Returns: dnnl_success on success and a status describing the error otherwise.

◆ dnnl_threadpool_interop_sgemm()

dnnl_status_t DNNL_API dnnl_threadpool_interop_sgemm	(	char	transa,
		char	transb,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const float *	A,
		dnnl_dim_t	lda,
		const float *	B,
		dnnl_dim_t	ldb,
		float	beta,
		float *	C,
		dnnl_dim_t	ldc,
		void *	threadpool
	)

Performs single-precision matrix-matrix multiply.

The operation is defined as:

C := alpha * op( A ) * op( B ) + beta * C

where

op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
- op( A ) is an MxK matrix,
- op( B ) is a KxN matrix,
- C is an MxN matrix.

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
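To make the row-major convention and the role of the leading dimensions concrete, the operation above can be sketched as a naive loop nest. This is only an illustrative reference, not how the library computes the result: `ref_sgemm_row_major` is a hypothetical helper, `std::int64_t` stands in for `dnnl_dim_t`, the `bool` return stands in for `dnnl_status_t`, and the threadpool argument is omitted because the sketch is single-threaded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference helper (not part of oneDNN): computes
// C := alpha * op(A) * op(B) + beta * C for row-major matrices.
// Returns false when a transposition flag is not one of N, n, T, t.
bool ref_sgemm_row_major(char transa, char transb, std::int64_t M,
        std::int64_t N, std::int64_t K, float alpha, const float *A,
        std::int64_t lda, const float *B, std::int64_t ldb, float beta,
        float *C, std::int64_t ldc) {
    const bool ta = (transa == 'T' || transa == 't');
    const bool tb = (transb == 'T' || transb == 't');
    if (!ta && transa != 'N' && transa != 'n') return false;
    if (!tb && transb != 'N' && transb != 'n') return false;

    // In row-major storage the leading dimension is the distance, in
    // elements, between the starts of two consecutive stored rows.
    assert(lda >= (ta ? M : K) && ldb >= (tb ? K : N) && ldc >= N);

    for (std::int64_t i = 0; i < M; ++i) {
        for (std::int64_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::int64_t k = 0; k < K; ++k) {
                // op(A)[i][k] and op(B)[k][j], honoring the flags.
                const float a = ta ? A[k * lda + i] : A[i * lda + k];
                const float b = tb ? B[j * ldb + k] : B[k * ldb + j];
                acc += a * b;
            }
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
    return true;
}
```

With transa set to 'T', the same 2x3 logical matrix is passed as a stored 3x2 array with lda = 2, which is why the index expression swaps `i` and `k`.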

Note: This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.

Parameters

transa	Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
transb	Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
M	The M dimension.
N	The N dimension.
K	The K dimension.
alpha	The alpha parameter that is used to scale the product of matrices A and B.
A	A pointer to the A matrix data.
lda	The leading dimension for the matrix A.
B	A pointer to the B matrix data.
ldb	The leading dimension for the matrix B.
beta	The beta parameter that is used to scale the matrix C.
C	A pointer to the C matrix data.
ldc	The leading dimension for the matrix C.
threadpool	A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).

Returns: dnnl_success/dnnl::status::success on success and a status describing the error otherwise.

◆ dnnl_threadpool_interop_gemm_u8s8s32()

dnnl_status_t DNNL_API dnnl_threadpool_interop_gemm_u8s8s32	(	char	transa,
		char	transb,
		char	offsetc,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const uint8_t *	A,
		dnnl_dim_t	lda,
		uint8_t	ao,
		const int8_t *	B,
		dnnl_dim_t	ldb,
		int8_t	bo,
		float	beta,
		int32_t *	C,
		dnnl_dim_t	ldc,
		const int32_t *	co,
		void *	threadpool
	)

Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
- op( A ) is an MxK matrix,
- op( B ) is a KxN matrix,
- C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix which is defined by the co array of size len:
- if offsetc = F: the len must be at least 1,
- if offsetc = C: the len must be at least max(1, M),
- if offsetc = R: the len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
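The offset terms in the definition above can be easier to follow as code. The following is a minimal single-threaded sketch under stated assumptions, not the library's implementation: `ref_gemm_u8s8s32` is a hypothetical name, `std::int64_t` stands in for `dnnl_dim_t`, the `bool` return stands in for `dnnl_status_t`, only the upper-case `offsetc` values listed above are accepted, the final value is rounded to nearest without any saturation handling, and `co` is indexed by row for 'C' and by column for 'R' to match the required lengths max(1, M) and max(1, N).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical reference helper (not part of oneDNN): computes
// C := alpha * (op(A) - ao) * (op(B) - bo) + beta * C + C_offset
// for row-major matrices. Returns false on an unrecognized flag.
bool ref_gemm_u8s8s32(char transa, char transb, char offsetc,
        std::int64_t M, std::int64_t N, std::int64_t K, float alpha,
        const std::uint8_t *A, std::int64_t lda, std::uint8_t ao,
        const std::int8_t *B, std::int64_t ldb, std::int8_t bo, float beta,
        std::int32_t *C, std::int64_t ldc, const std::int32_t *co) {
    const bool ta = (transa == 'T' || transa == 't');
    const bool tb = (transb == 'T' || transb == 't');
    if (!ta && transa != 'N' && transa != 'n') return false;
    if (!tb && transb != 'N' && transb != 'n') return false;
    if (offsetc != 'F' && offsetc != 'C' && offsetc != 'R') return false;
    assert(A && B && C && co);

    for (std::int64_t i = 0; i < M; ++i) {
        for (std::int64_t j = 0; j < N; ++j) {
            std::int64_t acc = 0; // exact integer accumulation
            for (std::int64_t k = 0; k < K; ++k) {
                const int a = static_cast<int>(
                        ta ? A[k * lda + i] : A[i * lda + k]) - ao;
                const int b = static_cast<int>(
                        tb ? B[j * ldb + k] : B[k * ldb + j]) - bo;
                acc += static_cast<std::int64_t>(a) * b;
            }
            // Assumed indexing: one value for 'F', co[i] (length M) for
            // 'C', co[j] (length N) for 'R'.
            const std::int32_t c_off = (offsetc == 'F') ? co[0]
                    : (offsetc == 'C')                  ? co[i]
                                                        : co[j];
            const double val = static_cast<double>(alpha) * acc
                    + static_cast<double>(beta) * C[i * ldc + j] + c_off;
            // Assumption: round to nearest; no saturation handling here.
            C[i * ldc + j] = static_cast<std::int32_t>(std::nearbyint(val));
        }
    }
    return true;
}
```

Because the accumulation is done in 64-bit integers, this sketch does not reproduce any architecture-specific intermediate saturation.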

Note: This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.

Warning: On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
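The referenced page is not reproduced here, so the following is only a toy model of the kind of effect the warning describes. It assumes, purely for illustration, that a pair of u8 x s8 products is summed into a signed 16-bit intermediate that saturates; the helper names are hypothetical and the sketch does not claim to match any particular architecture or the library's code paths.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Exact sum of two u8 x s8 products, computed in plain int.
int exact_pair_sum(std::uint8_t a0, std::int8_t b0, std::uint8_t a1,
        std::int8_t b1) {
    return static_cast<int>(a0) * b0 + static_cast<int>(a1) * b1;
}

// Hypothetical model of a saturating 16-bit intermediate: the same sum,
// clamped to the int16_t range before being returned.
std::int16_t saturating_pair_sum(std::uint8_t a0, std::int8_t b0,
        std::uint8_t a1, std::int8_t b1) {
    const int exact = exact_pair_sum(a0, b0, a1, b1);
    const int clamped = std::min(32767, std::max(-32768, exact));
    assert(clamped >= -32768 && clamped <= 32767); // safe to narrow
    return static_cast<std::int16_t>(clamped);
}
```

For example, with both A values equal to 255 and both B values equal to 127 the exact sum is 64770, while the clamped 16-bit model yields 32767. Small-magnitude inputs give identical results in both functions.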

Parameters

transa	Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
transb	Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
offsetc	Flag specifying how offsets are applied to matrix C: 'F' means that the same offset is applied to every element of C (co holds at least 1 value), 'C' means that an individual offset is applied to each element within each column (co holds at least max(1, M) values), and 'R' means that an individual offset is applied to each element within each row (co holds at least max(1, N) values).
M	The M dimension.
N	The N dimension.
K	The K dimension.
alpha	The alpha parameter that is used to scale the product of matrices A and B.
A	A pointer to the A matrix data.
lda	The leading dimension for the matrix A.
ao	The offset value for the matrix A.
B	A pointer to the B matrix data.
ldb	The leading dimension for the matrix B.
bo	The offset value for the matrix B.
beta	The beta parameter that is used to scale the matrix C.
C	A pointer to the C matrix data.
ldc	The leading dimension for the matrix C.
co	An array of offset values for the matrix C. The number of elements in the array depends on the value of `offsetc`.
threadpool	A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).

Returns: dnnl_success/dnnl::status::success on success and a status describing the error otherwise.

◆ dnnl_threadpool_interop_gemm_s8s8s32()

dnnl_status_t DNNL_API dnnl_threadpool_interop_gemm_s8s8s32	(	char	transa,
		char	transb,
		char	offsetc,
		dnnl_dim_t	M,
		dnnl_dim_t	N,
		dnnl_dim_t	K,
		float	alpha,
		const int8_t *	A,
		dnnl_dim_t	lda,
		int8_t	ao,
		const int8_t *	B,
		dnnl_dim_t	ldb,
		int8_t	bo,
		float	beta,
		int32_t *	C,
		dnnl_dim_t	ldc,
		const int32_t *	co,
		void *	threadpool
	)

Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.

The operation is defined as:

C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset

where

op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
- op( A ) is an MxK matrix,
- op( B ) is a KxN matrix,
- C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix which is defined by the co array of size len:
- if offsetc = F: the len must be at least 1,
- if offsetc = C: the len must be at least max(1, M),
- if offsetc = R: the len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
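When allocating the co array, the minimum length rules listed above can be captured in a small helper. This is a sketch only: `required_co_length` is a hypothetical function that is not part of the API, `std::int64_t` stands in for `dnnl_dim_t`, and it recognizes just the upper-case flag values named in this section.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of oneDNN): minimum number of elements
// the co array must hold for a given offsetc flag and an MxN matrix C.
// Returns -1 for a flag other than F, C, or R.
std::int64_t required_co_length(char offsetc, std::int64_t M,
        std::int64_t N) {
    assert(M >= 0 && N >= 0);
    switch (offsetc) {
        case 'F': return 1; // one offset shared by all elements
        case 'C': return std::max<std::int64_t>(1, M);
        case 'R': return std::max<std::int64_t>(1, N);
        default: return -1; // unrecognized flag
    }
}
```

For a 4x7 matrix C this gives 1, 4, and 7 elements for 'F', 'C', and 'R' respectively, and the length never drops below 1 even when a dimension is zero.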

Note: This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.

Warning: On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.

Parameters

transa	Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
transb	Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
offsetc	Flag specifying how offsets are applied to matrix C: 'F' means that the same offset is applied to every element of C (co holds at least 1 value), 'C' means that an individual offset is applied to each element within each column (co holds at least max(1, M) values), and 'R' means that an individual offset is applied to each element within each row (co holds at least max(1, N) values).
M	The M dimension.
N	The N dimension.
K	The K dimension.
alpha	The alpha parameter that is used to scale the product of matrices A and B.
A	A pointer to the A matrix data.
lda	The leading dimension for the matrix A.
ao	The offset value for the matrix A.
B	A pointer to the B matrix data.
ldb	The leading dimension for the matrix B.
bo	The offset value for the matrix B.
beta	The beta parameter that is used to scale the matrix C.
C	A pointer to the C matrix data.
ldc	The leading dimension for the matrix C.
co	An array of offset values for the matrix C. The number of elements in the array depends on the value of `offsetc`.
threadpool	A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).

Returns: dnnl_success/dnnl::status::success on success and a status describing the error otherwise.
