.. index:: pair: group; Threadpool interoperability API
.. _doxidgroup__dnnl__api__threadpool__interop:
Threadpool interoperability API
===============================
.. toctree::
:hidden:
namespace_dnnl_threadpool_interop.rst
Overview
~~~~~~~~
API extensions to interact with the underlying Threadpool runtime. :ref:`More...`
.. refcodeblock:: cpp
:class: doxyrestoverviewcodeblock
// namespaces
namespace :ref:`dnnl::threadpool_interop`;
// global functions
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_stream_create`(
:ref:`dnnl_stream_t`* stream,
:ref:`dnnl_engine_t` engine,
void* threadpool
);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_stream_get_threadpool`(
:ref:`dnnl_stream_t` astream,
void** threadpool
);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_set_max_concurrency`(int max_concurrency);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_get_max_concurrency`(int* max_concurrency);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_sgemm`(
char transa,
char transb,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const float* A,
:ref:`dnnl_dim_t` lda,
const float* B,
:ref:`dnnl_dim_t` ldb,
float beta,
float* C,
:ref:`dnnl_dim_t` ldc,
void* threadpool
);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_gemm_u8s8s32`(
char transa,
char transb,
char offsetc,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const uint8_t* A,
:ref:`dnnl_dim_t` lda,
uint8_t ao,
const int8_t* B,
:ref:`dnnl_dim_t` ldb,
int8_t bo,
float beta,
int32_t* C,
:ref:`dnnl_dim_t` ldc,
const int32_t* co,
void* threadpool
);
:ref:`dnnl_status_t` DNNL_API :ref:`dnnl_threadpool_interop_gemm_s8s8s32`(
char transa,
char transb,
char offsetc,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const int8_t* A,
:ref:`dnnl_dim_t` lda,
int8_t ao,
const int8_t* B,
:ref:`dnnl_dim_t` ldb,
int8_t bo,
float beta,
int32_t* C,
:ref:`dnnl_dim_t` ldc,
const int32_t* co,
void* threadpool
);
.. _detailsgroup__dnnl__api__threadpool__interop:
Detailed Documentation
~~~~~~~~~~~~~~~~~~~~~~
API extensions to interact with the underlying Threadpool runtime.
Global Functions

.. index:: pair: function; dnnl_threadpool_interop_stream_create
.. _doxidgroup__dnnl__api__threadpool__interop_1ga45a92b2adda6ff7a31784d73cbd61c26:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_stream_create(
:ref:`dnnl_stream_t`* stream,
:ref:`dnnl_engine_t` engine,
void* threadpool
)
Creates an execution stream with specified threadpool.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 stream
 Output execution stream.
*
 engine
 Engine to create the execution stream on.
*
 threadpool
 Pointer to an instance of a C++ class that implements dnnl::threapdool_iface interface.
.. rubric:: Returns:
:ref:`dnnl_success ` on success and a status describing the error otherwise.
.. rubric:: See also:
:ref:`Using oneDNN with ThreadpoolBased Threading `
.. index:: pair: function; dnnl_threadpool_interop_stream_get_threadpool
.. _doxidgroup__dnnl__api__threadpool__interop_1ga1dac9e0e17855a5196cb96295277a86d:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_stream_get_threadpool(
:ref:`dnnl_stream_t` astream,
void** threadpool
)
Returns a threadpool to be used by the execution stream.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 astream
 Execution stream.
*
 threadpool
 Output pointer to an instance of a C++ class that implements dnnl::threapdool_iface interface. Set to NULL if the stream was created without threadpool.
.. rubric:: Returns:
:ref:`dnnl_success ` on success and a status describing the error otherwise.
.. rubric:: See also:
:ref:`Using oneDNN with ThreadpoolBased Threading `
.. index:: pair: function; dnnl_threadpool_interop_set_max_concurrency
.. _doxidgroup__dnnl__api__threadpool__interop_1ga39b66c510d4f46fde238fedb4343fa2d:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_set_max_concurrency(int max_concurrency)
Sets the maximum concurrency assumed by oneDNN when outside a parallel call.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 max_concurrency
 The maximum concurrency assumed by oneDNN when outside a parallel call. This is a threadlocal setting.
.. rubric:: Returns:
:ref:`dnnl_success ` on success and a status describing the error otherwise.
.. index:: pair: function; dnnl_threadpool_interop_get_max_concurrency
.. _doxidgroup__dnnl__api__threadpool__interop_1ga4017c3cc0fe9b66a8d50352015d0dbcf:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_get_max_concurrency(int* max_concurrency)
Gets the maximum concurrency assumed by oneDNN when outside a parallel call.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 max_concurrency
 The maximum concurrency assumed by oneDNN when outside a parallel call. This is a threadlocal setting.
.. rubric:: Returns:
:ref:`dnnl_success ` on success and a status describing the error otherwise.
.. index:: pair: function; dnnl_threadpool_interop_sgemm
.. _doxidgroup__dnnl__api__threadpool__interop_1ga2726272c8ce83f4231cc81e326336193:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_sgemm(
char transa,
char transb,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const float* A,
:ref:`dnnl_dim_t` lda,
const float* B,
:ref:`dnnl_dim_t` ldb,
float beta,
float* C,
:ref:`dnnl_dim_t` ldc,
void* threadpool
)
Performs singleprecision matrixmatrix multiply.
The operation is defined as:
``C := alpha * op( A ) * op( B ) + beta * C``
where
* ``op( X ) = X`` or ``op( X ) = X**T``,
* ``alpha`` and ``beta`` are scalars, and
* ``A``, ``B``, and ``C`` are matrices:
* ``op( A )`` is an ``MxK`` matrix,
* ``op( B )`` is an ``KxN`` matrix,
* ``C`` is an ``MxN`` matrix.
The matrices are assumed to be stored in rowmajor order (the elements in each of the matrix rows are contiguous in memory).
.. note::
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 transa
 Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
*
 transb
 Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
*
 M
 The M dimension.
*
 N
 The N dimension.
*
 K
 The K dimension.
*
 alpha
 The alpha parameter that is used to scale the product of matrices A and B.
*
 A
 A pointer to the A matrix data.
*
 lda
 The leading dimension for the matrix A.
*
 B
 A pointer to the B matrix data.
*
 ldb
 The leading dimension for the matrix B.
*
 beta
 The beta parameter that is used to scale the matrix C.
*
 C
 A pointer to the C matrix data.
*
 ldc
 The leading dimension for the matrix C.
*
 threadpool
 A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).
.. rubric:: Returns:
:ref:`dnnl_success ` / :ref:`dnnl::status::success ` on success and a status describing the error otherwise.
.. index:: pair: function; dnnl_threadpool_interop_gemm_u8s8s32
.. _doxidgroup__dnnl__api__threadpool__interop_1gaeb14ed904eaed73cfd01a29bc2a4ac1e:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_gemm_u8s8s32(
char transa,
char transb,
char offsetc,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const uint8_t* A,
:ref:`dnnl_dim_t` lda,
uint8_t ao,
const int8_t* B,
:ref:`dnnl_dim_t` ldb,
int8_t bo,
float beta,
int32_t* C,
:ref:`dnnl_dim_t` ldc,
const int32_t* co,
void* threadpool
)
Performs integer matrixmatrix multiply on 8bit unsigned matrix A, 8bit signed matrix B, and 32bit signed resulting matrix C.
The operation is defined as:
``C := alpha * (op(A)  A_offset) * (op(B)  B_offset) + beta * C + C_offset``
where
* ``op( X ) = X`` or ``op( X ) = X**T``,
* ``alpha`` and ``beta`` are scalars, and
* ``A``, ``B``, and ``C`` are matrices:
* ``op( A )`` is an ``MxK`` matrix,
* ``op( B )`` is an ``KxN`` matrix,
* ``C`` is an ``MxN`` matrix.
* ``A_offset`` is an ``MxK`` matrix with every element equal the ``ao`` value,
* ``B_offset`` is an ``KxN`` matrix with every element equal the ``bo`` value,
* ``C_offset`` is an ``MxN`` matrix which is defined by the ``co`` array of size ``len`` :
* if ``offsetc = F`` : the ``len`` must be at least ``1``,
* if ``offsetc = C`` : the ``len`` must be at least ``max(1, m)``,
* if ``offsetc = R`` : the ``len`` must be at least ``max(1, n)``,
The matrices are assumed to be stored in rowmajor order (the elements in each of the matrix rows are contiguous in memory).
.. note::
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
.. warning::
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to :ref:`Nuances of int8 Computations `.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 transa
 Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
*
 transb
 Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
*
 offsetc

Flag specifying how offsets should be applied to matrix C:
* 'F' means that the same offset will be applied to each element of the matrix C,
* 'C' means that individual offset will be applied to each element within each column,
* 'R' means that individual offset will be applied to each element within each row.
*
 M
 The M dimension.
*
 N
 The N dimension.
*
 K
 The K dimension.
*
 alpha
 The alpha parameter that is used to scale the product of matrices A and B.
*
 A
 A pointer to the A matrix data.
*
 lda
 The leading dimension for the matrix A.
*
 ao
 The offset value for the matrix A.
*
 B
 A pointer to the B matrix data.
*
 ldb
 The leading dimension for the matrix B.
*
 bo
 The offset value for the matrix B.
*
 beta
 The beta parameter that is used to scale the matrix C.
*
 C
 A pointer to the C matrix data.
*
 ldc
 The leading dimension for the matrix C.
*
 co
 An array of offset values for the matrix C. The number of elements in the array depends on the value of ``offsetc``.
*
 threadpool
 A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).
.. rubric:: Returns:
:ref:`dnnl_success ` / :ref:`dnnl::status::success ` on success and a status describing the error otherwise.
.. index:: pair: function; dnnl_threadpool_interop_gemm_s8s8s32
.. _doxidgroup__dnnl__api__threadpool__interop_1ga39dd6bf602ca1ebb2039eb4c27f07fdf:
.. refcodeblock:: cpp
:class: doxyresttitlecodeblock
:ref:`dnnl_status_t` DNNL_API dnnl_threadpool_interop_gemm_s8s8s32(
char transa,
char transb,
char offsetc,
:ref:`dnnl_dim_t` M,
:ref:`dnnl_dim_t` N,
:ref:`dnnl_dim_t` K,
float alpha,
const int8_t* A,
:ref:`dnnl_dim_t` lda,
int8_t ao,
const int8_t* B,
:ref:`dnnl_dim_t` ldb,
int8_t bo,
float beta,
int32_t* C,
:ref:`dnnl_dim_t` ldc,
const int32_t* co,
void* threadpool
)
Performs integer matrixmatrix multiply on 8bit signed matrix A, 8bit signed matrix B, and 32bit signed resulting matrix C.
The operation is defined as:
``C := alpha * (op(A)  A_offset) * (op(B)  B_offset) + beta * C + C_offset``
where
* ``op( X ) = X`` or ``op( X ) = X**T``,
* ``alpha`` and ``beta`` are scalars, and
* ``A``, ``B``, and ``C`` are matrices:
* ``op( A )`` is an ``MxK`` matrix,
* ``op( B )`` is an ``KxN`` matrix,
* ``C`` is an ``MxN`` matrix.
* ``A_offset`` is an ``MxK`` matrix with every element equal the ``ao`` value,
* ``B_offset`` is an ``KxN`` matrix with every element equal the ``bo`` value,
* ``C_offset`` is an ``MxN`` matrix which is defined by the ``co`` array of size ``len`` :
* if ``offsetc = F`` : the ``len`` must be at least ``1``,
* if ``offsetc = C`` : the ``len`` must be at least ``max(1, m)``,
* if ``offsetc = R`` : the ``len`` must be at least ``max(1, n)``,
The matrices are assumed to be stored in rowmajor order (the elements in each of the matrix rows are contiguous in memory).
.. note::
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
.. warning::
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to :ref:`Nuances of int8 Computations `.
.. rubric:: Parameters:
.. listtable::
:widths: 20 80
*
 transa
 Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed.
*
 transb
 Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed.
*
 offsetc

Flag specifying how offsets should be applied to matrix C:
* 'F' means that the same offset will be applied to each element of the matrix C,
* 'C' means that individual offset will be applied to each element within each column,
* 'R' means that individual offset will be applied to each element within each row.
*
 M
 The M dimension.
*
 N
 The N dimension.
*
 K
 The K dimension.
*
 alpha
 The alpha parameter that is used to scale the product of matrices A and B.
*
 A
 A pointer to the A matrix data.
*
 lda
 The leading dimension for the matrix A.
*
 ao
 The offset value for the matrix A.
*
 B
 A pointer to the B matrix data.
*
 ldb
 The leading dimension for the matrix B.
*
 bo
 The offset value for the matrix B.
*
 beta
 The beta parameter that is used to scale the matrix C.
*
 C
 A pointer to the C matrix data.
*
 ldc
 The leading dimension for the matrix C.
*
 co
 An array of offset values for the matrix C. The number of elements in the array depends on the value of ``offsetc``.
*
 threadpool
 A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime).
.. rubric:: Returns:
:ref:`dnnl_success ` / :ref:`dnnl::status::success ` on success and a status describing the error otherwise.