.. _onemkl_blas_dgmm_batch: dgmm_batch ========== Computes a group of ``dgmm`` operations. .. _onemkl_blas_dgmm_batch_description: .. rubric:: Description The ``dgmm_batch`` routines perform multiple diagonal matrix-matrix product operations in a single call. ``dgmm_batch`` supports the following precisions. .. list-table:: :header-rows: 1 * - T * - ``float`` * - ``double`` * - ``std::complex`` * - ``std::complex`` .. _onemkl_blas_dgmm_batch_buffer: dgmm_batch (Buffer Version) --------------------------- .. rubric:: Description The buffer version of ``dgmm_batch`` supports only the strided API. The strided API operation is defined as: :: for i = 0 … batch_size – 1 A and C are matrices at offset i * stridea in a, i * stridec in c. X is a vector at offset i * stridex in x C := diag(X) * A or C = A * diag(X) end for where: ``A`` is a matrix, ``X`` is a diagonal matrix stored as a vector The ``a`` and ``x`` buffers contain all the input matrices. The stride between matrices is given by the stride parameter. The total number of matrices in ``a`` and ``x`` buffers is given by the ``batch_size`` parameter. **Strided API** .. rubric:: Syntax .. code-block:: cpp namespace oneapi::mkl::blas::column_major { void dgmm_batch(sycl::queue &queue, onemkl::mkl::side left_right, std::int64_t m, std::int64_t n, sycl::buffer &a, std::int64_t lda, std::int64_t stridea, sycl::buffer &x, std::int64_t incx, std::int64_t stridex, sycl::buffer &c, std::int64_t ldc, std::int64_t stridec, std::int64_t batch_size) } .. code-block:: cpp namespace oneapi::mkl::blas::row_major { void dgmm_batch(sycl::queue &queue, onemkl::mkl::side left_right, std::int64_t m, std::int64_t n, sycl::buffer &a, std::int64_t lda, std::int64_t stridea, sycl::buffer &x, std::int64_t incx, std::int64_t stridex, sycl::buffer &c, std::int64_t ldc, std::int64_t stridec, std::int64_t batch_size) } .. container:: section .. rubric:: Input Parameters queue The queue where the routine should be executed. left_right Specifies the position of the diagonal matrix in the product. See :ref:`onemkl_datatypes` for more details. m Number of rows of matrices ``A`` and ``C``. Must be at least zero. n Number of columns of matrices ``A`` and ``C``. Must be at least zero. a Buffer holding the input matrices ``A`` with size ``stridea`` * ``batch_size``. Must be of at least ``lda`` * ``j`` + ``stridea`` * (``batch_size`` - 1) where j is n if column major layout is used or m if major layout is used. lda The leading dimension of the matrices ``A``. It must be positive and at least ``m`` if column major layout is used or at least ``n`` if row major layout is used. stridea Stride between different ``A`` matrices. x Buffer holding the input matrices ``X`` with size ``stridex`` * ``batch_size``. Must be of size at least (1 + (``len`` - 1)*abs(``incx``)) + ``stridex`` * (``batch_size`` - 1) where ``len`` is ``n`` if the diagonal matrix is on the right of the product or ``m`` otherwise. incx Stride between two consecutive elements of the ``x`` vectors. stridex Stride between different ``X`` vectors, must be at least 0. c Buffer holding input/output matrices ``C`` with size ``stridec`` * ``batch_size``. ldc The leading dimension of the matrices ``C``. It must be positive and at least ``m`` if column major layout is used to store matrices or at least ``n`` if column major layout is used to store matrices. stridec Stride between different ``C`` matrices. Must be at least ``ldc`` * ``n`` if column major layout is used or ``ldc`` * ``m`` if row major layout is used. batch_size Specifies the number of diagonal matrix-matrix product operations to perform. .. container:: section .. rubric:: Output Parameters c Output overwritten by ``batch_size`` diagonal matrix-matrix product operations. .. _onemkl_blas_dgmm_batch_usm: dgmm_batch (USM Version) --------------------------- .. rubric:: Description The USM version of ``dgmm_batch`` supports the group API and strided API. The group API operation is defined as: :: idx = 0 for i = 0 … group_count – 1 for j = 0 … group_size – 1 a and c are matrices of size mxn at position idx in a_array and c_array x is a vector of size m or n depending on left_right, at position idx in x_array if (left_right == oneapi::mkl::side::left) c := diag(x) * a else c := a * diag(x) idx := idx + 1 end for end for The strided API operation is defined as :: for i = 0 … batch_size – 1 A and C are matrices at offset i * stridea in a, i * stridec in c. X is a vector at offset i * stridex in x C := diag(X) * A or C = A * diag(X) end for where: ``A`` is a matrix, ``X`` is a diagonal matrix stored as a vector The ``a`` and ``x`` buffers contain all the input matrices. The stride between matrices is given by the stride parameter. The total number of matrices in ``a`` and ``x`` buffers is given by the ``batch_size`` parameter. For group API, ``a`` and ``x`` arrays contain the pointers for all the input matrices. The total number of matrices in ``a`` and ``x`` are given by: .. math:: total\_batch\_count = \sum_{i=0}^{group\_count-1}group\_size[i] For strided API, ``a`` and ``x`` arrays contain all the input matrices. The total number of matrices in ``a`` and ``x`` are given by the ``batch_size`` parameter. **Group API** .. rubric:: Syntax .. code-block:: cpp namespace oneapi::mkl::blas::column_major { sycl::event dgmm_batch(sycl::queue &queue, onemkl::mkl::side *left_right, std::int64_t *m, std::int64_t *n, const T **a, std::int64_t *lda, const T **x, std::int64_t *incx, T **c, std::int64_t *ldc, std::int64_t group_count, std::int64_t *group_size, const std::vector &dependencies = {}) } .. code-block:: cpp namespace oneapi::mkl::blas::row_major { sycl::event dgmm_batch(sycl::queue &queue, onemkl::mkl::side *left_right, std::int64_t *m, std::int64_t *n, const T **a, std::int64_t *lda, const T **x, std::int64_t *incx, T **c, std::int64_t *ldc, std::int64_t group_count, std::int64_t *group_size, const std::vector &dependencies = {}) } .. container:: section .. rubric:: Input Parameters queue The queue where the routine should be executed. left_right Specifies the position of the diagonal matrix in the product. See :ref:`onemkl_datatypes` for more details. m Array of ``group_count`` integers. ``m[i]`` specifies the number of rows of ``A`` for every matrix in group ``i``. All entries must be at least zero. n Array of ``group_count`` integers. ``n[i]`` specifies the number of columns of ``A`` for every matrix in group ``i``. All entries must be at least zero. a Array of pointers to input matrices ``A`` with size ``total_batch_count``. Must be of size at least ``lda[i]`` * ``n[i]`` if column major layout is used or at least ``lda[i]`` * ``m[i]`` if row major layout is used. See :ref:`matrix-storage` for more details. lda Array of ``group_count`` integers. ``lda[i]`` specifies the leading dimension of ``A`` for every matrix in group ``i``. All entries must be positive and at least ``m[i]`` if column major layout is used or at least ``n[i]`` if row major layout is used. x Array of pointers to input vectors ``X`` with size ``total_batch_count``. Must be of size at least (1 + ``len[i]`` – 1)*abs(``incx[i]``)) where ``len[i]`` is ``n[i]`` if the diagonal matrix is on the right of the product or ``m[i]`` otherwise. See :ref:`matrix-storage` for more details. incx Array of ``group_count`` integers. ``incx[i]`` specifies the stride of ``x`` for every vector in group ``i``. All entries must be positive. c Array of pointers to input/output matrices ``C`` with size ``total_batch_count``. Must be of size at least ``ldc[i]`` * ``n[i]`` if column major layout is used or at least ``ldc[i]`` * ``m[i]`` if row major layout is used. See :ref:`matrix-storage` for more details. ldc Array of ``group_count`` integers. ``ldc[i]`` specifies the leading dimension of ``C`` for every matrix in group ``i``. All entries must be positive and ``ldc[i]`` must be at least ``m[i]`` if column major layout is used to store matrices or at least ``n[i]`` if row major layout is used to store matrices. group_count Specifies the number of groups. Must be at least 0. group_size Array of ``group_count`` integers. ``group_size[i]`` specifies the number of diagonal matrix-matrix product operations in group ``i``. All entries must be at least 0. dependencies List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies. .. container:: section .. rubric:: Output Parameters c Output overwritten by ``batch_size`` diagonal matrix-matrix product operations. .. container:: section .. rubric:: Return Values Output event to wait on to ensure computation is complete. **Strided API** .. rubric:: Syntax .. code-block:: cpp namespace oneapi::mkl::blas::column_major { sycl::event dgmm_batch(sycl::queue &queue, onemkl::mkl::side left_right, std::int64_t m, std::int64_t n, const T *a, std::int64_t lda, std::int64_t stridea, const T *b, std::int64_t incx, std::int64_t stridex, T *c, std::int64_t ldc, std::int64_t stridec, std::int64_t batch_size, const std::vector &dependencies = {}) } .. code-block:: cpp namespace oneapi::mkl::blas::row_major { sycl::event dgmm_batch(sycl::queue &queue, onemkl::mkl::side left_right, std::int64_t m, std::int64_t n, const T *a, std::int64_t lda, std::int64_t stridea, const T *b, std::int64_t incx, std::int64_t stridex, T *c, std::int64_t ldc, std::int64_t stridec, std::int64_t batch_size, const std::vector &dependencies = {}) } .. container:: section .. rubric:: Input Parameters queue The queue where the routine should be executed. left_right Specifies the position of the diagonal matrix in the product. See :ref:`onemkl_datatypes` for more details. m Number of rows of ``A``. Must be at least zero. n Number of columns of ``A``. Must be at least zero. a Pointer to input matrices ``A`` with size ``stridea`` * ``batch_size``. Must be of size at least ``lda`` * ``k`` + ``stridea`` * (``batch_size`` - 1) where ``k`` is ``n`` if column major layout is used or ``m`` if row major layout is used. lda The leading dimension of the matrices ``A``. It must be positive and at least ``m``. Must be positive and at least ``m`` if column major layout is used or at least ``n`` if row major layout is used. stridea Stride between different ``A`` matrices. x Pointer to input matrices ``X`` with size ``stridex`` * ``batch_size``. Must be of size at least (1 + (``len`` - 1)*abs(``incx``)) + ``stridex`` * (``batch_size`` - 1) where ``len`` is ``n`` if the diagonal matrix is on the right of the product or ``m`` otherwise. incx Stride between two consecutive elements of the ``x`` vector. stridex Stride between different ``X`` vectors, must be at least 0. c Pointer to input/output matrices ``C`` with size ``stridec`` * ``batch_size``. ldc The leading dimension of the matrices ``C``. It must be positive and at least ``ldc`` * ``m`` if column major layout is used to store matrices or at least ``ldc`` * ``n`` if column major layout is used to store matrices. stridec Stride between different ``C`` matrices. Must be at least ``ldc`` * ``n`` if column major layout is used or ``ldc`` * ``m`` if row major layout is used. batch_size Specifies the number of diagonal matrix-matrix product operations to perform. .. container:: section .. rubric:: Output Parameters c Output overwritten by ``batch_size`` diagonal matrix-matrix product operations. .. container:: section .. rubric:: Return Values Output event to wait on to ensure computation is complete. **Parent topic:** :ref:`blas-like-extensions`