dgmm_batch#
Computes a group of dgmm
operations.
Description
The dgmm_batch
routines perform
multiple diagonal matrix-matrix product operations in a single call.
dgmm_batch
supports the following precisions.
T
float
double
std::complex<float>
std::complex<double>
dgmm_batch (Buffer Version)#
Description
The buffer version of dgmm_batch
supports only the strided API.
The strided API operation is defined as:
for i = 0 … batch_size – 1
A and C are matrices at offset i * stridea in a, i * stridec in c.
X is a vector at offset i * stridex in x
C := diag(X) * A or C = A * diag(X)
end for
where:
A
is a matrix,
X
is a diagonal matrix stored as a vector
The a
and x
buffers contain all the input matrices. The stride
between matrices is given by the stride parameter. The total number
of matrices in a
and x
buffers is given by the batch_size
parameter.
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
void dgmm_batch(sycl::queue &queue,
onemkl::mkl::side left_right,
std::int64_t m,
std::int64_t n,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &x,
std::int64_t incx,
std::int64_t stridex,
sycl::buffer<T,1> &c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
void dgmm_batch(sycl::queue &queue,
onemkl::mkl::side left_right,
std::int64_t m,
std::int64_t n,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &x,
std::int64_t incx,
std::int64_t stridex,
sycl::buffer<T,1> &c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size)
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Specifies the position of the diagonal matrix in the product. See oneMKL Defined Datatypes for more details.
- m
Number of rows of matrices
A
andC
. Must be at least zero.- n
Number of columns of matrices
A
andC
. Must be at least zero.
a
Buffer holding the input matrices
A
with sizestridea
*batch_size
. Must be of at leastlda
*j
+stridea
* (batch_size
- 1) where j is n if column major layout is used or m if major layout is used.
- lda
The leading dimension of the matrices
A
. It must be positive and at leastm
if column major layout is used or at leastn
if row major layout is used.- stridea
Stride between different
A
matrices.- x
Buffer holding the input matrices
X
with sizestridex
*batch_size
. Must be of size at least (1 + (len
- 1)*abs(incx
)) +stridex
* (batch_size
- 1) wherelen
isn
if the diagonal matrix is on the right of the product orm
otherwise.- incx
Stride between two consecutive elements of the
x
vectors.- stridex
Stride between different
X
vectors, must be at least 0.- c
Buffer holding input/output matrices
C
with sizestridec
*batch_size
.- ldc
The leading dimension of the matrices
C
. It must be positive and at leastm
if column major layout is used to store matrices or at leastn
if column major layout is used to store matrices.- stridec
Stride between different
C
matrices. Must be at leastldc
*n
if column major layout is used orldc
*m
if row major layout is used.- batch_size
Specifies the number of diagonal matrix-matrix product operations to perform.
Output Parameters
- c
Output overwritten by
batch_size
diagonal matrix-matrix product operations.
dgmm_batch (USM Version)#
Description
The USM version of dgmm_batch
supports the group API and strided API.
The group API operation is defined as:
idx = 0
for i = 0 … group_count – 1
for j = 0 … group_size – 1
a and c are matrices of size mxn at position idx in a_array and c_array
x is a vector of size m or n depending on left_right, at position idx in x_array
if (left_right == oneapi::mkl::side::left)
c := diag(x) * a
else
c := a * diag(x)
idx := idx + 1
end for
end for
The strided API operation is defined as
for i = 0 … batch_size – 1
A and C are matrices at offset i * stridea in a, i * stridec in c.
X is a vector at offset i * stridex in x
C := diag(X) * A or C = A * diag(X)
end for
where:
A
is a matrix,
X
is a diagonal matrix stored as a vector
The a
and x
buffers contain all the input matrices. The stride
between matrices is given by the stride parameter. The total number
of matrices in a
and x
buffers is given by the batch_size
parameter.
For group API, a
and x
arrays contain the pointers for all the input matrices.
The total number of matrices in a
and x
are given by:
For strided API, a
and x
arrays contain all the input matrices. The total number of matrices
in a
and x
are given by the batch_size
parameter.
Group API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event dgmm_batch(sycl::queue &queue,
onemkl::mkl::side *left_right,
std::int64_t *m,
std::int64_t *n,
const T **a,
std::int64_t *lda,
const T **x,
std::int64_t *incx,
T **c,
std::int64_t *ldc,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event dgmm_batch(sycl::queue &queue,
onemkl::mkl::side *left_right,
std::int64_t *m,
std::int64_t *n,
const T **a,
std::int64_t *lda,
const T **x,
std::int64_t *incx,
T **c,
std::int64_t *ldc,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Specifies the position of the diagonal matrix in the product. See oneMKL Defined Datatypes for more details.
- m
Array of
group_count
integers.m[i]
specifies the number of rows ofA
for every matrix in groupi
. All entries must be at least zero.- n
Array of
group_count
integers.n[i]
specifies the number of columns ofA
for every matrix in groupi
. All entries must be at least zero.- a
Array of pointers to input matrices
A
with sizetotal_batch_count
. Must be of size at leastlda[i]
*n[i]
if column major layout is used or at leastlda[i]
*m[i]
if row major layout is used. See Matrix Storage for more details.- lda
Array of
group_count
integers.lda[i]
specifies the leading dimension ofA
for every matrix in groupi
. All entries must be positive and at leastm[i]
if column major layout is used or at leastn[i]
if row major layout is used.- x
Array of pointers to input vectors
X
with sizetotal_batch_count
. Must be of size at least (1 +len[i]
– 1)*abs(incx[i]
)) wherelen[i]
isn[i]
if the diagonal matrix is on the right of the product orm[i]
otherwise. See Matrix Storage for more details.- incx
Array of
group_count
integers.incx[i]
specifies the stride ofx
for every vector in groupi
. All entries must be positive.- c
Array of pointers to input/output matrices
C
with sizetotal_batch_count
. Must be of size at leastldc[i]
*n[i]
if column major layout is used or at leastldc[i]
*m[i]
if row major layout is used. See Matrix Storage for more details.- ldc
Array of
group_count
integers.ldc[i]
specifies the leading dimension ofC
for every matrix in groupi
. All entries must be positive andldc[i]
must be at leastm[i]
if column major layout is used to store matrices or at leastn[i]
if row major layout is used to store matrices.- group_count
Specifies the number of groups. Must be at least 0.
- group_size
Array of
group_count
integers.group_size[i]
specifies the number of diagonal matrix-matrix product operations in groupi
. All entries must be at least 0.- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- c
Output overwritten by
batch_size
diagonal matrix-matrix product operations.
Return Values
Output event to wait on to ensure computation is complete.
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event dgmm_batch(sycl::queue &queue,
onemkl::mkl::side left_right,
std::int64_t m,
std::int64_t n,
const T *a,
std::int64_t lda,
std::int64_t stridea,
const T *b,
std::int64_t incx,
std::int64_t stridex,
T *c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event dgmm_batch(sycl::queue &queue,
onemkl::mkl::side left_right,
std::int64_t m,
std::int64_t n,
const T *a,
std::int64_t lda,
std::int64_t stridea,
const T *b,
std::int64_t incx,
std::int64_t stridex,
T *c,
std::int64_t ldc,
std::int64_t stridec,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Specifies the position of the diagonal matrix in the product. See oneMKL Defined Datatypes for more details.
- m
Number of rows of
A
. Must be at least zero.- n
Number of columns of
A
. Must be at least zero.- a
Pointer to input matrices
A
with sizestridea
*batch_size
. Must be of size at leastlda
*k
+stridea
* (batch_size
- 1) wherek
isn
if column major layout is used orm
if row major layout is used.- lda
The leading dimension of the matrices
A
. It must be positive and at leastm
. Must be positive and at leastm
if column major layout is used or at leastn
if row major layout is used.- stridea
Stride between different
A
matrices.- x
Pointer to input matrices
X
with sizestridex
*batch_size
. Must be of size at least (1 + (len
- 1)*abs(incx
)) +stridex
* (batch_size
- 1) wherelen
isn
if the diagonal matrix is on the right of the product orm
otherwise.- incx
Stride between two consecutive elements of the
x
vector.- stridex
Stride between different
X
vectors, must be at least 0.- c
Pointer to input/output matrices
C
with sizestridec
*batch_size
.- ldc
The leading dimension of the matrices
C
. It must be positive and at leastldc
*m
if column major layout is used to store matrices or at leastldc
*n
if column major layout is used to store matrices.- stridec
Stride between different
C
matrices. Must be at leastldc
*n
if column major layout is used orldc
*m
if row major layout is used.- batch_size
Specifies the number of diagonal matrix-matrix product operations to perform.
Output Parameters
- c
Output overwritten by
batch_size
diagonal matrix-matrix product operations.
Return Values
Output event to wait on to ensure computation is complete.
Parent topic: BLAS-like Extensions