trsm_batch#
Computes a group of trsm
operations.
Description
The trsm_batch
routines are batched versions of trsm, performing
multiple trsm
operations in a single call. Each trsm
solves an equation of the form op(A) * X = alpha * B or X * op(A) = alpha * B.
trsm_batch
supports the following precisions.
T
float
double
std::complex<float>
std::complex<double>
trsm_batch (Buffer Version)#
Description
The buffer version of trsm_batch
supports only the strided API.
The strided API operation is defined as:
for i = 0 … batch_size – 1
A and B are matrices at offset i * stridea and i * strideb in a and b.
if (left_right == onemkl::side::left) then
compute X such that op(A) * X = alpha * B
else
compute X such that X * op(A) = alpha * B
end if
B := X
end for
where:
op(A
) is one of op(A
) = A
, or op(A) = A
T,
or op(A
) = A
H,
alpha
is a scalar,
A
is a triangular matrix,
B
and X
are m
x n
general matrices,
A
is either m
x m
or n
x n
,depending on whether
it multiplies X
on the left or right. On return, the matrix B
is overwritten by the solution matrix X
.
The a
and b
buffers contain all the input matrices. The stride
between matrices is given by the stride parameter. The total number
of matrices in a
and b
buffers are given by the batch_size
parameter.
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
void trsm_batch(sycl::queue &queue,
onemkl::side left_right,
onemkl::uplo upper_lower,
onemkl::transpose trans,
onemkl::diag unit_diag,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &b,
std::int64_t ldb,
std::int64_t strideb,
std::int64_t batch_size)
}
namespace oneapi::mkl::blas::row_major {
void trsm_batch(sycl::queue &queue,
onemkl::side left_right,
onemkl::uplo upper_lower,
onemkl::transpose trans,
onemkl::diag unit_diag,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T,1> &a,
std::int64_t lda,
std::int64_t stridea,
sycl::buffer<T,1> &b,
std::int64_t ldb,
std::int64_t strideb,
std::int64_t batch_size)
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Specifies whether the matrices
A
multiplyX
on the left (side::left
) or on the right (side::right
). See oneMKL Defined Datatypes for more details.- upper_lower
Specifies whether the matrices
A
are upper or lower triangular. See oneMKL Defined Datatypes for more details.- trans
Specifies op(
A
), the transposition operation applied to the matricesA
. See oneMKL Defined Datatypes for more details.- unit_diag
Specifies whether the matrices
A
are assumed to be unit triangular (all diagonal elements are 1). See oneMKL Defined Datatypes for more details.- m
Number of rows of the
B
matrices. Must be at least zero.- n
Number of columns of the
B
matrices. Must be at least zero.- alpha
Scaling factor for the solutions.
- a
Buffer holding the input matrices
A
with sizestridea
*batch_size
.- lda
Leading dimension of the matrices
A
. Must be at leastm
ifleft_right
=side::left
, and at leastn
ifleft_right
=side::right
. Must be positive.- stridea
Stride between different
A
matrices.- b
Buffer holding the input matrices
B
with sizestrideb
*batch_size
.- ldb
Leading dimension of the matrices
B
. It must be positive and at leastm
if column major layout is used to store matrices or at leastn
if row major layout is used to store matrices.- strideb
Stride between different
B
matrices.- batch_size
Specifies the number of triangular linear systems to solve.
Output Parameters
- b
Output buffer, overwritten by
batch_size
solution matricesX
.
Notes
If alpha
= 0, matrix B
is set to zero and the matrices A
and B
do not need to be initialized before calling trsm_batch
.
Description
The USM version of trsm_batch
supports the group API and strided API.
The group API operation is defined as:
idx = 0
for i = 0 … group_count – 1
for j = 0 … group_size – 1
A and B are matrices in a[idx] and b[idx]
if (left_right == onemkl::side::left) then
compute X such that op(A) * X = alpha[i] * B
else
compute X such that X * op(A) = alpha[i] * B
end if
B := X
idx = idx + 1
end for
end for
The strided API operation is defined as:
for i = 0 … batch_size – 1
A and B are matrices at offset i * stridea and i * strideb in a and b.
if (left_right == onemkl::side::left) then
compute X such that op(A) * X = alpha * B
else
compute X such that X * op(A) = alpha * B
end if
B := X
end for
where:
op(A
) is one of op(A
) = A
, or op(A) = A
T,
or op(A
) = A
H,
alpha
is a scalar,
A
is a triangular matrix,
B
and X
are m
x n
general matrices,
A
is either m
x m
or n
x n
,depending on whether
it multiplies X
on the left or right. On return, the matrix B
is overwritten by the solution matrix X
.
For group API, a
and b
arrays contain the pointers for all the input matrices.
The total number of matrices in a
and b
are given by:
For strided API, a
and b
arrays contain all the input matrices. The total number of matrices
in a
and b
are given by the batch_size
parameter.
Group API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event trsm_batch(sycl::queue &queue,
onemkl::side *left_right,
onemkl::uplo *upper_lower,
onemkl::transpose *trans,
onemkl::diag *unit_diag,
std::int64_t *m,
std::int64_t *n,
T *alpha,
const T **a,
std::int64_t *lda,
T **b,
std::int64_t *ldb,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event trsm_batch(sycl::queue &queue,
onemkl::side *left_right,
onemkl::uplo *upper_lower,
onemkl::transpose *trans,
onemkl::diag *unit_diag,
std::int64_t *m,
std::int64_t *n,
T *alpha,
const T **a,
std::int64_t *lda,
T **b,
std::int64_t *ldb,
std::int64_t group_count,
std::int64_t *group_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Array of
group_count
onemkl::side
values.left_right[i]
specifies whetherA
multipliesX
on the left (side::left
) or on the right (side::right
) for everytrsm
operation in groupi
. See oneMKL Defined Datatypes for more details.- upper_lower
Array of
group_count
onemkl::uplo
values.upper_lower[i]
specifies whetherA
is upper or lower triangular for every matrix in groupi
. See oneMKL Defined Datatypes for more details.- trans
Array of
group_count
onemkl::transpose
values.trans[i]
specifies the form of op(A
) used for everytrsm
operation in groupi
. See oneMKL Defined Datatypes for more details.- unit_diag
Array of
group_count
onemkl::diag
values.unit_diag[i]
specifies whetherA
is assumed to be unit triangular (all diagonal elements are 1) for every matrix in groupi
. See oneMKL Defined Datatypes for more details.- m
Array of
group_count
integers.m[i]
specifies the number of rows ofB
for every matrix in groupi
. All entries must be at least zero.- n
Array of
group_count
integers.n[i]
specifies the number of columns ofB
for every matrix in groupi
. All entries must be at least zero.- alpha
Array of
group_count
scalar elements.alpha[i]
specifies the scaling factor in groupi
.- a
Array of pointers to input matrices
A
with sizetotal_batch_count
. See Matrix Storage for more details.- lda
Array of
group_count
integers.lda[i]
specifies the leading dimension ofA
for every matrix in groupi
. All entries must be at leastm
ifleft_right
isside::left
, and at leastn
ifleft_right
isside::right
. All entries must be positive.- b
Array of pointers to input matrices
B
with sizetotal_batch_count
. See Matrix Storage for more details.- ldb
Array of
group_count
integers.ldb[i]
specifies the leading dimension ofB
for every matrix in groupi
. All entries must be positive and at leastm
and positive if column major layout is used to store matrices or at leastn
if row major layout is used to store matrices.- group_count
Specifies the number of groups. Must be at least 0.
- group_size
Array of
group_count
integers.group_size[i]
specifies the number oftrsm
operations in groupi
. All entries must be at least 0.- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- b
Output buffer, overwritten by the
total_batch_count
solution matricesX
.
Notes
If alpha
= 0, matrix B
is set to zero and the matrices A
and B
do not need to be initialized before calling trsm_batch
.
Return Values
Output event to wait on to ensure computation is complete.
Strided API
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event trsm_batch(sycl::queue &queue,
onemkl::side left_right,
onemkl::uplo upper_lower,
onemkl::transpose trans,
onemkl::diag unit_diag,
std::int64_t m,
std::int64_t n,
T alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
T *b,
std::int64_t ldb,
std::int64_t strideb,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
sycl::event trsm_batch(sycl::queue &queue,
onemkl::side left_right,
onemkl::uplo upper_lower,
onemkl::transpose trans,
onemkl::diag unit_diag,
std::int64_t m,
std::int64_t n,
T alpha,
const T *a,
std::int64_t lda,
std::int64_t stridea,
T *b,
std::int64_t ldb,
std::int64_t strideb,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
The queue where the routine should be executed.
- left_right
Specifies whether the matrices
A
multiplyX
on the left (side::left
) or on the right (side::right
). See oneMKL Defined Datatypes for more details.- upper_lower
Specifies whether the matrices
A
are upper or lower triangular. See oneMKL Defined Datatypes for more details.- trans
Specifies op(
A
), the transposition operation applied to the matricesA
. See oneMKL Defined Datatypes for more details.- unit_diag
Specifies whether the matrices
A
are assumed to be unit triangular (all diagonal elements are 1). See oneMKL Defined Datatypes for more details.- m
Number of rows of the
B
matrices. Must be at least zero.- n
Number of columns of the
B
matrices. Must be at least zero.- alpha
Scaling factor for the solutions.
- a
Pointer to input matrices
A
with sizestridea
*batch_size
.- lda
Leading dimension of the matrices
A
. Must be at leastm
ifleft_right
=side::left
, and at leastn
ifleft_right
=side::right
. Must be positive.- stridea
Stride between different
A
matrices.- b
Pointer to input matrices
B
with sizestrideb
*batch_size
.- ldb
Leading dimension of the matrices
B
. It must be positive and at leastm
if column major layout is used to store matrices or at leastn
if row major layout is used to store matrices.- strideb
Stride between different
B
matrices.- batch_size
Specifies the number of triangular linear systems to solve.
Output Parameters
- b
Output matrices, overwritten by
batch_size
solution matricesX
.
Notes
If alpha
= 0, matrix B
is set to zero and the matrices A
and B
do not need to be initialized before calling trsm_batch
.
Return Values
Output event to wait on to ensure computation is complete.
Parent topic: BLAS-like Extensions