DPCT1087
Message
SYCL currently does not support cross group synchronization. You can specify --use-experimental-features=root-group to use the root-group to migrate <synchronization API call>.
Detailed Help
By default, the SYCL root group extension is not used to migrate CUDA* grid-level synchronization. To use the root group extension for this migration, specify --use-experimental-features=root-group in the migration command.
Suggestions to Fix
Specify --use-experimental-features=root-group in the migration command to use the root group extension to migrate CUDA grid-level synchronization.
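A migration command with this option might look like the sketch below. It assumes the tool is invoked as dpct and is available on PATH. The source file name kernel.cu and the output directory migrated are placeholders, and --out-root is included only to choose where the migrated files are written. Only the --use-experimental-features=root-group option is taken from this page.

```shell
# Placeholder names (assumptions, not from this page).
SRC=kernel.cu      # CUDA source file to migrate
OUT=migrated       # directory that receives the migrated SYCL code

if command -v dpct >/dev/null 2>&1; then
  # Enable the experimental root-group feature so that grid-level
  # synchronization is migrated instead of being flagged with DPCT1087.
  dpct --out-root="$OUT" --use-experimental-features=root-group "$SRC"
else
  echo "dpct not found on PATH; set up the tool environment first" >&2
fi
```

If the project was already migrated without the option, rerunning the migration with it should replace the DPCT1087 and DPCT1119 comments with the root group calls.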
For example, this original CUDA* code:
__global__ void kernel() {
  namespace cg = cooperative_groups;
  cg::grid_group grid = cg::this_grid();
  grid.sync();
}

void foo() {
  kernel<<<1, 64>>>();
}
results in the following migrated SYCL code:
void kernel() {

  /*
  DPCT1119:1: Migration of cooperative_groups::__v1::grid_group::this_grid is not
  supported, please try to remigrate with option:
  --use-experimental-features=root-group.
  */
  /*
  DPCT1119:2: Migration of cooperative_groups::__v1::grid_group is not
  supported, please try to remigrate with option:
  --use-experimental-features=root-group.
  */
  cg::grid_group grid = cg::this_grid();
  /*
  DPCT1087:0: SYCL currently does not support cross group synchronization. You
  can specify "--use-experimental-features=root-group" to use the dpct
  helper function nd_range_barrier to migrate grid.sync().
  */
  grid.sync();
}

void foo() {
  dpct::get_in_order_queue().parallel_for(
      sycl::nd_range<3>(sycl::range<3>(1, 1, 64), sycl::range<3>(1, 1, 64)),
      [=](sycl::nd_item<3> item_ct1) {
        kernel();
      });
}
which, after remigrating with --use-experimental-features=root-group, is rewritten to:
void kernel(const sycl::nd_item<3> &item_ct1) {
  // The root group spans every work-item in the nd_range.
  sycl::ext::oneapi::experimental::root_group grid =
      item_ct1.ext_oneapi_get_root_group();
  // Barrier across all work-groups, replacing grid.sync().
  sycl::group_barrier(grid);
}

void foo() {
  // Launch property that requests root-group synchronization support.
  auto exp_props = sycl::ext::oneapi::experimental::properties{
      sycl::ext::oneapi::experimental::use_root_sync};
  dpct::get_in_order_queue().parallel_for(
      sycl::nd_range<3>(sycl::range<3>(1, 1, 64), sycl::range<3>(1, 1, 64)),
      exp_props, [=](sycl::nd_item<3> item_ct1) {
        kernel(item_ct1);
      });
}