DPCT1118#
Message#
SYCL group functions and algorithms must be encountered in converged control flow.
Detailed Help#
SYCL* group functions and algorithms must be encountered in converged control
flow for all work-items in a work-group. If a work-item completes the kernel and
exits early without reaching a collective work-group operation, such as a barrier,
all the other work-items in the work-group that reach the collective operation will
wait indefinitely for the exited work-item.
Refer to "How do I fix the issue of SYCL* code hanging due to work group level synchronization, such as a group barrier used in a conditional statement?" for additional information.
Suggestions to Fix#
For example, this original CUDA* code:
__global__ void kernel(float *data) {
  int tid = threadIdx.x;
  if (tid < 32) {
    if (data[tid] < data[tid + 32]) {
      data[tid] = data[tid + 32];
    }
    __syncthreads();
    ...
  }
}
results in the following migrated SYCL code:
void kernel(float *data, const sycl::nd_item<3> &item_ct1) {
  int tid = item_ct1.get_local_id(2);
  if (tid < 32) {
    if (data[tid] < data[tid + 32]) {
      data[tid] = data[tid + 32];
    }
    /*
    DPCT1118:0: SYCL group functions and algorithms must be encountered in converged control flow. You should check this condition holds.
    */
    /*
    DPCT1065:1: Consider replacing sycl::nd_item::barrier() with sycl::nd_item::barrier(sycl::access::fence_space::local_space) for better performance if there is no access to global memory.
    */
    item_ct1.barrier();
  }
}
which is rewritten to:
void kernel(float *data, const sycl::nd_item<3> &item_ct1) {
  int tid = item_ct1.get_local_id(2);

  if (tid < 32) {
    if (data[tid] < data[tid + 32]) {
      data[tid] = data[tid + 32];
    }
  }
  item_ct1.barrier();
  ...
}