DPCT1019
Message
local_mem_size in SYCL is not a complete equivalent of <variable name> in CUDA. You may need to adjust the code.
Detailed Help
In CUDA*, the sharedMemPerBlock property reports the size, in bytes, of the shared memory available per block. The SYCL* equivalent of a CUDA block is a work-group, and the SYCL equivalent of shared memory is local memory. SYCL does not define a limit on the size of the local memory per work-group. Instead, it limits the maximum size, in bytes, of the local memory available per compute unit, which is exposed by the info::device::local_mem_size device descriptor. Because the two values describe different limits, a check written against sharedMemPerBlock may not carry the same meaning after migration.
Suggestions to Fix
Verify the correctness of the code.
For example, this original CUDA code:
void foo() {
  cudaDeviceProp prop;
  cudaGetDeviceProperties(&prop, 0);
  if (prop.sharedMemPerBlock >= threshold) {
    // submit the task
    Code piece A
  } else {
    // change the block size or block number
    Code piece B
  }
}
results in the following migrated SYCL code:
void foo() {
  dpct::device_info prop;
  dpct::dev_mgr::instance().get_device(0).get_device_info(prop);
  /*
  DPCT1019:0: local_mem_size in SYCL is not a complete equivalent of
  sharedMemPerBlock in CUDA. You may need to adjust the code.
  */
  if (prop.get_local_mem_size() >= threshold) {
    // submit the task
    Code piece A
  } else {
    // change the block size or block number
    Code piece B
  }
}
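which, after verifying that the comparison is still valid for local_mem_size, is rewritten to the following (the warning comment is removed):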
void foo() {
  dpct::device_info prop;
  dpct::dev_mgr::instance().get_device(0).get_device_info(prop);
  if (prop.get_local_mem_size() >= threshold) {
    // submit the task
    Code piece A
  } else {
    // change the block size or block number
    Code piece B
  }
}
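The "change the block size" branch (Code piece B) is left open in the example above. One possible way to fill it in is sketched below, assuming the kernel needs a fixed number of local-memory bytes per work-item. The helper name choose_work_group_size and its parameters are hypothetical and are not part of SYCL or the dpct helper headers. It halves a preferred work-group size until the total local-memory request fits within the reported limit.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only (not a SYCL or dpct API).
// Starting from preferred_size, halve the work-group size until the
// local memory it would need (size * bytes_per_work_item) fits within
// local_mem_size. Returns 0 if even one work-item does not fit.
std::size_t choose_work_group_size(std::size_t local_mem_size,
                                   std::size_t bytes_per_work_item,
                                   std::size_t preferred_size) {
  // A kernel that uses no local memory is never constrained by it.
  if (bytes_per_work_item == 0) {
    return preferred_size;
  }
  // Largest number of work-items whose local memory fits in the limit.
  // Dividing here avoids overflow in size * bytes_per_work_item.
  const std::size_t max_fitting = local_mem_size / bytes_per_work_item;
  std::size_t size = preferred_size;
  while (size > max_fitting) {
    size /= 2;
  }
  return size;
}
```

In the migrated code, the value returned by prop.get_local_mem_size() could be passed as local_mem_size. Because that value may describe a per-compute-unit limit rather than a per-work-group limit, treat the result as a starting point and validate it on the target device.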