Frequently Asked Questions

General Information

How do I migrate source files that use C++20 or newer standard features on Linux* and Windows*?

On Linux, the default C++ standard for SYCLomatic’s parser is C++17. If you want to enable newer standard features in SYCLomatic, you need to add the --extra-arg="-std=<value>" option to the command line. The supported values are:

  • c++14

  • c++17

  • c++20

  • c++23

  • c++26

On Windows, the default C++ standard for SYCLomatic’s parser is C++17. If you want to enable C++20 features in SYCLomatic, you need to add the --extra-arg="-std=c++20" option to the command line.
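
One way to confirm that the --extra-arg="-std=<value>" option took effect is to have a source file report the standard it is parsed with. The following is only a minimal sketch and is not part of SYCLomatic: the function name standard_name is hypothetical, and the sketch relies solely on the standard __cplusplus macro. Some compilers report a fixed legacy value for this macro unless told otherwise, so treat the result as a hint.

```cpp
// Hypothetical helper (not a SYCLomatic API): maps the standard
// __cplusplus macro to a readable name of the language standard in effect.
const char *standard_name() {
#if __cplusplus >= 202302L
  return "C++23 or newer";
#elif __cplusplus >= 202002L
  return "C++20";
#elif __cplusplus >= 201703L
  return "C++17";
#elif __cplusplus >= 201402L
  return "C++14";
#else
  return "older than C++14";
#endif
}
```

Printing the result of standard_name() from a small test program, or replacing the returns with an #error for unsupported values, shows quickly whether the intended standard is active.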

How do I migrate files on Windows when using a CMake project?

For a CMake project on a Windows OS, you can use CMake to generate Microsoft Visual Studio* project files (vcxproj files). Then choose one of the following options:

  • Migrate the source files on the command line by using the --vcxprojfile option of SYCLomatic.

  • Migrate the entire project in Microsoft Visual Studio with the SYCLomatic Microsoft Visual Studio plugin.

How is the migrated code formatted?

SYCLomatic provides two options to control the format of migrated code: --format-range and --format-style.

If the input source code is well formatted, SYCLomatic formats the resulting code using the default settings of --format-range and --format-style.

If the input source code is not well formatted (for example, the tool detects mixed use of tabs and spaces or mixed indents), choose one of the following approaches:

  • Let SYCLomatic detect the indent size of the original code and apply it to the resulting code. You can guide the tool by setting TabWidth and UseTab in the .clang-format file. Because the input source code is not well formatted, the indents in the resulting code may still be inconsistent.

  • Run SYCLomatic with the --format-range=all option to format the entire resulting file. The change between input source code and resulting source code may be large and make it more difficult to compare the code.

  • Format your input source code, then use SYCLomatic with the same .clang-format file for migration.

Why does the compilation database not contain all source files in the project?

In the project build folder, the command intercept-build make [target] is used to generate the compilation database. The content of the compilation database depends on the optional [target] parameter. To get the list of files that corresponds to the default build target, do not specify the [target] parameter.

Make sure to disable ccache (compiler cache) in your project before using intercept-build. If ccache is enabled, intercept-build cannot generate the complete compilation database as some compile commands may be skipped if the target objects are already available in the cache. Use the following command to disable ccache before running the intercept-build command:

export CCACHE_DISABLE=1

How do I use the migrated module file in the new project?

In the original project, .cu module files are compiled with the -ptx or -cubin options and dynamically loaded into other *.cu files with cuModuleLoad() or cuModuleLoadData().

SYCLomatic migrates module file code in the same way as other *.cu files. In addition, it adds a wrapper function for each function in the module file that has the __global__ attribute.

You can compile the migrated module file into a dynamic library and load the library with a dynamic library API appropriate to your platform. For example:

  • In Linux, load a dynamic library (.so) using dlopen()

  • In Windows, load a dynamic library (.dll) using LoadLibraryA()
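
The two loading paths above can be hidden behind a thin portability layer. The following is a minimal sketch, not code generated by SYCLomatic: the names lib_handle, open_library, find_symbol, and close_library are hypothetical, and the name of the symbol to look up depends on the wrapper functions present in your migrated module file. On some Linux toolchains you may also need to link with -ldl.

```cpp
#include <string>

#if defined(_WIN32)
#include <windows.h>
using lib_handle = HMODULE; // hypothetical alias for the platform handle type
#else
#include <dlfcn.h>
using lib_handle = void *;  // hypothetical alias for the platform handle type
#endif

// Opens a dynamic library; returns a null handle if loading fails.
lib_handle open_library(const std::string &path) {
#if defined(_WIN32)
  return LoadLibraryA(path.c_str());
#else
  return dlopen(path.c_str(), RTLD_NOW);
#endif
}

// Looks up a symbol by name; returns nullptr if the handle is null
// or the symbol cannot be found.
void *find_symbol(lib_handle lib, const char *name) {
  if (!lib)
    return nullptr;
#if defined(_WIN32)
  return reinterpret_cast<void *>(GetProcAddress(lib, name));
#else
  return dlsym(lib, name);
#endif
}

// Releases the library; a null handle is ignored.
void close_library(lib_handle lib) {
  if (!lib)
    return;
#if defined(_WIN32)
  FreeLibrary(lib);
#else
  dlclose(lib);
#endif
}
```

A caller would open the library built from the migrated module file, cast the pointer returned by find_symbol to the wrapper function's type, call it, and close the library when it is no longer needed.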

Is the memory space allocated by sycl::malloc_device, sycl::malloc_host, and dpct::dpct_malloc initialized?

The memory allocated by sycl::malloc_device, sycl::malloc_host, and dpct::dpct_malloc is not initialized. If your program explicitly or implicitly relies on the initial value of newly allocated memory, the program may fail at runtime. Adjust your code to avoid such failures.

For example, the following original code:

// original code

int *device_mem = nullptr;
device_mem = sycl::malloc_device<int>(size, dpct::get_default_queue());
device_mem[0] += somevalue;

is adjusted to initialize the newly allocated memory to 0 before use:

// fixed SYCL code

int *device_mem = nullptr;
device_mem = sycl::malloc_device<int>(size, dpct::get_default_queue());
// memset takes the pointer, the fill value, and the size in bytes
dpct::get_default_queue().memset(device_mem, 0, size * sizeof(int)).wait();
device_mem[0] += somevalue;

How do I migrate CUDA* source code that contains CUB library implementation source code?

If you migrate the CUB library implementation code directly, you may not get the expected results. Instead, exclude CUB library implementation source code from your migration by adding --in-root-exclude=<path to CUB library source code> to your migration command.

How do I fix the issue of SYCL* code hanging due to work group level synchronization, such as a group barrier used in a conditional statement?

If a synchronization API is called inside a control flow statement in SYCL code, such as a conditional statement or a loop statement, you may encounter a runtime hang. The basic idea of the fix is to ensure that each synchronization API is either reached by all work items of a work group or skipped by all work items of that work group.

The following two examples show how to fix the issue.

In the first example, the synchronization API group barrier (nd_item.barrier()) is called inside an if block. The condition evaluates differently across work items, so not all work items reach the synchronization API.

// original code

void kernel(const sycl::nd_item<3> &item_ct1) {
   unsigned int tid = item_ct1.get_local_id(2);
   if (tid < 32) {
      // CODE block 1
      ...
      item_ct1.barrier(sycl::access::fence_space::local_space);
      // CODE block 2
      ...
   }
}

The following code shows how to fix the hang issue by moving the synchronization statement out of the if block.

// fixed SYCL code

void kernel(const sycl::nd_item<3> &item_ct1) {
   unsigned int tid = item_ct1.get_local_id(2);
   if (tid < 32) {
      // CODE block 1
      ...
   }
   item_ct1.barrier(sycl::access::fence_space::local_space);
   if (tid < 32) {
      // CODE block 2
      ...
   }
}

The second example demonstrates how to fix the hang issue when a synchronization API is used in a for loop:

// original code

void compute(int id_space, const sycl::nd_item<3> &item_ct1) {
   unsigned int id = item_ct1.get_group(2) * item_ct1.get_local_range(2) + item_ct1.get_local_id(2);
   for (; id < id_space; id += item_ct1.get_group_range(2) * item_ct1.get_local_range(2)) {
      ...
      item_ct1.barrier();
      ...
   }
}

The following code shows how to fix the hang issue by making sure all work items execute the same number of iterations of the for loop.

// fixed SYCL code

void compute(int id_space, const sycl::nd_item<3> &item_ct1) {
   unsigned int id = item_ct1.get_group(2) * item_ct1.get_local_range(2) + item_ct1.get_local_id(2);
   unsigned int num_workitem = item_ct1.get_group_range(2) * item_ct1.get_local_range(2);
   // The condition is updated to make sure all work items can enter the loop body in each iteration
   for (; id < ((id_space + num_workitem - 1) / num_workitem) * num_workitem;
        id += num_workitem) {
      ...
      item_ct1.barrier();
      ...
   }
}
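
The effect of the rounded-up loop bound can be checked with plain host-side arithmetic. The following is a minimal sketch under stated assumptions: the helper names count_iterations and padded_bound are hypothetical and are not part of SYCL or SYCLomatic. They only model the loop header of the example above.

```cpp
// Hypothetical helper: counts how many times a work item that starts at
// index `id` executes the loop body for an exclusive bound and a stride.
unsigned int count_iterations(unsigned int id, unsigned int bound,
                              unsigned int stride) {
  unsigned int n = 0;
  for (; id < bound; id += stride)
    ++n;
  return n;
}

// Hypothetical helper: rounds id_space up to the next multiple of
// num_workitem, which is the bound used by the fixed loop.
unsigned int padded_bound(unsigned int id_space, unsigned int num_workitem) {
  return ((id_space + num_workitem - 1) / num_workitem) * num_workitem;
}
```

With id_space = 10 and 4 work items in total, the original bound gives work items 0 and 1 three iterations but work items 2 and 3 only two, so they reach the barrier a different number of times. With the padded bound of 12, every work item runs three iterations. Work items whose id is at or beyond id_space now also enter the loop body, so the elided work around the barrier likely needs its own id < id_space guard; that guard is an assumption and is not shown in the example above.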

Troubleshooting Migration

How do I fix an error such as “error: unknown type name” when I migrate files with “dpct --in-root=srcdir --out-root=dstdir *.cu”?

The problem may be caused by files in the *.cu list that are used as header files (included with an #include statement) and are not meant to be parsed as standalone files. In this case, SYCLomatic reports an error if it cannot parse the file because the file depends on definitions or declarations in other files. Use one of the methods below to migrate your content:

  • Rely on SYCLomatic to decide which files to migrate by using compile_commands.json: dpct -p=compile_commands.json --in-root=srcdir --out-root=dstdir

  • Manually pass specific files to migrate, but do not pass files that are included in other files and are not supposed to be compiled as standalone files in the original application. Header files are migrated automatically when they are included by the files provided as input to the tool and are located within the in-root folder: dpct --in-root=srcdir --out-root=dstdir sample.cu

How do I fix a parsing error such as “no member named ‘max’ in namespace ‘std’” or “no member named ‘min’ in namespace ‘std’” when migrating code on Windows?

Use one of the following methods to resolve the error:

  • Add #include <algorithm> to the source file before using std::min and std::max

  • Define the NOMINMAX macro by inserting #define NOMINMAX before including WinDef.h
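
Both methods can be combined in one source file. The following is a minimal sketch: the function clamp_to_range is a hypothetical example, and the sketch assumes the Windows headers are pulled in through windows.h, which in turn includes WinDef.h. On other platforms the guarded block is skipped.

```cpp
// Define NOMINMAX before any Windows header so that the min and max
// macros are not defined and cannot interfere with std::min / std::max.
#if defined(_WIN32)
#define NOMINMAX
#include <windows.h>
#endif

#include <algorithm> // declares std::min and std::max

// Hypothetical helper: limits value to the closed range [lo, hi].
int clamp_to_range(int value, int lo, int hi) {
  return std::max(lo, std::min(value, hi));
}
```

If the project already defines NOMINMAX on the compiler command line, the #define in the source file is redundant and can be dropped.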

How do I fix a compilation error such as “error: dlopen not declared” when I compile code on a Windows machine that was originally migrated on Linux?

When SYCLomatic generates the source code, it uses dynamic loading APIs specific to the OS on which SYCLomatic is running.

For example, dlopen, dlclose, and dlsym are used on Linux and LoadLibraryA, FreeLibrary, and GetProcAddress are used on Windows.

If your code was migrated on an OS that is different from the OS on which you need to compile the generated code, migrate the project again with SYCLomatic on the target OS or fix the code manually.

Why didn’t the “atomic*” APIs get migrated?

SYCLomatic may assume that the “atomic*” APIs are user-defined APIs, in which case they are not migrated.

This can occur in the following scenarios:

  • The CUDA include path is specified by both --cuda-include-path and -I*, but the paths are different

  • The CUDA include path is specified by -I*, but there are other CUDA include files located on the default CUDA install path

To make sure “atomic*” APIs are migrated, don’t use -I* to specify the CUDA include path with the dpct migration command. Instead, use only --cuda-include-path to specify the CUDA include path.

Why did my migration fail with “error: restrict requires a pointer or reference”?

The C++ standard does not support the restrict qualifier and the C standard supports the restrict qualifier only on pointers to an object type.

Based on these language standards SYCLomatic emits the parsing error.

You may need to adjust the source code.
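
A typical adjustment is to keep the qualifier only on pointer parameters. The following is a minimal sketch: the function scale and its parameters are hypothetical, and the sketch assumes the source uses the __restrict__ spelling, which is a compiler extension and not standard C++.

```cpp
#include <cstddef>

// Rejected by the parser, because the qualifier is applied to a
// non-pointer type:
//   void scale(float __restrict__ factor, const float *in, float *out,
//              std::size_t n);

// Accepted: the qualifier is applied only to the pointer parameters.
void scale(float factor, const float *__restrict__ in,
           float *__restrict__ out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = factor * in[i];
}
```

Removing the qualifier from the non-pointer parameter does not change what the function computes; the qualifier is only an aliasing hint for pointers.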

How do I resolve migration failure with “fatal error: ‘cmath’ file not found” in Linux?

The problem stems from a missing include path for the newer standard C++ library headers. SYCLomatic is designed to automatically detect the appropriate version of the C++ header files by checking the compiler package at /usr/lib/gcc/x86_64-linux-gnu and the C++ headers at /usr/include/c++. In the following example, it tries to use C++ header version 12 based on the installed compiler package, but it fails because C++ header version 12 does not exist.

ls /usr/lib/gcc/x86_64-linux-gnu
11 12
ls /usr/include/c++
11

To fix this issue, please install the version 12 g++ package or libstdc++ package.

sudo apt install g++-12
# or
sudo apt install libstdc++-12-dev

If your installation differs, install the missing version of g++-XX or libstdc++-XX-dev based on what you see missing from the results of “ls /usr/lib/gcc/x86_64-linux-gnu” and “ls /usr/include/c++”.

How do I resolve incorrect runtime behavior for dpct::dev_mgr and dpct::mem_mgr in a library project that is loaded more than once in another application?

dpct::dev_mgr and dpct::mem_mgr are singleton classes in the SYCLomatic helper functions. When the helper function headers are used to build an executable project, both dpct::dev_mgr and dpct::mem_mgr have only one instance in the executable. However, when the helper function headers are used to build a library project and the library project is loaded more than once with dlopen() (or LoadLibraryA() for Windows) in an application, more than one instance of dpct::dev_mgr and dpct::mem_mgr is created, which results in incorrect runtime behavior.

For example, both files libA.cpp and libB.cpp include SYCLomatic helper function header dpct.hpp, and they are built into dynamic libraries libA.so and libB.so respectively. If an application main.cpp imports the libraries with dlopen(), there will be two instances of dpct::dev_mgr and dpct::mem_mgr in the runtime of the application.

To resolve this issue, separate the implementation and the declaration of dpct::dev_mgr and dpct::mem_mgr in the SYCLomatic helper functions:

  1. Create a new C++ file dpct_helper.cpp.

  2. Move the implementation of instance() in class dev_mgr from dpct/device.hpp to dpct_helper.cpp.

    For example, the original dpct/device.hpp:

    class dev_mgr {
    public:
      static dev_mgr &instance() { // the implementation and the declaration of dev_mgr::instance
        static dev_mgr d_m;
        return d_m;
      }
      ...
    };
    

    is updated to:

    class dev_mgr {
    public:
      static dev_mgr &instance(); // the declaration of dev_mgr::instance
      ...
    };
    

    and the new dpct_helper.cpp now contains the implementation of dev_mgr::instance():

    #include <dpct/device.hpp>
    dpct::dev_mgr &dpct::dev_mgr::instance() { // the implementation of dev_mgr::instance
      static dev_mgr d_m;
      return d_m;
    }
    
  3. Similar to step two, move the implementation of instance() in the class mem_mgr from dpct/memory.hpp to dpct_helper.cpp.

  4. Build dpct_helper.cpp into a dynamic library libdpct_helper.

    • In Linux:

      dpcpp -g -shared -o libdpct_helper.so -fPIC ./dpct_helper.cpp
      
    • In Windows:

      cl.exe /LD dpct_helper.cpp
      
  5. Add the location of the libdpct_helper library to the library search path environment variable.

    • In Linux: Add the location of libdpct_helper.so into LD_LIBRARY_PATH.

    • In Windows: Add the location of libdpct_helper.dll into PATH.

  6. Dynamically link libdpct_helper when building libraries and applications.

After performing the update steps, all the libraries and applications will share the same instance of the device manager dpct::dev_mgr and the memory manager dpct::mem_mgr in SYCLomatic helper functions.

Why do I get “warning: shift count >= width of type” when I compile migrated code with the Intel® oneAPI DPC++/C++ Compiler?

Shifting a value by a count that is greater than or equal to the width of its type is undefined behavior and may produce different results on different devices when compiled with the Intel® oneAPI DPC++/C++ Compiler. Adjust your code to avoid this type of shift.

For example, the migrated SYCL* code:

// migrated SYCL code

void foo() {
  ...
  unsigned int bit = index[tid] % 32;
  unsigned int val = in[tid] << (32 - bit);
  ...
}

is adjusted to avoid a bit shift that is greater than or equal to the type length:

// fixed SYCL code

void foo() {
  ...
  unsigned int bit = index[tid] % 32;
  unsigned int val;
  if (bit == 0) // the shift count 32 - bit would equal the type width
    val = 0;
  else
    val = in[tid] << (32 - bit);
  ...
}
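
If the same pattern appears in several places, the check can be factored into a small helper. The following is a minimal sketch: the function shift_left_or_zero is hypothetical and is not part of the SYCLomatic helper functions, and returning 0 for an oversized shift count is an assumption that mirrors the fixed code above. Confirm that this is the result your algorithm expects.

```cpp
#include <cstdint>

// Hypothetical helper: shifts a 32-bit value left by `count` bits and
// returns 0 when count is greater than or equal to the type width,
// instead of performing a shift with undefined behavior.
std::uint32_t shift_left_or_zero(std::uint32_t value, unsigned int count) {
  return count >= 32 ? 0u : value << count;
}
```

In the example above, the assignment would then read val = shift_left_or_zero(in[tid], 32 - bit), assuming in[tid] holds a 32-bit unsigned value.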

How do I resolve missing include errors that occur when migrating my code?

Use the option --extra-arg=-v to prompt SYCLomatic to use verbose output, which includes information about which paths the tool searches for includes.

You can provide an additional path to look for includes in one of the following ways:

  • Use the --extra-arg="-I<extra include path>" option in your migration command to specify an additional path for the tool to use when searching for includes during migration.

  • If you are using a compilation database, add the -I<extra include path> option to the compile command in the database for the source files, to specify the include path.