Enabling OFI/verbs/dmabuf Support#
oneCCL provides experimental support for data transfers between Intel GPU memory and the NIC using Linux dmabuf, which is exposed through the OFI API for the verbs provider.
Requirements#
Linux kernel version >= 5.12
RDMA core version >= 34.0
level-zero-devel package
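The kernel requirement above can be checked with a short script. This is a minimal sketch, not part of oneCCL: `version_ge` is a hypothetical helper name, and it assumes GNU `sort` with the `-V` (version sort) option is available.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Hypothetical helper; relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Strip any distribution suffix such as "-91-generic" from the kernel release.
kernel_version=$(uname -r | cut -d- -f1)

if version_ge "$kernel_version" "5.12"; then
    echo "kernel $kernel_version: OK (>= 5.12)"
else
    echo "kernel $kernel_version: too old (need >= 5.12)"
fi
```

The same helper can compare an RDMA core version string against 34.0, for example `version_ge "$rdma_core_version" 34.0`, where `rdma_core_version` is a placeholder for whatever version your package manager reports.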
Usage#
oneCCL, OFI, and OFI/verbs from the Intel® oneAPI Base Toolkit support device memory transfers. Refer to Run instructions for usage.
If you want to build the software components from source, refer to Build instructions.
Build instructions#
OFI#
git clone --single-branch --branch v1.13.2 https://github.com/ofiwg/libfabric.git
cd libfabric
./autogen.sh
./configure --prefix=<ofi_install_dir> --enable-verbs=<rdma_core_install_dir> --with-ze=<level_zero_install_dir> --enable-ze-dlopen=yes
make -j install
Note
You may also get the OFI release package directly from here. There is no need to run autogen.sh when using the release package.
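After installation, it can be useful to confirm that the build exposes the verbs provider. The sketch below uses the `fi_info` utility that libfabric installs under `bin` in the install prefix. `OFI_INSTALL_DIR` is a hypothetical variable standing in for `<ofi_install_dir>`, and the `/usr` default is only an assumption. A "no" result may also mean that no usable RDMA device is present on the machine, not necessarily that the build lacks verbs support.

```shell
#!/bin/sh
# OFI_INSTALL_DIR is a hypothetical stand-in for <ofi_install_dir>.
OFI_INSTALL_DIR=${OFI_INSTALL_DIR:-/usr}
fi_info_bin="$OFI_INSTALL_DIR/bin/fi_info"

if [ -x "$fi_info_bin" ]; then
    # fi_info -p verbs exits non-zero when the provider reports no endpoints.
    if "$fi_info_bin" -p verbs >/dev/null 2>&1; then
        ofi_verbs="yes"
    else
        ofi_verbs="no"
    fi
else
    ofi_verbs="unknown"
fi
echo "verbs provider visible to fi_info: $ofi_verbs"
```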
oneCCL#
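# Run from a build directory inside the oneCCL source tree (the trailing ".." points at the sources)
cmake -DCMAKE_INSTALL_PREFIX=<ccl_install_dir> -DLIBFABRIC_DIR=<ofi_install_dir> -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DENABLE_OFI_HMEM=1 ..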
make -j install
Run instructions#
Set up the environment as described in the Get Started Guide.
Run the allreduce test with the ring algorithm and SYCL USM device buffers:
export CCL_ATL_TRANSPORT=ofi
export CCL_ATL_HMEM=1
export CCL_ALLREDUCE=ring
export FI_PROVIDER=verbs
mpiexec -n 2 <ccl_install_dir>/examples/sycl/sycl_allreduce_usm_test gpu device
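Before launching, it may help to confirm that the four variables are exported with the intended values, since a missing export can go unnoticed and leave the test running with different settings. The following is a minimal sketch: `check_var` is a hypothetical helper, not part of oneCCL, and it only inspects the current shell environment.

```shell
#!/bin/sh
# check_var NAME EXPECTED: report whether the exported variable NAME
# holds EXPECTED; returns non-zero on a mismatch or when NAME is unset.
check_var() {
    actual=$(printenv "$1" || true)
    if [ "$actual" = "$2" ]; then
        echo "ok: $1=$2"
    else
        echo "mismatch: $1='${actual}' (expected '$2')"
        return 1
    fi
}

env_ok=yes
check_var CCL_ATL_TRANSPORT ofi || env_ok=no
check_var CCL_ATL_HMEM 1 || env_ok=no
check_var CCL_ALLREDUCE ring || env_ok=no
check_var FI_PROVIDER verbs || env_ok=no
echo "environment ready: $env_ok"
```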