Host Communication
The communication operations between processes are provided by Communicator.
The example below demonstrates the main concepts of communication on host memory buffers.
Example
Consider a simple oneCCL allreduce example for CPU.
Create a communicator object with user-supplied size, rank, and key-value store:
    auto ccl_context = ccl::create_context();
    auto ccl_device = ccl::create_device();

    auto comms = ccl::create_communicators(
        size,
        vector_class<pair_class<size_t, device>>{ { rank, ccl_device } },
        ccl_context,
        kvs);
Alternatively, for convenience, use the non-vector form, which takes no device or context parameters:

    auto comm = ccl::create_communicator(size, rank, kvs);
Initialize send_buf (in a real scenario it is supplied by the user):

    const size_t elem_count = <N>;

    /* initialize send_buf */
    for (size_t idx = 0; idx < elem_count; idx++) {
        send_buf[idx] = rank + 1;
    }
The allreduce invocation reduces the values from all processes and then distributes the result to every process. In this case, the result is an array with elem_count elements, each equal to the sum of the arithmetic progression 1, 2, ..., p, where p is the number of processes:

\[p \cdot (p + 1) / 2\]

    ccl::allreduce(send_buf, recv_buf, elem_count, reduction::sum, comm).wait();
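As a sketch of why this closed form holds, assume ranks are numbered 0 to p - 1 (an assumption consistent with the initialization send_buf[idx] = rank + 1, but not stated explicitly above). Each element of the result is then the sum of the contributions from every rank r:

```latex
\[
\sum_{r=0}^{p-1} (r + 1) = 1 + 2 + \dots + p = \frac{p \cdot (p + 1)}{2}
\]
```

For example, with p = 4 processes every element of recv_buf would be 1 + 2 + 3 + 4 = 10.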
Check the correctness of the allreduce operation:

    auto comm_size = comm.size();
    auto expected = comm_size * (comm_size + 1) / 2;

    for (size_t idx = 0; idx < elem_count; idx++) {
        if (recv_buf[idx] != expected) {
            std::cout << "unexpected value at index " << idx << std::endl;
            break;
        }
    }
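The arithmetic of the whole example can be tried out without a oneCCL installation. The sketch below is a single-process simulation in plain C++: it makes no oneCCL calls, and the helper names simulate_allreduce_sum, expected_sum, and first_mismatch are hypothetical, introduced only for illustration. It assumes ranks are numbered 0 to comm_size - 1 and that buffers hold int values.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: models a sum-allreduce over `comm_size` ranks inside
// one process. It does not call oneCCL; it only reproduces the arithmetic.
std::vector<int> simulate_allreduce_sum(std::size_t comm_size, std::size_t elem_count) {
    std::vector<int> recv_buf(elem_count, 0);
    for (std::size_t rank = 0; rank < comm_size; ++rank) {
        // Each simulated rank fills its send_buf with rank + 1, as in the example.
        std::vector<int> send_buf(elem_count, static_cast<int>(rank) + 1);
        for (std::size_t idx = 0; idx < elem_count; ++idx) {
            recv_buf[idx] += send_buf[idx];
        }
    }
    return recv_buf;
}

// Hypothetical helper: closed form 1 + 2 + ... + p = p * (p + 1) / 2.
int expected_sum(std::size_t comm_size) {
    return static_cast<int>(comm_size * (comm_size + 1) / 2);
}

// Hypothetical helper: returns the index of the first element that differs
// from `expected`, or recv_buf.size() if every element matches.
std::size_t first_mismatch(const std::vector<int>& recv_buf, int expected) {
    for (std::size_t idx = 0; idx < recv_buf.size(); ++idx) {
        if (recv_buf[idx] != expected) {
            return idx;
        }
    }
    return recv_buf.size();
}
```

Calling simulate_allreduce_sum(4, 8) and passing the result to first_mismatch together with expected_sum(4) mirrors the check above: a return value equal to the buffer size means every element matched the expected sum.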