Caching of Communication Operations
Communication operations may have an expensive initialization phase (for example, allocation of internal structures and buffers, registration of memory buffers, handshake with peers, and so on). oneCCL amortizes these overheads by caching the internal representations of operations and reusing them on subsequent calls.
To control this, use the operation attribute: set the to_cache field to true and set the match_id field to a unique string (for example, the tensor name).
match_id should be the same for a specific communication operation across all ranks. If the same tensor is part of different communication operations, match_id should have a different value for each of these operations.
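
The effect of to_cache and match_id can be illustrated with a small, self-contained toy model. This is only a sketch and not oneCCL's implementation or API: OpAttr, PreparedOp, OpCache, and the helper functions are hypothetical names invented for the illustration, and the "tensor/operation" naming scheme for match_id is just one possible convention. The model counts how many times the (notionally expensive) initialization runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the operation attribute described above.
struct OpAttr {
    bool to_cache = false;
    std::string match_id;
};

// Hypothetical stand-in for an operation's prepared internal state.
struct PreparedOp {
    std::string kind;
    std::size_t count;
};

// Toy cache keyed by match_id; counts how often initialization runs.
class OpCache {
public:
    const PreparedOp& get(const std::string& kind, std::size_t count,
                          const OpAttr& attr) {
        if (!attr.to_cache) {
            // Caching disabled: pay the initialization cost on every call.
            ++init_count_;
            scratch_ = PreparedOp{kind, count};
            return scratch_;
        }
        auto it = cache_.find(attr.match_id);
        if (it == cache_.end()) {
            // First call with this match_id: initialize once and remember it.
            ++init_count_;
            it = cache_.emplace(attr.match_id, PreparedOp{kind, count}).first;
        }
        return it->second;
    }
    int init_count() const { return init_count_; }

private:
    std::unordered_map<std::string, PreparedOp> cache_;
    PreparedOp scratch_{std::string(), 0};
    int init_count_ = 0;
};

// Ten calls of one operation with caching enabled.
int inits_with_caching() {
    OpCache cache;
    OpAttr attr;
    attr.to_cache = true;
    attr.match_id = "grad_w/allreduce";
    for (int i = 0; i < 10; ++i) cache.get("allreduce", 1024, attr);
    return cache.init_count();
}

// Ten calls of one operation with caching left disabled.
int inits_without_caching() {
    OpCache cache;
    OpAttr attr;  // to_cache stays false
    for (int i = 0; i < 10; ++i) cache.get("allreduce", 1024, attr);
    return cache.init_count();
}

// The same tensor used by two operations, each with its own match_id.
int inits_two_ops_distinct_ids() {
    OpCache cache;
    OpAttr reduce_attr, bcast_attr;
    reduce_attr.to_cache = bcast_attr.to_cache = true;
    reduce_attr.match_id = "grad_w/allreduce";
    bcast_attr.match_id = "grad_w/broadcast";
    for (int i = 0; i < 5; ++i) {
        cache.get("allreduce", 1024, reduce_attr);
        cache.get("broadcast", 1024, bcast_attr);
    }
    return cache.init_count();
}

// The same match_id reused for a different operation: in this toy model the
// second lookup hits the entry that was prepared for the first operation.
std::string kind_after_id_collision() {
    OpCache cache;
    OpAttr attr;
    attr.to_cache = true;
    attr.match_id = "grad_w";
    cache.get("allreduce", 1024, attr);
    return cache.get("broadcast", 1024, attr).kind;
}
```

In this model, caching reduces ten initializations to one, two operations on the same tensor with distinct match_id values are initialized once each, and reusing one match_id for two different operations makes the second lookup return the state prepared for the first. The last case is specific to the toy model, but it shows why a key shared between different operations is ambiguous.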