Caching of Communication Operations#
Communication operations may have expensive initialization phase (for example, allocation of internal structures and buffers, registration of memory buffers, handshake with peers, and so on). oneCCL amortizes these overheads by caching operation internal representations and reusing them on the subsequent calls.
To control this, use operation attribute and set true
value for to_cache
field and unique string (for example, tensor name) for match_id
field.
Note that:
match_id
should be the same for a specific communication operation across all ranks.If the same tensor is a part of different communication operations,
match_id
should have different values for each of these operations.