OpenCL interoperability API
Overview
API extensions to interact with the underlying OpenCL runtime.
// namespaces

namespace dnnl::graph::ocl_interop;

// typedefs

typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    );

typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    );

// global functions

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    );
Detailed Documentation
API extensions to interact with the underlying OpenCL runtime.
Typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)( size_t size, size_t alignment, cl_device_id device, cl_context context )
Allocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL GPU runtime. The call-back should return a pointer to USM device memory.
Parameters:
size | Memory size in bytes for the requested allocation.
alignment | The minimum alignment in bytes for the requested allocation.
device | A valid OpenCL device used for the allocation.
context | A valid OpenCL context used for the allocation.
Returns:
The memory address of the requested USM allocation.
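The shape of such a call-back can be sketched as follows. This is a minimal illustration, not the library's implementation: so that it builds without an OpenCL SDK, it declares stand-in handle types and backs the allocation with C11 aligned_alloc on the host. Both choices are assumptions made for the sketch, as are the name sketch_ocl_malloc and the 64-byte default alignment. A real call-back would include <CL/cl.h> and return USM device memory, for example obtained through the clDeviceMemAllocINTEL entry point of the Intel unified shared memory extension.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in opaque handles so the sketch builds without OpenCL headers.
 * In real code, include <CL/cl.h> and remove these two typedefs. */
typedef struct sketch_device *cl_device_id;
typedef struct sketch_context *cl_context;

/* Hypothetical call-back with the dnnl_graph_ocl_allocate_f signature.
 * Assumption: host memory stands in for a USM device allocation. */
static void *sketch_ocl_malloc(size_t size, size_t alignment,
        cl_device_id device, cl_context context) {
    /* A real call-back would pass both handles to the USM allocator. */
    (void)device;
    (void)context;

    if (size == 0) return NULL;

    /* Assumed policy: treat 0 as "no preference" and use 64 bytes. */
    if (alignment == 0) alignment = 64;
    /* aligned_alloc expects a power-of-two alignment. */
    if ((alignment & (alignment - 1)) != 0) return NULL;
    /* Raising the alignment is allowed because it is only a minimum. */
    if (alignment < sizeof(void *)) alignment = sizeof(void *);

    /* aligned_alloc also expects size to be a multiple of alignment,
     * so round up after guarding against overflow. */
    if (size > SIZE_MAX - (alignment - 1)) return NULL;
    size_t padded = (size + alignment - 1) / alignment * alignment;

    return aligned_alloc(alignment, padded);
}
```

A function with this signature is what dnnl_graph_ocl_interop_allocator_create accepts as its ocl_malloc argument.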
typedef void (*dnnl_graph_ocl_deallocate_f)( void *buf, cl_device_id device, cl_context context, cl_event event )
Deallocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL runtime. The call-back should deallocate USM device memory that was returned by dnnl_graph_ocl_allocate_f. The event must be complete before the USM allocation is released.
Parameters:
buf | The USM allocation to be released.
device | A valid OpenCL device that the USM allocation is associated with.
context | A valid OpenCL context used to free the USM allocation.
event | An event that the USM deallocation depends on.
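The ordering requirement (wait for the event, then release the memory) can be sketched as follows. This is a minimal illustration under stated assumptions: the handle types, the event struct with its completed flag, sketch_wait_for_event, sketch_free_count, and sketch_ocl_free are all hypothetical stand-ins so that the code builds without an OpenCL SDK. The sketch also assumes that a null event means there is nothing to wait for. A real call-back would include <CL/cl.h>, wait with clWaitForEvents, and release the memory with a USM free function such as clMemBlockingFreeINTEL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in handles so the sketch builds without OpenCL headers.
 * In real code, include <CL/cl.h> and remove these declarations. */
typedef struct sketch_device *cl_device_id;
typedef struct sketch_context *cl_context;
typedef struct sketch_event { int completed; } *cl_event;

/* Hypothetical stand-in for clWaitForEvents(1, &event):
 * it simply marks the event as complete. */
static void sketch_wait_for_event(cl_event event) {
    event->completed = 1;
}

/* Bookkeeping for the sketch only: counts released buffers. */
static int sketch_free_count = 0;

/* Hypothetical call-back with the dnnl_graph_ocl_deallocate_f signature.
 * Assumption: buf is host memory obtained from malloc. */
static void sketch_ocl_free(void *buf, cl_device_id device,
        cl_context context, cl_event event) {
    /* A real call-back would pass the context to the USM free call. */
    (void)device;
    (void)context;

    if (buf == NULL) return;

    /* Wait first: work tracked by the event may still be using buf. */
    if (event != NULL) sketch_wait_for_event(event);

    free(buf);
    ++sketch_free_count;
}
```

A function with this signature is what dnnl_graph_ocl_interop_allocator_create accepts as its ocl_free argument.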
Global Functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create( dnnl_graph_allocator_t* allocator, dnnl_graph_ocl_allocate_f ocl_malloc, dnnl_graph_ocl_deallocate_f ocl_free )
Creates an allocator with the given allocation and deallocation call-back function pointers.
Parameters:
allocator | Output allocator.
ocl_malloc | A pointer to the OpenCL allocation call-back (dnnl_graph_ocl_allocate_f).
ocl_free | A pointer to the OpenCL deallocation call-back (dnnl_graph_ocl_deallocate_f).
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator( dnnl_engine_t* engine, cl_device_id device, cl_context context, const_dnnl_graph_allocator_t alloc )
This API supplements the existing oneDNN engine API by additionally taking an allocator: dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create( dnnl_engine_t *engine, cl_device_id device, cl_context context);
Parameters:
engine | Output engine.
device | Underlying OpenCL device to use for the engine.
context | Underlying OpenCL context to use for the engine.
alloc | Underlying allocator to use for the engine.
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator( dnnl_engine_t* engine, cl_device_id device, cl_context context, const_dnnl_graph_allocator_t alloc, size_t size, const uint8_t* cache_blob )
This API supplements the existing oneDNN engine API by additionally taking an allocator: dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create_from_cache_blob( dnnl_engine_t *engine, cl_device_id device, cl_context context, size_t size, const uint8_t *cache_blob);
Parameters:
engine | Output engine.
device | The OpenCL device that this engine will encapsulate.
context | The OpenCL context (containing the device) that this engine will use for all operations.
alloc | Underlying allocator to use for the engine.
size | Size of the cache blob in bytes.
cache_blob | Cache blob; its length in bytes is given by the size parameter.
Returns:
dnnl_success on success and a status describing the error otherwise.
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute( const_dnnl_graph_compiled_partition_t compiled_partition, dnnl_stream_t stream, size_t num_inputs, const_dnnl_graph_tensor_t* inputs, size_t num_outputs, const_dnnl_graph_tensor_t* outputs, const cl_event* deps, int ndeps, cl_event* return_event )
Executes a compiled partition with the OpenCL runtime.
Parameters:
compiled_partition | The handle of the target compiled partition.
stream | The stream used for execution.
num_inputs | The number of input tensors.
inputs | A list of input tensors.
num_outputs | The number of output tensors.
outputs | A non-empty list of output tensors.
deps | Optional list of cl_event dependencies.
ndeps | Number of dependencies.
return_event | The handle of the output cl_event.
Returns:
dnnl_success on success and a status describing the error otherwise.