OpenCL interoperability API

Overview

API extensions to interact with the underlying OpenCL run-time.

// namespaces

namespace dnnl::graph::ocl_interop;

// typedefs

typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    );

typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    );

// global functions

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    );

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    );

Detailed Documentation

API extensions to interact with the underlying OpenCL run-time.

Typedefs

typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    )

Allocation call-back function interface for OpenCL.

Use the OpenCL allocator with the OpenCL GPU runtime. The call-back must return a pointer to USM device memory.

Parameters:

size

Memory size in bytes for requested allocation

alignment

The minimum alignment in bytes for the requested allocation

device

A valid OpenCL device on which to allocate the memory

context

A valid OpenCL context in which to allocate the memory

Returns:

The memory address of the requested USM allocation.
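
As an illustration of the call-back contract above, the following is a minimal sketch rather than the library's implementation. It compiles without the OpenCL headers, so sketch_device_t and sketch_context_t are hypothetical stand-ins for cl_device_id and cl_context, and host memory from aligned_alloc stands in for USM device memory. A real call-back would instead use a USM allocation routine, for example clDeviceMemAllocINTEL from the cl_intel_unified_shared_memory extension. Treating an alignment of 0 as "no preference" and rejecting non-power-of-two alignments are assumptions of this sketch, not documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for cl_device_id and cl_context so that the
 * sketch builds without <CL/cl.h>. Real code uses the OpenCL types. */
typedef void *sketch_device_t;
typedef void *sketch_context_t;

/* Same parameter order as dnnl_graph_ocl_allocate_f.
 * Host memory stands in for USM device memory here. */
static void *sketch_ocl_malloc(size_t size, size_t alignment,
        sketch_device_t device, sketch_context_t context) {
    (void)device;  /* unused by the host stand-in */
    (void)context; /* unused by the host stand-in */

    if (size == 0) return NULL;

    /* Assumption: 0 means "no alignment preference". */
    if (alignment == 0) alignment = sizeof(void *);
    /* Assumption: only power-of-two alignments are accepted. */
    if ((alignment & (alignment - 1)) != 0) return NULL;
    if (alignment < sizeof(void *)) alignment = sizeof(void *);

    /* aligned_alloc expects the size to be a multiple of the alignment,
     * so round the request up. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
}
```

The returned address satisfies at least the requested alignment, which is the property the alignment parameter describes.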

typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    )

Deallocation call-back function interface for OpenCL.

Use the OpenCL allocator with the OpenCL runtime. The call-back must deallocate USM device memory that was returned by dnnl_graph_ocl_allocate_f. The event must be complete before the USM allocation is released.

Parameters:

buf

The USM allocation to be released

device

A valid OpenCL device that the USM allocation is associated with

context

A valid OpenCL context used to free the USM allocation

event

An event that the USM deallocation depends on
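
The ordering requirement above (wait for the event, then release the memory) can be sketched as follows. This is a hedged illustration under stated assumptions, not the library's implementation: struct sketch_event, sketch_wait_for_event, sketch_device_t, and sketch_context_t are hypothetical stand-ins that let the code build without the OpenCL headers, and free stands in for a USM release. With the real runtime, the wait would typically be a call such as clWaitForEvents and the release a USM free routine such as clMemBlockingFreeINTEL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for cl_event, cl_device_id and cl_context. */
struct sketch_event {
    int completed; /* 1 once the work the event tracks has finished */
    int waits;     /* how many times the event was waited on */
};
typedef struct sketch_event *sketch_event_t;
typedef void *sketch_device_t;
typedef void *sketch_context_t;

/* Stand-in for a blocking wait on an event. */
static void sketch_wait_for_event(sketch_event_t ev) {
    ev->waits++;
    ev->completed = 1;
}

/* Counts releases so that the behavior can be observed. */
static int sketch_frees = 0;

/* Same parameter order as dnnl_graph_ocl_deallocate_f. */
static void sketch_ocl_free(void *buf, sketch_device_t device,
        sketch_context_t context, sketch_event_t event) {
    (void)device;
    (void)context;
    if (buf == NULL) return;

    /* The event must be complete before the memory is released. */
    if (event != NULL && !event->completed) sketch_wait_for_event(event);

    free(buf); /* host stand-in for releasing the USM allocation */
    sketch_frees++;
}
```

Treating a null event as "no dependency" is an assumption of this sketch.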

Global Functions

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    )

Creates an allocator with the given allocation and deallocation call-back function pointers.

Parameters:

allocator

Output allocator

ocl_malloc

A pointer to the OpenCL allocation call-back function (see dnnl_graph_ocl_allocate_f)

ocl_free

A pointer to the OpenCL deallocation call-back function (see dnnl_graph_ocl_deallocate_f)

Returns:

dnnl_success on success and a status describing the error otherwise.

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    )

Creates an engine for the given OpenCL device and context that uses the given allocator. This API supplements the existing oneDNN engine API:

dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create(
    dnnl_engine_t *engine,
    cl_device_id device,
    cl_context context
    );

Parameters:

engine

Output engine.

device

Underlying OpenCL device to use for the engine.

context

Underlying OpenCL context to use for the engine.

alloc

Underlying allocator to use for the engine.

Returns:

dnnl_success on success and a status describing the error otherwise.

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    )

Creates an engine from a cache blob for the given OpenCL device and context, using the given allocator. This API supplements the existing oneDNN engine API:

dnnl_status_t DNNL_API dnnl_ocl_interop_engine_create_from_cache_blob(
    dnnl_engine_t *engine,
    cl_device_id device,
    cl_context context,
    size_t size,
    const uint8_t *cache_blob
    );

Parameters:

engine

Output engine.

device

The OpenCL device that this engine will encapsulate.

context

The OpenCL context (containing the device) that this engine will use for all operations.

alloc

Underlying allocator to use for the engine.

size

Size of the cache blob in bytes.

cache_blob

Pointer to the cache blob; its length in bytes is given by size.

Returns:

dnnl_success on success and a status describing the error otherwise.

dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    )

Executes a compiled partition with the OpenCL runtime.

Parameters:

compiled_partition

The handle of the target compiled partition.

stream

The stream used for execution

num_inputs

The number of input tensors

inputs

A list of input tensors

num_outputs

The number of output tensors

outputs

A non-empty list of output tensors

deps

Optional list of cl_event dependencies.

ndeps

Number of dependencies.

return_event

Output cl_event handle associated with the execution.

Returns:

dnnl_success on success and a status describing the error otherwise.