namespace dnnl::graph::ocl_interop

Overview

OpenCL interoperability namespace.

namespace ocl_interop {

// global functions

allocator make_allocator(
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    );

engine make_engine_with_allocator(
    cl_device_id device,
    cl_context context,
    const allocator& alloc
    );

engine make_engine_with_allocator(
    cl_device_id device,
    cl_context context,
    const allocator& alloc,
    const std::vector<uint8_t>& cache_blob
    );

cl_event execute(
    compiled_partition& c_partition,
    stream& astream,
    const std::vector<tensor>& inputs,
    std::vector<tensor>& outputs,
    const std::vector<cl_event>& deps = {}
    );

} // namespace ocl_interop

Detailed Documentation

OpenCL interoperability namespace.

Global Functions

allocator make_allocator(
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    )

Constructs an allocator from a pair of OpenCL malloc and free function pointers.

An OpenCL allocator is intended for use with the OpenCL GPU runtime. Currently, only device USM allocators are supported.

Parameters:

ocl_malloc

Pointer to the OpenCL malloc function

ocl_free

Pointer to the OpenCL free function

Returns:

Created allocator
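As an illustration of the callback pair, the sketch below defines an allocation and a deallocation function with the parameter lists that dnnl_graph_ocl_allocate_f and dnnl_graph_ocl_deallocate_f are assumed to have (size, alignment, device, context for allocation; buffer, device, context, event for deallocation). Everything in the sketch namespace is hypothetical: the handle aliases stand in for the OpenCL types so the code builds without OpenCL headers, and host memory replaces device USM, so these functions are not suitable for a real GPU engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

namespace sketch {

// Stand-ins for cl_device_id, cl_context and cl_event so the sketch builds
// without OpenCL headers. Real code takes these types from <CL/cl.h>.
using device_handle = void *;
using context_handle = void *;
using event_handle = void *;

// Assumed shapes of dnnl_graph_ocl_allocate_f and dnnl_graph_ocl_deallocate_f.
using allocate_f = void *(*)(std::size_t size, std::size_t alignment,
        device_handle device, context_handle context);
using deallocate_f = void (*)(void *buf, device_handle device,
        context_handle context, event_handle event);

int live_allocations = 0; // bookkeeping for this sketch only

// Allocation callback. Host memory is used purely for illustration; a real
// callback would return device USM for the given device and context.
void *host_malloc(std::size_t size, std::size_t alignment, device_handle,
        context_handle) {
    if (alignment == 0) alignment = alignof(std::max_align_t);
    // Over-allocate: room to align the result and to stash the raw pointer.
    void *raw = std::malloc(size + alignment + sizeof(void *));
    if (raw == nullptr) return nullptr;
    std::uintptr_t start
            = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    std::uintptr_t aligned = (start + alignment - 1) / alignment * alignment;
    // Keep the raw pointer just below the aligned address for host_free.
    std::memcpy(reinterpret_cast<void *>(aligned - sizeof(void *)), &raw,
            sizeof(void *));
    ++live_allocations;
    return reinterpret_cast<void *>(aligned);
}

// Deallocation callback. The event argument is ignored here; a real callback
// may need to wait on it before releasing the memory (an assumption).
void host_free(void *buf, device_handle, context_handle, event_handle) {
    if (buf == nullptr) return;
    void *raw = nullptr;
    std::memcpy(&raw, static_cast<char *>(buf) - sizeof(void *),
            sizeof(void *));
    std::free(raw);
    --live_allocations;
}

// The pair that plays the role of (ocl_malloc, ocl_free).
constexpr allocate_f ocl_malloc_cb = host_malloc;
constexpr deallocate_f ocl_free_cb = host_free;

} // namespace sketch
```

In real code the two callbacks would use the OpenCL handle types and be passed as make_allocator(ocl_malloc, ocl_free). The signatures shown here are assumptions and should be checked against the typedefs in the installed headers.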

engine make_engine_with_allocator(
    cl_device_id device,
    cl_context context,
    const allocator& alloc
    )

Constructs an engine from an OpenCL device, an OpenCL context, and an allocator.

Parameters:

device

A valid OpenCL device used to construct the engine

context

A valid OpenCL context used to construct the engine

alloc

An allocator to associate with the engine

Returns:

Created engine

engine make_engine_with_allocator(
    cl_device_id device,
    cl_context context,
    const allocator& alloc,
    const std::vector<uint8_t>& cache_blob
    )

Constructs an engine from an OpenCL device, an OpenCL context, an allocator, and a serialized engine cache blob.

Parameters:

device

A valid OpenCL device used to construct the engine

context

A valid OpenCL context used to construct the engine

alloc

An allocator to associate with the engine

cache_blob

Engine cache blob that was serialized beforehand

Returns:

Created engine
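Because cache_blob is a plain std::vector<uint8_t>, it can be persisted between runs with ordinary file I/O. The sketch below shows one possible way to save and reload such a blob. The helper names save_blob and load_blob are hypothetical and not part of the library, and how the blob is produced in the first place is outside the scope of this section.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

namespace sketch {

// Writes the blob bytes to `path`. Returns false if the file cannot be
// opened or the write fails.
inline bool save_blob(
        const std::string &path, const std::vector<std::uint8_t> &blob) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char *>(blob.data()),
            static_cast<std::streamsize>(blob.size()));
    return static_cast<bool>(out);
}

// Reads the whole file back as bytes. Returns an empty vector if the file
// cannot be opened.
inline std::vector<std::uint8_t> load_blob(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    std::istreambuf_iterator<char> first(in), last;
    return std::vector<std::uint8_t>(first, last);
}

} // namespace sketch
```

A caller could then pass the result of load_blob as the cache_blob argument. When the returned vector is empty (for example on a first run), falling back to the overload without cache_blob is a reasonable choice, although that policy is an assumption of this sketch rather than documented behavior.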

cl_event execute(
    compiled_partition& c_partition,
    stream& astream,
    const std::vector<tensor>& inputs,
    std::vector<tensor>& outputs,
    const std::vector<cl_event>& deps = {}
    )

Executes a compiled partition in a specified stream and returns an OpenCL event.

Parameters:

c_partition

Compiled partition to execute.

astream

Stream in which to execute the compiled partition.

inputs

Vector of input tensors.

outputs

Vector of output tensors.

deps

Optional vector with cl_event dependencies.

Returns:

Output event.
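The deps parameter and the returned event make it possible to chain executions: the event returned by one call can be passed as a dependency of the next. The sketch below models only that pattern with stand-ins. The event alias, fake_queue, and run_chain are hypothetical and do not call the library; an integer replaces cl_event so the code builds without OpenCL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Stand-in for cl_event: an integer id. Real code uses cl_event.
using event = int;

// Records, for every submitted step, the events that step waits on.
struct fake_queue {
    std::vector<std::vector<event>> waits; // waits[i] holds deps of step i

    // Mimics the shape of execute(): accepts optional dependencies and
    // returns a new event. Ids start at 1.
    event submit(const std::vector<event> &deps = {}) {
        waits.push_back(deps);
        return static_cast<event>(waits.size());
    }
};

// Submits `steps` operations back to back. Each step waits on the event
// returned by the previous one. Returns the event of the last step, or 0
// if no step was submitted.
inline event run_chain(fake_queue &q, std::size_t steps) {
    std::vector<event> deps; // empty: the first step has no dependencies
    event last = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        last = q.submit(deps);
        deps = {last}; // the next step depends on this one
    }
    return last;
}

} // namespace sketch
```

With the real function, the same loop body would look roughly like `cl_event e = execute(c_partition, astream, inputs, outputs, deps);` followed by `deps = {e};`. Whether explicit dependencies are needed for a given stream configuration is not stated in this section.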