

Experimental features:

  • May be replaced, updated, or removed at any time.

  • Do not require maintaining API/ABI stability of their own additions over time.

  • Do not require conformance testing of their own additions.


A command-buffer represents a series of commands for execution on a command queue. Many adapters support this kind of construct either natively or through extensions, but these constructs are not exposed for direct use. Typically their use is abstracted through the existing Core APIs. For example, when urEnqueueKernelLaunch is called, the adapter may both append the kernel command to a command-buffer-like construct and submit that command-buffer to a queue for execution. These structures allow commands to be batched to improve host launch latency, but without direct control the adapter implementation must batch commands automatically.

This experimental feature exposes command-buffers directly in the Unified Runtime API, giving applications explicit control over how commands are enqueued and executed so that they can batch commands as required for optimal performance.

Querying Command-Buffer Support

Support for command-buffers can be queried for a given device/adapter by using the device info query with UR_DEVICE_INFO_EXTENSIONS. Adapters supporting this experimental feature will report the string “ur_exp_command_buffer” in the returned list of supported extensions.


The macro UR_COMMAND_BUFFER_EXTENSION_STRING_EXP is defined as the string returned from extension queries for this feature. Because the actual string may change, it is safer to use this macro when querying for support for this experimental feature.

// Retrieve length of extension string
size_t returnedSize;
urDeviceGetInfo(hDevice, UR_DEVICE_INFO_EXTENSIONS, 0, nullptr,
                &returnedSize);

// Retrieve extension string
std::unique_ptr<char[]> returnedExtensions(new char[returnedSize]);
urDeviceGetInfo(hDevice, UR_DEVICE_INFO_EXTENSIONS, returnedSize,
                returnedExtensions.get(), nullptr);

// Search the extension string for the command-buffer extension name
std::string_view ExtensionsString(returnedExtensions.get());
bool CmdBufferSupport =
    ExtensionsString.find(UR_COMMAND_BUFFER_EXTENSION_STRING_EXP) !=
    std::string_view::npos;
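
A plain substring search would also report support if the list contained a longer extension name that merely starts with or embeds the queried one. A minimal sketch of an exact-token check follows, assuming the extension list is a space-separated string. hasExtension is a hypothetical helper for illustration and is not part of the Unified Runtime API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if `name` appears as a whole token in
// `extensions`, which is ASSUMED to be a space-separated list of names.
bool hasExtension(std::string_view extensions, std::string_view name) {
  if (name.empty()) {
    return false;
  }
  std::size_t pos = 0;
  while (pos < extensions.size()) {
    // Skip any separators, then locate the end of the current token.
    std::size_t begin = extensions.find_first_not_of(' ', pos);
    if (begin == std::string_view::npos) {
      break;
    }
    std::size_t end = extensions.find(' ', begin);
    if (end == std::string_view::npos) {
      end = extensions.size();
    }
    if (extensions.substr(begin, end - begin) == name) {
      return true;
    }
    pos = end;
  }
  return false;
}
```

With the variables from the query above, this could be called as hasExtension(ExtensionsString, UR_COMMAND_BUFFER_EXTENSION_STRING_EXP).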


The UR_DEVICE_INFO_COMMAND_BUFFER_SUPPORT_EXP device info query exists to serve the same purpose as UR_COMMAND_BUFFER_EXTENSION_STRING_EXP.

Command-Buffer Creation

Command-buffers are tied to a specific ur_context_handle_t and ur_device_handle_t. urCommandBufferCreateExp optionally takes a descriptor to provide additional properties for how the command-buffer should be constructed. The members defined in ur_exp_command_buffer_desc_t are:

  • isUpdatable, which should be set to true to support updating command-buffer commands.

  • isInOrder, which should be set to true to enable commands enqueued to a command-buffer to be executed in an in-order fashion where possible.

  • enableProfiling, which should be set to true to enable profiling of the command-buffer.

Command-buffers are reference counted and can be retained and released by calling urCommandBufferRetainExp and urCommandBufferReleaseExp respectively.
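
Because every reference must eventually be released, it can help to tie the release call to a scope. The following is a minimal sketch of a generic RAII owner, assuming the release entry-point takes the handle by value and returns a result code. ScopedHandle is a hypothetical name and is not provided by the Unified Runtime.

```cpp
#include <utility>

// Hypothetical RAII owner: holds one reference to a handle and calls the
// supplied release function once when the owner is destroyed.
template <typename Handle, typename Result>
class ScopedHandle {
public:
  ScopedHandle(Handle handle, Result (*release)(Handle))
      : handle_(handle), release_(release) {}

  // Copying would release the same reference twice, so it is disabled.
  ScopedHandle(const ScopedHandle &) = delete;
  ScopedHandle &operator=(const ScopedHandle &) = delete;

  // Moving transfers ownership and leaves the source empty.
  ScopedHandle(ScopedHandle &&other) noexcept
      : handle_(std::exchange(other.handle_, Handle{})),
        release_(other.release_) {}

  ~ScopedHandle() {
    if (handle_ != Handle{}) {
      release_(handle_);
    }
  }

  Handle get() const { return handle_; }

private:
  Handle handle_;
  Result (*release_)(Handle);
};
```

Under those assumptions, a command-buffer could be owned with ScopedHandle<ur_exp_command_buffer_handle_t, ur_result_t> owner(hCommandBuffer, urCommandBufferReleaseExp). The result of the release call is discarded in the destructor, which may not suit code that needs to report release failures.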

Appending Commands

Commands can be appended to a command-buffer by calling any of the command-buffer append functions. Typically these closely mimic the existing enqueue functions in the Core API in terms of their command-specific parameters. However, they differ in that they take a command-buffer handle instead of a queue handle, and the dependencies and return parameters are sync-points instead of event handles.

The entry-point for appending a kernel launch command also returns an optional handle to the command being appended. This handle can be used to update the command configuration between command-buffer executions; see the section on updating command-buffer commands.

Currently only the following commands are supported:

It is planned to eventually support any command type from the Core API which can actually be appended to the equivalent adapter native constructs.


A sync-point is a value that represents a command inside a command-buffer and is returned from command-buffer append function calls. Sync-points can optionally be passed to these functions to define execution dependencies on other commands within the command-buffer. Sync-points passed to these functions may be ignored if the command-buffer was created in-order.

Sync-points are unique and valid for use only within the command-buffer they were obtained from.

// Append a memcpy with no sync-point dependencies
ur_exp_command_buffer_sync_point_t syncPoint;

urCommandBufferAppendUSMMemcpyExp(hCommandBuffer, pDst, pSrc, size, 0,
                                  nullptr, &syncPoint);

// Append a kernel launch with syncPoint as a dependency, ignore returned
// sync-point
urCommandBufferAppendKernelLaunchExp(hCommandBuffer, hKernel, workDim,
                                     pGlobalWorkOffset, pGlobalWorkSize,
                                     pLocalWorkSize, 1, &syncPoint,
                                     nullptr, nullptr);
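
The dependency rules for sync-points can be illustrated with a small toy model. This is a sketch only: ToyCommandBuffer and its members are hypothetical, sync-points are modelled as indices into the buffer's own command list, and nothing here reflects how an adapter implements command-buffers.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy model: a sync-point is the index of a command in its own buffer.
using SyncPoint = std::size_t;

struct ToyCommand {
  std::string name;
  std::vector<SyncPoint> waitList;
};

class ToyCommandBuffer {
public:
  // Records a command and returns its sync-point. Every dependency must
  // index a command that was already recorded in this buffer.
  SyncPoint append(std::string name, std::vector<SyncPoint> waitList = {}) {
    for (SyncPoint dep : waitList) {
      if (dep >= commands_.size()) {
        throw std::invalid_argument("unknown sync-point");
      }
    }
    commands_.push_back({std::move(name), std::move(waitList)});
    return commands_.size() - 1;
  }

  // True if `later` depends on `earlier`, directly or transitively.
  // Terminates because a wait list only refers to earlier commands.
  bool dependsOn(SyncPoint later, SyncPoint earlier) const {
    for (SyncPoint dep : commands_.at(later).waitList) {
      if (dep == earlier || dependsOn(dep, earlier)) {
        return true;
      }
    }
    return false;
  }

private:
  std::vector<ToyCommand> commands_;
};
```

Because a command can only wait on sync-points returned by earlier append calls, the recorded dependencies always form an acyclic graph. The model only rejects out-of-range values; it cannot detect a sync-point taken from a different buffer that happens to be in range.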

Enqueueing Command-Buffers

Command-buffers are submitted for execution on a ur_queue_handle_t with an optional list of dependent events. An event is returned which tracks the execution of the command-buffer, and will be complete when all appended commands have finished executing. It is adapter specific whether command-buffers can be enqueued or executed simultaneously, and submissions may be serialized.

ur_event_handle_t executionEvent;

urCommandBufferEnqueueExp(hCommandBuffer, hQueue, 0, nullptr,
                          &executionEvent);

Updating Command-Buffer Commands

An adapter implementing the command-buffer experimental feature can optionally support updating the configuration of kernel commands recorded to a command-buffer. Support for this is reported by returning true in the UR_DEVICE_INFO_COMMAND_BUFFER_UPDATE_SUPPORT_EXP query.

Updating kernel commands is done by passing the new kernel configuration to urCommandBufferUpdateKernelLaunchExp along with the command handle of the kernel command to update. Configurations that can be changed are the parameters to the kernel and the execution ND-Range.

// Create a command-buffer with update enabled.
ur_exp_command_buffer_desc_t desc {
  UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_DESC, // stype
  nullptr, // pNext
  true // isUpdatable
};
ur_exp_command_buffer_handle_t hCommandBuffer;
urCommandBufferCreateExp(hContext, hDevice, &desc, &hCommandBuffer);

// Append a kernel command which has two buffer parameters, an input
// and an output.
ur_exp_command_buffer_command_handle_t hCommand;
urCommandBufferAppendKernelLaunchExp(hCommandBuffer, hKernel, workDim,
                                     pGlobalWorkOffset, pGlobalWorkSize,
                                     pLocalWorkSize, 0, nullptr,
                                     nullptr, &hCommand);

// Close the command-buffer before updating
urCommandBufferFinalizeExp(hCommandBuffer);

// Define kernel argument at index 0 to be a new input buffer object
ur_exp_command_buffer_update_memobj_arg_desc_t newInputArg {
    UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_UPDATE_MEMOBJ_ARG_DESC, // stype
    nullptr, // pNext
    0, // argIndex
    nullptr, // pProperties
    newInputBuffer, // hNewMemObjArg
};

// Define kernel argument at index 1 to be a new output buffer object
ur_exp_command_buffer_update_memobj_arg_desc_t newOutputArg {
    UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_UPDATE_MEMOBJ_ARG_DESC, // stype
    nullptr, // pNext
    1, // argIndex
    nullptr, // pProperties
    newOutputBuffer, // hNewMemObjArg
};

// Define the new configuration of the kernel command
ur_exp_command_buffer_update_memobj_arg_desc_t updatedArgs[2] = {
    newInputArg, newOutputArg};
ur_exp_command_buffer_update_kernel_launch_desc_t update {
    UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_UPDATE_KERNEL_LAUNCH_DESC, // stype
    nullptr, // pNext
    2, // numNewMemObjArgs
    0, // numNewPointerArgs
    0, // numNewValueArgs
    0, // numNewExecInfos
    0, // newWorkDim
    updatedArgs, // pNewMemObjArgList
    nullptr, // pNewPointerArgList
    nullptr, // pNewValueArgList
    nullptr, // pNewExecInfoList
    nullptr, // pNewGlobalWorkOffset
    nullptr, // pNewGlobalWorkSize
    nullptr, // pNewLocalWorkSize
};

// Perform the update
urCommandBufferUpdateKernelLaunchExp(hCommand, &update);
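
The ordering constraints in the example above (create as updatable, record, close, then update between executions) can be sketched as a small state machine. This is a toy model with hypothetical names: a "kernel" is reduced to a single integer argument, and the rules that appending stops once the buffer is closed and that execution requires a closed buffer are assumptions of the model rather than statements taken from this section.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy model only; names are hypothetical, not Unified Runtime API.
class ToyUpdatableBuffer {
public:
  explicit ToyUpdatableBuffer(bool isUpdatable) : isUpdatable_(isUpdatable) {}

  // Records a command; the returned index plays the role of the command
  // handle. ASSUMPTION: recording is rejected once the buffer is closed.
  std::size_t appendKernel(int arg) {
    if (finalized_) {
      throw std::logic_error("cannot append after finalize");
    }
    args_.push_back(arg);
    return args_.size() - 1;
  }

  void finalize() { finalized_ = true; }

  // Updates are accepted only on a closed buffer created as updatable.
  void updateKernel(std::size_t command, int newArg) {
    if (!isUpdatable_) {
      throw std::logic_error("buffer was not created updatable");
    }
    if (!finalized_) {
      throw std::logic_error("finalize before updating");
    }
    args_.at(command) = newArg;
  }

  // "Executes" the recorded commands by summing their current arguments.
  // ASSUMPTION: execution is rejected until the buffer is closed.
  int execute() const {
    if (!finalized_) {
      throw std::logic_error("finalize before executing");
    }
    int sum = 0;
    for (int arg : args_) {
      sum += arg;
    }
    return sum;
  }

private:
  bool isUpdatable_;
  bool finalized_ = false;
  std::vector<int> args_;
};
```

The point of the model is that the recorded command list is built once and replayed, while an update changes only the arguments of an existing command, so the second execution observes the new configuration without re-recording.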









Revision history:

  • Initial Draft

  • Add function definitions for buffer read and write

  • Add function definitions for fill commands

  • Add function definitions for Prefetch and Advise commands

  • Add function definitions for kernel command update
