Level Zero UR Reference Document

This document gives general guidelines on the behavior of the Unified Runtime (UR) Level Zero (L0) adapter for customer use cases, in particular the environment variables that control it.

Environment Variables

| Environment Variable | Description | Possible Values | Default Value |
|---|---|---|---|
| UR_L0_USE_COPY_ENGINE | Controls the use of copy engines. | “0”: Copy engines will not be used. “1”: All available copy engines can be used. “lower_index:upper_index”: Specifies a range of copy engines to be used. | “1” |
| UR_L0_USE_IMMEDIATE_COMMANDLISTS | Determines the mode of immediate command lists. | “0”: Immediate command lists are not used. “1”: Immediate command lists are used per queue. “2”: Immediate command lists are used per thread per queue. | “0” |
| UR_L0_USE_RELAXED_ALLOCATION_LIMITS | Controls the use of relaxed allocation limits. | “0”: Relaxed allocation limits are not used. “1”: Relaxed allocation limits are used. | “0” |
| UR_L0_USE_DRIVER_INORDER_LISTS | Controls the use of in-order lists from the driver. | “0”: In-order lists from the driver are not used. “1”: In-order lists from the driver are used. | “0” |
| UR_L0_USM_ALLOCATOR_TRACE | Enables tracing for the USM allocator. | “0”: Tracing is disabled. “1”: Tracing is enabled. | “0” |
| UR_L0_USM_ALLOCATOR | Configures the USM allocator. | Specifies the configuration for the USM allocator. | All Configs |
| UR_L0_DEBUG_BASIC | Enables basic debugging for Level Zero. | “0”: Debugging is disabled. “1”: Debugging is enabled. | “0” |
| UR_L0_ENABLE_SYSMAN_ENV_DEFAULT | Controls the default SysMan environment initialization. | “1” or unset: Enables SysMan environment initialization. “0”: Disables SysMan environment initialization. | “1” |
| UR_L0_ENABLE_ZESINIT_DEFAULT | Controls the default SysMan initialization with zesInit. | “1”: Enables SysMan initialization with zesInit. “0” or unset: Disables SysMan initialization with zesInit. | “0” |
| SYCL_ENABLE_PCI | Deprecated and no longer needed. | Any value: Triggers a warning message. | None |
| UR_L0_USE_COPY_ENGINE_FOR_FILL | Controls the use of copy engines for memory fill operations. | “0”: Copy engines will not be used for fill operations. “1”: Copy engines will be used for fill operations. | “0” |
| UR_L0_DEBUG | Controls the debug level for Level Zero. | “0”: No debug information. “1”: Basic debug information. “2”: Validation debug information. “-1”: All debug information. | “0” |
| UR_L0_LEAKS_DEBUG | Enables debugging for memory leaks. | “0”: Memory leak debugging is disabled. “1”: Memory leak debugging is enabled. | “0” |
| UR_L0_INIT_ALL_DRIVERS | Controls the initialization of all Level Zero drivers. | “0”: Only currently used drivers are initialized. “1”: All drivers on the system are initialized. | “0” |
| UR_L0_SERIALIZE | Controls serialization of Level Zero calls. | “0”: No locking or blocking. “1”: Locking around each UR_CALL. “2”: Blocking UR calls where supported. | “0” |
| UR_L0_QUEUE_SYNCHRONIZE_NON_BLOCKING | Controls non-blocking synchronization of queues. | “0”: Non-blocking synchronization is disabled. “1”: Non-blocking synchronization is enabled. | “0” |
| UR_L0_OOQ_INTEGRATED_SIGNAL_EVENT | Controls signal events for commands on integrated GPUs. | “0”: Signal events are not created. “1”: Signal events are created. | “0” |
| UR_L0_TRACK_INDIRECT_ACCESS_MEMORY | Enables tracking of indirect access memory. | “0”: Tracking is disabled. “1”: Tracking is enabled. | “0” |
| UR_L0_EXPOSE_CSLICE_IN_AFFINITY_PARTITIONING | Controls exposure of CSlice in affinity partitioning. | “0”: CSlice is not exposed. “1”: CSlice is exposed. | “0” |
| UR_L0_MAX_NUMBER_OF_EVENTS_PER_EVENT_POOL | Sets the maximum number of events per event pool. | Any positive integer: Specifies the maximum number of events per event pool. | 256 |
| UR_L0_COMMANDLISTS_CLEANUP_THRESHOLD | Sets the threshold for command list cleanup. | Any positive integer: Specifies the threshold for cleanup. Negative value: Disables the threshold. | 20 |
| UR_L0_USE_NATIVE_USM_MEMCPY2D | Controls the use of native USM memcpy2D operations. | “0”: Native USM memcpy2D operations are not used. “1”: Native USM memcpy2D operations are used. | “0” |
| UR_L0_ENABLE_USM_HOSTPTR_IMPORT | Enables USM host pointer import. | “0”: USM host pointer import is disabled. “1”: USM host pointer import is enabled. | “0” |
| UR_L0_ENABLE_USM_HOSTPTR_RELEASE | Enables USM host pointer release. | “0”: USM host pointer release is disabled. “1”: USM host pointer release is enabled. | “0” |
| UR_L0_ENABLE_USM_HOST_UNIFIED_MEMORY | Enables USM host unified memory. | “0”: USM host unified memory is disabled. “1”: USM host unified memory is enabled. | “0” |
| UR_L0_USE_MULTIPLE_COMMANDLIST_BARRIERS | Controls the use of multiple command lists for barriers. | “0”: Multiple command lists are not used. “1”: Multiple command lists are used. | “0” |
| UR_L0_IN_ORDER_BARRIER_BY_SIGNAL | Controls whether in-order barriers are implemented by signal. | “0”: Barriers are implemented by a true barrier command. “1”: Barriers are implemented by signal. | “0” |
| UR_L0_DISABLE_EVENTS_CACHING | Controls the caching of events in the context. | “0” or unset: Event caching is enabled. “1”: Event caching is disabled. | “0” |
| UR_L0_REUSE_DISCARDED_EVENTS | Controls the reuse of uncompleted events in in-order queues. | “0”: Reuse of discarded events is disabled. “1” or unset: Reuse of discarded events is enabled. | “1” |
| SYCL_PI_LEVEL_ZERO_FILTER_EVENT_WAIT_LIST | Controls filtering of event wait lists. | “0” or unset: Filtering is disabled. “1”: Filtering is enabled. | “0” |
| UR_L0_DEVICE_SCOPE_EVENTS | Controls the scope of device events. | “0”: All events are host-visible. “1”: On-demand host-visible proxy events. “2”: Last command in batch host-visible. | “0” |
| UR_L0_USE_COPY_ENGINE_FOR_D2D_COPY | Controls the use of copy engines for device-to-device copy operations. | “0”: Copy engines will not be used for D2D copy operations. “1”: Copy engines will be used for D2D copy operations. | “0” |
| UR_L0_BATCH_SIZE | Controls the batch size for command lists. | “0”: Dynamic batch size adjustment. Any positive integer: Specifies the fixed batch size. | “0” |
| UR_L0_COPY_BATCH_SIZE | Controls the batch size for copy command lists. | “0”: Dynamic batch size adjustment. Any positive integer: Specifies the fixed batch size. | “0” |
| UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_MAX | Sets the maximum number of immediate command list batches. | Any positive integer: Specifies the maximum number of batches. | 10 |
| UR_L0_IMMEDIATE_COMMANDLISTS_EVENTS_PER_BATCH | Sets the number of events per batch for immediate command lists. | Any positive integer: Specifies the number of events per batch. | 256 |
| UR_L0_USE_COMPUTE_ENGINE | Controls the use of compute engines. | “0”: Only the first compute engine is used. Any positive integer: Specifies the index of the compute engine to be used. Negative value: All available compute engines may be used. | “0” |
| UR_L0_USM_RESIDENT | Controls memory residency for USM allocations. | “0xHSD”: Specifies residency for host, shared, and device allocations. | 0x2 |
| UR_L0_DISABLE_USM_ALLOCATOR | Controls the use of the USM allocator. | “0”: USM allocator is enabled. Any other value: USM allocator is disabled. | “0” |
| UR_L0_CMD_BUFFER_USE_IMMEDIATE_APPEND_PATH | Controls which command-buffer implementation path is used. The paths rely on different APIs to enqueue command buffers. The immediate append path relies on zeCommandListImmediateAppendCommandListsExp(). The wait-event path relies on zeCommandQueueExecuteCommandLists(). | “1”: The immediate append path is always enabled as long as the prerequisites are met. “0”: The immediate append path is always disabled. Unset: The default behaviour is used, which enables the immediate append path only for some devices when the prerequisites are met. | Unset |
| UR_L0_VECTOR_WIDTH_SIZE | Specifies the size (in bits) of the vector width supported by the Level Zero device. This value indicates the maximum number of data elements that can be processed simultaneously in a single instruction, which is useful for optimizing data-parallel workloads and understanding device capabilities. | Any positive integer: The desired width size to configure. If this width size is not supported, the default “max” will be used. | Device-specific |
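
To make the value formats above concrete, here is a minimal Python sketch that interprets the three documented forms of UR_L0_USE_COPY_ENGINE (“0”, “1”, and “lower_index:upper_index”). It is an illustration only, not the adapter's own parser: the names `CopyEngineSetting` and `parse_copy_engine_setting` are hypothetical, and rejecting any other value with an error is an assumption, since the table does not say how the adapter treats unlisted values.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopyEngineSetting:
    """Hypothetical container for a parsed UR_L0_USE_COPY_ENGINE value."""
    enabled: bool
    lower: Optional[int] = None  # first engine index of a range, if one was given
    upper: Optional[int] = None  # last engine index of a range, if one was given


def parse_copy_engine_setting(value: Optional[str]) -> CopyEngineSetting:
    """Interpret a UR_L0_USE_COPY_ENGINE string as described in the table."""
    # An unset variable falls back to the documented default of "1".
    if value is None:
        value = "1"
    value = value.strip()

    if value == "0":
        return CopyEngineSetting(enabled=False)
    if value == "1":
        return CopyEngineSetting(enabled=True)

    # "lower_index:upper_index" selects a range of copy engines.
    lower_text, sep, upper_text = value.partition(":")
    if sep and lower_text.isdecimal() and upper_text.isdecimal():
        lower, upper = int(lower_text), int(upper_text)
        if lower <= upper:
            return CopyEngineSetting(enabled=True, lower=lower, upper=upper)

    # Assumption: anything else is treated as an error in this sketch.
    raise ValueError(f"unsupported UR_L0_USE_COPY_ENGINE value: {value!r}")


if __name__ == "__main__":
    # Read the variable from the current process environment and show the result.
    print(parse_copy_engine_setting(os.environ.get("UR_L0_USE_COPY_ENGINE")))
```

Running the script with UR_L0_USE_COPY_ENGINE set to a range such as 0:1 reports an enabled setting restricted to engines 0 through 1, while leaving the variable unset reports the default of all available copy engines.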

Contributors