==========================
SPIR-V Programming Guide
==========================
Introduction
============
`SPIR-V `__ is an open,
royalty-free, standard intermediate language capable of representing
parallel compute kernels. SPIR-V is adaptable to multiple execution
environments: a SPIR-V module is consumed by an execution environment,
as specified by a client API. This document describes the SPIR-V
execution environment for the 'oneAPI' Level-Zero API. The SPIR-V
execution environment describes required support for some SPIR-V
capabilities, additional semantics for some SPIR-V instructions, and
additional validation rules that a SPIR-V binary module must adhere to
in order to be considered valid.
This document is written for compiler developers who are generating
SPIR-V modules intended to be consumed by the 'oneAPI' Level-Zero API,
for implementors of the 'oneAPI' Level-Zero API, and for software
developers who are using SPIR-V modules with the 'oneAPI' Level-Zero
API.
Common Properties
=================
This section describes common properties of all 'oneAPI' Level-Zero
environments that consume SPIR-V modules.
A SPIR-V module is interpreted as a series of 32-bit words in host
endianness, with literal strings packed as described in the SPIR-V
specification. The first few words of the SPIR-V module must be a magic
number and a SPIR-V version number, as described in the SPIR-V
specification.
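
As a minimal sketch of the header check described above, the following Python helper reads the first two words of a module. The magic number ``0x07230203``, the five-word header, and the version word layout (major in bits 16-23, minor in bits 8-15) come from the SPIR-V specification. The helper name ``read_spirv_header`` is hypothetical, and probing both byte orders is only an illustrative choice, not a description of what a Level-Zero implementation does.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number defined by the SPIR-V specification


def read_spirv_header(blob: bytes):
    """Return (byte_order, major, minor) for a SPIR-V binary (hypothetical helper)."""
    if len(blob) < 20 or len(blob) % 4 != 0:
        raise ValueError("expected a whole number of 32-bit words, at least 5")
    for byte_order in ("<", ">"):  # try little-endian first, then big-endian
        magic, version = struct.unpack_from(byte_order + "II", blob, 0)
        if magic == SPIRV_MAGIC:
            major = (version >> 16) & 0xFF
            minor = (version >> 8) & 0xFF
            return byte_order, major, minor
    raise ValueError("magic number not found; not a SPIR-V module")
```

A module whose magic number only matches in the non-host byte order would need its words swapped before being handed to an environment that expects host endianness.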
Supported SPIR-V Versions
-------------------------
The maximum SPIR-V version supported by a device is described by :ref:`ze-device-module-properties-t`\.spirvVersionSupported.
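
A loader might compare a module's version word against the device limit before attempting to create the module. The sketch below assumes that ``spirvVersionSupported`` packs the major version in the upper 16 bits and the minor version in the lower 16 bits; confirm this encoding against the Level-Zero headers before relying on it. All function names are hypothetical.

```python
def spirv_word_to_pair(version_word: int):
    """Split a SPIR-V header version word (0x00MMmm00) into (major, minor)."""
    return (version_word >> 16) & 0xFF, (version_word >> 8) & 0xFF


def device_version_to_pair(spirv_version_supported: int):
    """Split a device-reported version, ASSUMED to be (major << 16) | minor."""
    return spirv_version_supported >> 16, spirv_version_supported & 0xFFFF


def module_version_is_supported(version_word: int, spirv_version_supported: int) -> bool:
    # Tuples compare lexicographically: major first, then minor.
    module = spirv_word_to_pair(version_word)
    device = device_version_to_pair(spirv_version_supported)
    return module <= device
```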
Extended Instruction Sets
-------------------------
The **OpenCL.std** `extended instruction set for
OpenCL `__ is supported.
Source Language Encoding
------------------------
The source language version is purely informational and has no semantic
meaning.
Numerical Type Formats
----------------------
Floating-point types are represented and stored using
`IEEE-754 `__ semantics.
All integer formats are represented and stored using 2's-complement
format.
Supported Types
---------------
The following types are supported. Note that some types may require
additional capabilities, and may not be supported by all environments.
Basic Scalar and Vector Types
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**OpTypeVoid** is supported.
The following scalar types are supported:
- **OpTypeBool**
- **OpTypeInt**, with *Width* equal to 8, 16, 32, or 64, and with
*Signedness* equal to zero, indicating no signedness semantics.
- **OpTypeFloat**, with *Width* equal to 16, 32, or 64.
**OpTypeVector** vector types are supported. The vector *Component Type*
may be any of the scalar types described above. Supported vector
*Component Counts* are 2, 3, 4, 8, or 16.
**OpTypeArray** array types are supported, **OpTypeStruct** struct types
are supported, **OpTypeFunction** functions are supported, and
**OpTypePointer** pointer types are supported.
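
The scalar and vector rules above can be sketched as a pair of predicates. The string tags ``'bool'``, ``'int'``, and ``'float'`` are hypothetical stand-ins for the corresponding type-declaration opcodes; the widths and component counts are taken directly from the lists above.

```python
SUPPORTED_INT_WIDTHS = {8, 16, 32, 64}
SUPPORTED_FLOAT_WIDTHS = {16, 32, 64}
SUPPORTED_VECTOR_COUNTS = {2, 3, 4, 8, 16}


def is_supported_scalar(kind: str, width: int = 0, signedness: int = 0) -> bool:
    """kind is 'bool', 'int', or 'float' (a stand-in for the SPIR-V opcode)."""
    if kind == "bool":
        return True
    if kind == "int":
        # Signedness must be 0: integers carry no signedness semantics.
        return width in SUPPORTED_INT_WIDTHS and signedness == 0
    if kind == "float":
        return width in SUPPORTED_FLOAT_WIDTHS
    return False


def is_supported_vector(kind: str, width: int, count: int, signedness: int = 0) -> bool:
    """A vector is supported if its count and its component type both are."""
    return count in SUPPORTED_VECTOR_COUNTS and is_supported_scalar(kind, width, signedness)
```

Some of these types additionally depend on capabilities such as **Float16** or **Float64**, which this sketch does not model.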
Image-Related Data Types
~~~~~~~~~~~~~~~~~~~~~~~~
The following table describes the supported **OpTypeImage** image types:
========== ======= ========= =======================
*Dim* *Depth* *Arrayed* **Description**
========== ======= ========= =======================
**1D** ``0`` ``0`` A 1D image.
**1D** ``0`` ``1`` A 1D image array.
**2D** ``0`` ``0`` A 2D image.
**2D** ``1`` ``0`` A 2D depth image.
**2D** ``0`` ``1`` A 2D image array.
**2D** ``1`` ``1`` A 2D depth image array.
**3D** ``0`` ``0`` A 3D image.
**Buffer** ``0`` ``0`` A 1D buffer image.
========== ======= ========= =======================
**OpTypeSampler** sampler types are supported.
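
For tooling that needs to test an **OpTypeImage** declaration against the table above, a lookup keyed on (*Dim*, *Depth*, *Arrayed*) is one simple option. The dictionary below transcribes the table; the function name is hypothetical.

```python
# (Dim, Depth, Arrayed) -> description, transcribed from the table above.
SUPPORTED_IMAGE_TYPES = {
    ("1D", 0, 0): "A 1D image.",
    ("1D", 0, 1): "A 1D image array.",
    ("2D", 0, 0): "A 2D image.",
    ("2D", 1, 0): "A 2D depth image.",
    ("2D", 0, 1): "A 2D image array.",
    ("2D", 1, 1): "A 2D depth image array.",
    ("3D", 0, 0): "A 3D image.",
    ("Buffer", 0, 0): "A 1D buffer image.",
}


def describe_image_type(dim: str, depth: int, arrayed: int):
    """Return the table description, or None if the combination is not listed."""
    return SUPPORTED_IMAGE_TYPES.get((dim, depth, arrayed))
```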
Kernels
-------
An **OpFunction** in a SPIR-V module that is identified with
**OpEntryPoint** defines a kernel that may be launched using host API
interfaces.
Kernel Return Types
-------------------
The *Result Type* for an **OpFunction** identified with **OpEntryPoint**
must be **OpTypeVoid**.
Kernel Arguments
----------------
An **OpFunctionParameter** for an **OpFunction** that is identified with
**OpEntryPoint** defines a kernel argument. Allowed types for kernel
arguments are:
- **OpTypeInt**
- **OpTypeFloat**
- **OpTypeStruct**
- **OpTypeVector**
- **OpTypePointer**
- **OpTypeSampler**
- **OpTypeImage**
For **OpTypeInt** parameters, supported *Widths* are 8, 16, 32, and 64,
and must have no signedness semantics.
For **OpTypeFloat** parameters, supported *Widths* are 16 and 32.
For **OpTypeStruct** parameters, supported structure *Member Types* are:
- **OpTypeInt**
- **OpTypeFloat**
- **OpTypeStruct**
- **OpTypeVector**
- **OpTypePointer**
For **OpTypePointer** parameters, supported *Storage Classes* are:
- **CrossWorkgroup**
- **Workgroup**
- **UniformConstant**
Environments that support extensions or optional features may allow
additional types in an entry point's parameter list.
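
The kernel argument rules above can be sketched as a recursive check. Types are modeled here as tuples such as ``("int", 32)``, ``("pointer", "CrossWorkgroup")``, or ``("struct", [members])``; this encoding and the function names are hypothetical. The sketch applies the width and storage-class rules only to top-level parameters, because that is all the text above states, and it ignores any extension-specific additions.

```python
PARAM_INT_WIDTHS = {8, 16, 32, 64}
PARAM_FLOAT_WIDTHS = {16, 32}
PARAM_POINTER_CLASSES = {"CrossWorkgroup", "Workgroup", "UniformConstant"}
STRUCT_MEMBER_KINDS = {"int", "float", "struct", "vector", "pointer"}


def struct_members_ok(members) -> bool:
    """Check member kinds, recursing into nested structs."""
    for member in members:
        kind = member[0]
        if kind not in STRUCT_MEMBER_KINDS:
            return False
        if kind == "struct" and not struct_members_ok(member[1]):
            return False
    return True


def is_allowed_kernel_argument(arg) -> bool:
    """Apply the parameter rules to one modeled kernel argument type."""
    kind = arg[0]
    if kind == "int":
        return arg[1] in PARAM_INT_WIDTHS
    if kind == "float":
        return arg[1] in PARAM_FLOAT_WIDTHS
    if kind == "struct":
        return struct_members_ok(arg[1])
    if kind == "pointer":
        return arg[1] in PARAM_POINTER_CLASSES
    return kind in {"vector", "sampler", "image"}
```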
Required Capabilities
=====================
SPIR-V 1.0
----------
An environment that supports SPIR-V 1.0 must support SPIR-V 1.0 modules
that declare the following capabilities:
- **Addresses**
- **Float16Buffer**
- **Int64**
- **Int16**
- **Int8**
- **Kernel**
- **Linkage**
- **Vector16**
- **GenericPointer**
- **Groups**
- **ImageBasic** (for devices supporting :ref:`ze-device-image-properties-t`\.supported)
- **Float16** (for devices supporting :ref:`ZE_DEVICE_MODULE_FLAG_FP16 `\)
- **Float64** (for devices supporting :ref:`ZE_DEVICE_MODULE_FLAG_FP64 `\)
- **Int64Atomics** (for devices supporting :ref:`ZE_DEVICE_MODULE_FLAG_INT64_ATOMICS `\)
If the 'oneAPI' environment supports the **ImageBasic** capability,
then the following capabilities must also be supported:
- **LiteralSampler**
- **Sampled1D**
- **Image1D**
- **SampledBuffer**
- **ImageBuffer**
- **ImageReadWrite**
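
The capability lists above can be combined into a single set for a given device. In the sketch below, the boolean parameters ``images``, ``fp16``, ``fp64``, and ``int64_atomics`` are hypothetical stand-ins for the device properties and flags referenced above. A capability outside the returned set is merely not guaranteed by this section; an environment with extensions may still accept it.

```python
BASE_CAPABILITIES = {
    "Addresses", "Float16Buffer", "Int64", "Int16", "Int8",
    "Kernel", "Linkage", "Vector16", "GenericPointer", "Groups",
}
IMAGE_CAPABILITIES = {
    "ImageBasic", "LiteralSampler", "Sampled1D", "Image1D",
    "SampledBuffer", "ImageBuffer", "ImageReadWrite",
}


def required_capabilities(images=False, fp16=False, fp64=False, int64_atomics=False):
    """Capabilities a SPIR-V 1.0 environment must support for this device."""
    caps = set(BASE_CAPABILITIES)
    if images:
        caps |= IMAGE_CAPABILITIES
    if fp16:
        caps.add("Float16")
    if fp64:
        caps.add("Float64")
    if int64_atomics:
        caps.add("Int64Atomics")
    return caps


def capabilities_without_guarantee(declared, **device_features):
    """Declared capabilities that the required set does not cover."""
    return set(declared) - required_capabilities(**device_features)
```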
SPIR-V 1.1
----------
An environment supporting SPIR-V 1.1 must support SPIR-V 1.1 modules
that declare the capabilities required for SPIR-V 1.0 modules, above.
SPIR-V 1.1 does not add any new required capabilities.
SPIR-V 1.2
----------
An environment supporting SPIR-V 1.2 must support SPIR-V 1.2 modules
that declare the capabilities required for SPIR-V 1.1 modules, above.
SPIR-V 1.2 does not add any new required capabilities.
Validation Rules
================
The following is a list of validation rules that apply to SPIR-V
modules executing in all 'oneAPI' Level-Zero environments:
The *Execution Model* declared in **OpEntryPoint** must be **Kernel**.
The *Addressing Model* declared in **OpMemoryModel** must be
**Physical64**, indicating that device pointers are 64 bits wide.
The *Memory Model* declared in **OpMemoryModel** must be **OpenCL**.
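
The three model rules above can be checked by walking the instruction stream. The sketch below relies on the SPIR-V binary layout (a five-word header, then instructions whose first word holds the word count in the upper 16 bits and the opcode in the lower 16 bits) and on the enumerant values from the SPIR-V specification: **OpMemoryModel** is opcode 14, **OpEntryPoint** is opcode 15, **Physical64** is 2, **OpenCL** is 2, and **Kernel** is 6. The function names and message strings are hypothetical, and the input is assumed to be a list of already-decoded 32-bit words.

```python
OP_MEMORY_MODEL = 14
OP_ENTRY_POINT = 15
ADDRESSING_PHYSICAL64 = 2
MEMORY_MODEL_OPENCL = 2
EXECUTION_MODEL_KERNEL = 6
HEADER_WORDS = 5


def iter_instructions(words):
    """Yield (opcode, operands) for each instruction after the header."""
    i = HEADER_WORDS
    while i < len(words):
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction stream")
        yield opcode, words[i + 1:i + word_count]
        i += word_count


def check_models(words):
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    for opcode, operands in iter_instructions(words):
        if opcode == OP_MEMORY_MODEL:
            if operands[0] != ADDRESSING_PHYSICAL64:
                problems.append("Addressing Model must be Physical64")
            if operands[1] != MEMORY_MODEL_OPENCL:
                problems.append("Memory Model must be OpenCL")
        elif opcode == OP_ENTRY_POINT and operands[0] != EXECUTION_MODEL_KERNEL:
            problems.append("Execution Model must be Kernel")
    return problems
```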
For all **OpTypeInt** integer type-declaration instructions:
- *Signedness* must be 0, indicating no signedness semantics.
For all **OpTypeImage** type-declaration instructions:
- *Sampled Type* must be **OpTypeVoid**.
- *Sampled* must be 0, indicating that the image usage will be known at
  run time, not at compile time.
- *MS* must be 0, indicating single-sampled content.
- *Arrayed* may only be set to 1, indicating arrayed content, when *Dim*
  is set to **1D** or **2D**.
- *Image Format* must be **Unknown**, indicating that the image does not
  have a specified format.
- The optional image *Access Qualifier* must be present.
The image write instruction **OpImageWrite** must not include any
optional *Image Operands*.
The image read instructions **OpImageRead** and
**OpImageSampleExplicitLod** must not include the optional *Image
Operand* **ConstOffset**.
For all *Atomic Instructions*:
- 32-bit integer types are supported for the *Result Type* and/or type
of *Value*. 64-bit integer types are optionally supported for the
*Result Type* and/or type of *Value* for devices supporting
:ref:`ZE_DEVICE_MODULE_FLAG_INT64_ATOMICS `\.
- The *Pointer* operand must be a pointer to the **Function**,
**Workgroup**, **CrossWorkgroup**, or **Generic** *Storage Classes*.
Recursion is not supported. The static function call graph for an entry
point must not contain cycles.
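
A producer can check the no-recursion rule with a depth-first search over the static call graph. In this sketch the graph is a plain dictionary mapping each function name to the names it calls; that representation and the function name ``find_call_cycle`` are hypothetical.

```python
def find_call_cycle(call_graph, entry_point):
    """Return one cycle reachable from entry_point as a list of names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(fn):
        state[fn] = GRAY
        path.append(fn)
        for callee in call_graph.get(fn, ()):
            callee_state = state.get(callee, WHITE)
            if callee_state == GRAY:
                # Back edge: callee is already on the current call path.
                return path[path.index(callee):] + [callee]
            if callee_state == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        path.pop()
        state[fn] = BLACK
        return None

    return visit(entry_point)
```

For example, a graph in which ``a`` calls ``b`` and ``b`` calls ``a`` yields the cycle ``['a', 'b', 'a']``, while a function called from two places without a back edge is not reported.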
Whether irreducible control flow is legal is implementation defined.
For the instructions **OpGroupAsyncCopy** and **OpGroupWaitEvents**,
*Scope* for *Execution* must be:
- **Workgroup**
For all other instructions, *Scope* for *Execution* must be one of:
- **Workgroup**
- **Subgroup**
*Scope* for *Memory* must be one of:
- **CrossDevice**
- **Device**
- **Workgroup**
- **Invocation**
- **Subgroup**
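
The scope rules above reduce to two membership tests. The numeric *Scope* values below are the enumerants from the SPIR-V specification; the function names and the use of instruction names as strings are hypothetical.

```python
# Scope enumerants from the SPIR-V specification.
SCOPE = {"CrossDevice": 0, "Device": 1, "Workgroup": 2, "Subgroup": 3, "Invocation": 4}

WORKGROUP_ONLY_OPS = {"OpGroupAsyncCopy", "OpGroupWaitEvents"}
EXECUTION_SCOPES = {SCOPE["Workgroup"], SCOPE["Subgroup"]}
MEMORY_SCOPES = set(SCOPE.values())


def execution_scope_ok(op_name: str, scope: int) -> bool:
    """Check the Scope for Execution operand of one instruction."""
    if op_name in WORKGROUP_ONLY_OPS:
        return scope == SCOPE["Workgroup"]
    return scope in EXECUTION_SCOPES


def memory_scope_ok(scope: int) -> bool:
    """Check a Scope for Memory operand."""
    return scope in MEMORY_SCOPES
```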
Extensions
==========
Intel Subgroups
---------------
'oneAPI' Level-Zero API environments must accept SPIR-V modules that
declare use of the ``SPV_INTEL_subgroups`` extension via
**OpExtension**.
When use of the ``SPV_INTEL_subgroups`` extension is declared in the
module via **OpExtension**, the environment must accept modules that
declare the following SPIR-V capabilities:
- **SubgroupShuffleINTEL**
- **SubgroupBufferBlockIOINTEL**
- **SubgroupImageBlockIOINTEL**
The environment must accept the following types for *Data* for the
**SubgroupShuffleINTEL** instructions:
- Scalars and **OpTypeVectors** with 2, 4, 8, or 16 *Component Count*
components of the following *Component Type* types:
- **OpTypeFloat** with a *Width* of 32 bits (``float``)
- **OpTypeInt** with a *Width* of 8 bits and *Signedness* of 0
(``char`` and ``uchar``)
- **OpTypeInt** with a *Width* of 16 bits and *Signedness* of 0
(``short`` and ``ushort``)
- **OpTypeInt** with a *Width* of 32 bits and *Signedness* of 0
(``int`` and ``uint``)
- Scalars of **OpTypeInt** with a *Width* of 64 bits and *Signedness*
of 0 (``long`` and ``ulong``)
Additionally, if the **Float16** capability is declared and supported:
- Scalars of **OpTypeFloat** with a *Width* of 16 bits (``half``)
Additionally, if the **Float64** capability is declared and supported:
- Scalars of **OpTypeFloat** with a *Width* of 64 bits (``double``)
The environment must accept the following types for *Result* and *Data*
for the **SubgroupBufferBlockIOINTEL** and **SubgroupImageBlockIOINTEL**
instructions:
- Scalars and **OpTypeVectors** with 2, 4, or 8 *Component Count*
components of the following *Component Type* types:
- **OpTypeInt** with a *Width* of 32 bits and *Signedness* of 0
(``int`` and ``uint``)
- **OpTypeInt** with a *Width* of 16 bits and *Signedness* of 0
(``short`` and ``ushort``)
For *Ptr*, valid *Storage Classes* are:
- **CrossWorkgroup** (``global``)
For *Image*:
- *Dim* must be *2D*
- *Depth* must be 0 (not a depth image)
- *Arrayed* must be 0 (non-arrayed content)
- *MS* must be 0 (single-sampled content)
For *Coordinate*, the following types are supported:
- **OpTypeVectors** with two *Component Count* components of *Component
Type* **OpTypeInt** with a *Width* of 32 bits and *Signedness* of 0
(``int2``)
Notes and Restrictions
~~~~~~~~~~~~~~~~~~~~~~
The **SubgroupShuffleINTEL** instructions may be placed within
non-uniform control flow and hence do not have to be encountered by all
invocations in the subgroup; however, *Data* may only be shuffled among
invocations encountering the **SubgroupShuffleINTEL** instruction.
Shuffling *Data* from an invocation that does not encounter the
**SubgroupShuffleINTEL** instruction will produce undefined results.
There is no defined behavior for out-of-range shuffle indices for the
**SubgroupShuffleINTEL** instructions.
The **SubgroupBufferBlockIOINTEL** and **SubgroupImageBlockIOINTEL**
instructions are only guaranteed to work correctly if placed strictly
within uniform control flow within the subgroup. This ensures that if
any invocation executes it, all invocations will execute it. If placed
elsewhere, behavior is undefined.
There is no defined out-of-range behavior for the
**SubgroupBufferBlockIOINTEL** instructions.
The **SubgroupImageBlockIOINTEL** instructions do support bounds
checking; however, they bounds-check to the image width in units of
``uints``, not in units of image elements. This means:
- If the image has an *Image Format* size equal to the size of a
``uint`` (four bytes, for example **Rgba8**), the image will be
correctly bounds-checked. In this case, out-of-bounds reads will
return the edge image element (the equivalent of **ClampToEdge**),
and out-of-bounds writes will be ignored.
- If the image has an *Image Format* size less than the size of a
``uint`` (such as **R8**), the entire image is addressable; however,
bounds checking will occur too late. For this reason, extra care
should be taken to avoid out-of-bounds reads and writes, since
out-of-bounds reads may return invalid data and out-of-bounds writes
may corrupt other images or buffers unpredictably.
The following restrictions apply to the **SubgroupBufferBlockIOINTEL**
instructions:
- The pointer *Ptr* must be 32-bit (4-byte) aligned for reads, and must
be 128-bit (16-byte) aligned for writes.
The following restrictions apply to the **SubgroupImageBlockIOINTEL**
instructions:
- The behavior of the **SubgroupImageBlockIOINTEL** instructions is
undefined for images with an element size greater than four bytes
(such as **Rgba32f**).
The following restrictions apply to the
**OpSubgroupImageBlockWriteINTEL** instruction:
- Unlike the image block read instruction, which may read from any
arbitrary byte offset, the x-component of the byte coordinate for the
image block write instruction must be a multiple of four; in other
words, the write must begin at a 32-bit boundary. There is no
restriction on the y-component of the coordinate.
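
The alignment restrictions above are simple modulus checks that a producer or debugging layer could assert before issuing block I/O. The function names below are hypothetical, and addresses and coordinates are assumed to be plain non-negative byte values.

```python
def block_read_pointer_ok(address: int) -> bool:
    """Buffer block reads require a 32-bit (4-byte) aligned pointer."""
    return address % 4 == 0


def block_write_pointer_ok(address: int) -> bool:
    """Buffer block writes require a 128-bit (16-byte) aligned pointer."""
    return address % 16 == 0


def image_block_write_x_ok(x_byte_coordinate: int) -> bool:
    """Image block writes must start at a 32-bit boundary in x (y is unrestricted)."""
    return x_byte_coordinate % 4 == 0
```

Note that a pointer valid for a block read, such as one at byte offset 4, is not necessarily valid for a block write.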
Floating-Point Atomics
----------------------
'oneAPI' Level-Zero API environments supporting the extension
**ZE_extension_float_atomics** must support additional atomic instructions,
capabilities, and types.
Atomic Load, Store, and Exchange
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the 'oneAPI' Level-Zero API environment supports the extension **ZE_extension_float_atomics** and :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_LOAD_STORE ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_LOAD_STORE `\, then for the **Atomic Instructions** **OpAtomicLoad**, **OpAtomicStore**, and **OpAtomicExchange**:
- 16-bit floating-point types are supported for the *Result Type* and type of
*Value*.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_LOAD_STORE `\, the *Pointer* operand may be a pointer to the **CrossWorkgroup** *Storage Class*.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_LOAD_STORE `\, the *Pointer* operand may be a pointer to the **Workgroup** *Storage Class*.
Atomic Add and Subtract
~~~~~~~~~~~~~~~~~~~~~~~
If the 'oneAPI' Level-Zero API environment supports the extension **ZE_extension_float_atomics** and :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags include :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD `\, then the environment must accept modules that declare use of the extensions ``SPV_EXT_shader_atomic_float_add`` and ``SPV_EXT_shader_atomic_float16_add``.
Additionally:
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD `\, the **AtomicFloat16AddEXT** capability must be supported.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD `\, the **AtomicFloat32AddEXT** capability must be supported.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD `\, the **AtomicFloat64AddEXT** capability must be supported.
- For the **Atomic Instruction** **OpAtomicFAddEXT** added by these extensions:
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_ADD `\, the *Pointer* operand may be a pointer to the **CrossWorkgroup** *Storage Class*.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_ADD `\, the *Pointer* operand may be a pointer to the **Workgroup** *Storage Class*.
Atomic Min and Max
~~~~~~~~~~~~~~~~~~
If the 'oneAPI' Level-Zero API environment supports the extension **ZE_extension_float_atomics** and the :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags bitfields include :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_MIN_MAX ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_MIN_MAX `\, then the environment must accept modules that declare use of the extension ``SPV_EXT_shader_atomic_float_min_max``.
Additionally:
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_MIN_MAX ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_MIN_MAX `\, the **AtomicFloat32MinMaxEXT** capability must be supported.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_MIN_MAX ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_MIN_MAX `\, the **AtomicFloat64MinMaxEXT** capability must be supported.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_MIN_MAX ` or :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_MIN_MAX `\, the **AtomicFloat16MinMaxEXT** capability must be supported.
- For the **Atomic Instructions** **OpAtomicFMinEXT** and **OpAtomicFMaxEXT** added by this extension:
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_GLOBAL_MIN_MAX `\, the *Pointer* operand may be a pointer to the **CrossWorkgroup** *Storage Class*.
- When :ref:`ze-device-fp-atomic-ext-flags-t`\.fp16Flags, :ref:`ze-device-fp-atomic-ext-flags-t`\.fp32Flags, or :ref:`ze-device-fp-atomic-ext-flags-t`\.fp64Flags includes :ref:`ZE_DEVICE_FP_ATOMIC_EXT_FLAG_LOCAL_MIN_MAX `\, the *Pointer* operand may be a pointer to the **Workgroup** *Storage Class*.
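
The add and min/max rules above map per-width flag fields to capabilities and pointer storage classes. The sketch below illustrates that mapping. The bit positions assigned to the four constants are assumptions made for illustration; take the real values from the Level-Zero headers. The function names are hypothetical.

```python
# ASSUMED bit positions for illustration only; use the Level-Zero header values.
GLOBAL_ADD = 1 << 1
GLOBAL_MIN_MAX = 1 << 2
LOCAL_ADD = 1 << 17
LOCAL_MIN_MAX = 1 << 18


def float_atomic_capabilities(fp16_flags: int, fp32_flags: int, fp64_flags: int):
    """Capabilities the environment must support for the given flag fields."""
    caps = set()
    for width, flags in ((16, fp16_flags), (32, fp32_flags), (64, fp64_flags)):
        if flags & (GLOBAL_ADD | LOCAL_ADD):
            caps.add(f"AtomicFloat{width}AddEXT")
        if flags & (GLOBAL_MIN_MAX | LOCAL_MIN_MAX):
            caps.add(f"AtomicFloat{width}MinMaxEXT")
    return caps


def pointer_storage_classes(flags: int, global_bit: int, local_bit: int):
    """Storage classes the Pointer operand may use for one operation family."""
    classes = set()
    if flags & global_bit:
        classes.add("CrossWorkgroup")
    if flags & local_bit:
        classes.add("Workgroup")
    return classes
```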
Extended Subgroups
------------------
'oneAPI' Level-Zero API environments supporting the extension
**ZE_extension_subgroups** must support additional subgroup instructions,
capabilities, and types.
Extended Types
~~~~~~~~~~~~~~
The following Groups instructions must be supported with *Scope* for *Execution*
equal to **Subgroup**:
- **OpGroupBroadcast**
- **OpGroupIAdd**, **OpGroupFAdd**
- **OpGroupSMin**, **OpGroupUMin**, **OpGroupFMin**
- **OpGroupSMax**, **OpGroupUMax**, **OpGroupFMax**
For these instructions, valid types for *Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
Additionally, for **OpGroupBroadcast**, valid types for *Value* are:
- **OpTypeVectors** with 2, 3, 4, 8, or 16 *Component Count* components of
supported types:
- **OpTypeInt** (equivalent to ``charn``, ``ucharn``, ``shortn``,
``ushortn``, ``intn``, ``uintn``, ``longn``, and ``ulongn``)
- **OpTypeFloat** (equivalent to ``halfn``, ``floatn``, and ``doublen``)
Vote
~~~~
The following capabilities must be supported:
- **GroupNonUniform**
- **GroupNonUniformVote**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
For the instruction **OpGroupNonUniformAllEqual**, valid types for *Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
Ballot
~~~~~~
The following capabilities must be supported:
- **GroupNonUniformBallot**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
For the non-uniform broadcast instruction **OpGroupNonUniformBroadcast**, valid
types for *Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
- **OpTypeVectors** with 2, 3, 4, 8, or 16 *Component Count* components of
supported types:
- **OpTypeInt** (equivalent to ``charn``, ``ucharn``, ``shortn``,
``ushortn``, ``intn``, ``uintn``, ``longn``, and ``ulongn``)
- **OpTypeFloat** (equivalent to ``halfn``, ``floatn``, and ``doublen``)
For the instruction **OpGroupNonUniformBroadcastFirst**, valid types for *Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
For the instruction **OpGroupNonUniformBallot**, the valid *Result Type* is an
**OpTypeVector** with four *Component Count* components of **OpTypeInt**, with
*Width* equal to 32 and *Signedness* equal to 0 (equivalent to ``uint4``).
For the instructions **OpGroupNonUniformInverseBallot**,
**OpGroupNonUniformBallotBitExtract**, **OpGroupNonUniformBallotBitCount**,
**OpGroupNonUniformBallotFindLSB**, and **OpGroupNonUniformBallotFindMSB**, the
valid type for *Value* is an **OpTypeVector** with four *Component Count*
components of **OpTypeInt**, with *Width* equal to 32 and *Signedness* equal to
0 (equivalent to ``uint4``).
For built-in variables decorated with **SubgroupEqMask**, **SubgroupGeMask**,
**SubgroupGtMask**, **SubgroupLeMask**, or **SubgroupLtMask**, the supported
variable type is an **OpTypeVector** with four *Component Count* components of
**OpTypeInt**, with *Width* equal to 32 and *Signedness* equal to 0 (equivalent
to ``uint4``).
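
To make the ``uint4`` ballot layout concrete, the following host-side emulation packs one predicate per invocation into four 32-bit words, with the first invocation in the lowest bit of the first component as the SPIR-V specification describes. It is only an aid to intuition: the function names are hypothetical, the bit count corresponds to counting every set bit, and returning ``None`` when no bit is set is a choice of this sketch (the SPIR-V result is undefined in that case).

```python
def pack_ballot(predicates):
    """Pack up to 128 per-invocation booleans into four 32-bit words."""
    if len(predicates) > 128:
        raise ValueError("a uint4 ballot holds at most 128 invocations")
    words = [0, 0, 0, 0]
    for invocation, flag in enumerate(predicates):
        if flag:
            words[invocation // 32] |= 1 << (invocation % 32)
    return tuple(words)


def ballot_bit_extract(ballot, index: int) -> bool:
    """Read the bit that belongs to invocation `index`."""
    return bool((ballot[index // 32] >> (index % 32)) & 1)


def ballot_bit_count(ballot) -> int:
    """Count all set bits in the ballot."""
    return sum(bin(word).count("1") for word in ballot)


def ballot_find_lsb(ballot):
    """Index of the lowest set bit, or None if the ballot is empty."""
    for i, word in enumerate(ballot):
        if word:
            return 32 * i + (word & -word).bit_length() - 1
    return None
```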
Non-Uniform Arithmetic
~~~~~~~~~~~~~~~~~~~~~~
The following capabilities must be supported:
- **GroupNonUniformArithmetic**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
For the instructions **OpGroupNonUniformLogicalAnd**,
**OpGroupNonUniformLogicalOr**, and **OpGroupNonUniformLogicalXor**, the valid
type for *Value* is **OpTypeBool**.
Otherwise, for the **GroupNonUniformArithmetic** scan and reduction
instructions, valid types for *Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
For the **GroupNonUniformArithmetic** scan and reduction instructions, the
optional *ClusterSize* operand must not be present.
Shuffles
~~~~~~~~
The following capabilities must be supported:
- **GroupNonUniformShuffle**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
For the instructions **OpGroupNonUniformShuffle** and
**OpGroupNonUniformShuffleXor** requiring these capabilities, valid types for
*Value* are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
Relative Shuffles
~~~~~~~~~~~~~~~~~
The following capabilities must be supported:
- **GroupNonUniformShuffleRelative**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
For the **GroupNonUniformShuffleRelative** instructions, valid types for *Value*
are:
- Scalars of supported types:
- **OpTypeInt** (equivalent to ``char``, ``uchar``, ``short``, ``ushort``,
``int``, ``uint``, ``long``, and ``ulong``)
- **OpTypeFloat** (equivalent to ``half``, ``float``, and ``double``)
Clustered Reductions
~~~~~~~~~~~~~~~~~~~~
The following capabilities must be supported:
- **GroupNonUniformClustered**
For instructions requiring these capabilities, *Scope* for *Execution* may be:
- **Subgroup**
When the **GroupNonUniformClustered** capability is declared, the
**GroupNonUniformArithmetic** scan and reduction instructions may include the
optional *ClusterSize* operand.
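
As an illustration of what the optional *ClusterSize* operand changes, the sketch below emulates a clustered add reduction on the host. It assumes every invocation in the subgroup is active and that clusters are formed from consecutive invocations; the power-of-two requirement on *ClusterSize* comes from the SPIR-V specification. The function name is hypothetical.

```python
def clustered_reduce_add(values, cluster_size: int):
    """Give each invocation the sum over its own cluster of consecutive invocations."""
    if cluster_size < 1 or cluster_size & (cluster_size - 1):
        raise ValueError("ClusterSize must be a power of two and at least 1")
    result = []
    for start in range(0, len(values), cluster_size):
        cluster = values[start:start + cluster_size]
        # Every invocation in the cluster observes the same reduced value.
        result.extend([sum(cluster)] * len(cluster))
    return result
```

With eight invocations holding 1 through 8 and a cluster size of 4, the first four invocations receive 10 and the last four receive 26. A cluster size equal to the subgroup size gives the same result as the unclustered reduction.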
Linkonce ODR
------------
'oneAPI' Level-Zero API environments supporting the extension
**ZE_extension_linkonce_odr** must accept SPIR-V modules that
declare use of the ``SPV_KHR_linkonce_odr`` extension via
**OpExtension**.
When use of the ``SPV_KHR_linkonce_odr`` extension is declared in the
module via **OpExtension**, the environment must accept modules that
include the **LinkOnceODR** linkage type.
Bfloat16 Conversions
--------------------
'oneAPI' Level-Zero API environments supporting the extension
**ZE_extension_bfloat16_conversions** must accept SPIR-V modules that
declare use of the ``SPV_INTEL_bfloat16_conversion`` extension via
**OpExtension**.
When use of the ``SPV_INTEL_bfloat16_conversion`` extension is declared in the
module via **OpExtension**, the environment must accept modules that
declare the **Bfloat16ConversionINTEL** capability.
For the instructions **OpConvertFToBF16INTEL** and **OpConvertBF16ToFINTEL**
added by the extension:
- Valid types for *Result Type*, *Float Value*, and *Bfloat16 Value* are Scalars
and **OpTypeVectors** with 2, 3, 4, 8, or 16 *Component Count* components.
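
For readers unfamiliar with the format, bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits of a 32-bit float. The host-side sketch below shows one way to perform the two conversions on raw bit patterns. Round-to-nearest-even is assumed here for the narrowing direction; check the extension specification for the rounding behavior the instructions actually guarantee. The function names are hypothetical.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Narrow a float32 value to a 16-bit bfloat16 pattern (assumed round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        # NaN: truncate and force a mantissa bit so the result stays a NaN.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF, plus 1 when the kept LSB is odd, so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(bits16: int) -> float:
    """Widen a bfloat16 pattern to float32; this direction is exact."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]
```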
Numerical Compliance
====================
The 'oneAPI' Level-Zero environment will meet or exceed the numerical
compliance requirements defined in the OpenCL SPIR-V Environment
Specification. See: `Numerical
Compliance `__.
Image Addressing and Filtering
==============================
The 'oneAPI' Level-Zero environment image addressing and filtering
behavior is compatible with the behavior defined in the OpenCL SPIR-V
Environment Specification. See: `Image Addressing and
Filtering `__.