SPIR-V Programming Guide¶
Introduction¶
SPIR-V is an open, royalty-free, standard intermediate language capable of representing parallel compute kernels. SPIR-V is adaptable to multiple execution environments: a SPIR-V module is consumed by an execution environment, as specified by a client API. This document describes the SPIR-V execution environment for the ‘oneAPI’ Level-Zero API. The SPIR-V execution environment describes required support for some SPIR-V capabilities, additional semantics for some SPIR-V instructions, and additional validation rules that a SPIR-V binary module must adhere to in order to be considered valid.
This document is written for compiler developers who are generating SPIR-V modules intended to be consumed by the ‘oneAPI’ Level-Zero API, for implementors of the ‘oneAPI’ Level-Zero API, and for software developers who are using SPIR-V modules with the ‘oneAPI’ Level-Zero API.
Common Properties¶
This section describes common properties of all ‘oneAPI’ Level-Zero environments that consume SPIR-V modules.
A SPIR-V module is interpreted as a series of 32-bit words in host endianness, with literal strings packed as described in the SPIR-V specification. The first few words of the SPIR-V module must be a magic number and a SPIR-V version number, as described in the SPIR-V specification.
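As a minimal sketch of the header check described above, the following Python reads the first two words of a module in host byte order. The magic number 0x07230203 and the version word layout (major version in bits 16-23, minor version in bits 8-15) are taken from the SPIR-V specification; the function name `read_spirv_header` is hypothetical and is not part of any API.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number defined by the SPIR-V specification


def read_spirv_header(blob):
    """Return (major, minor) if blob starts with a host-endian SPIR-V header."""
    if len(blob) < 8 or len(blob) % 4 != 0:
        raise ValueError("expected a whole number of 32-bit words")
    # "=" selects host byte order with standard sizes, matching the rule above.
    magic, version = struct.unpack_from("=II", blob, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError("bad magic number 0x%08x" % magic)
    # Version word layout, high byte to low byte: 0 | major | minor | 0.
    return (version >> 16) & 0xFF, (version >> 8) & 0xFF
```

A module written with the opposite byte order would present the magic number byte-swapped, so this sketch rejects it rather than converting it.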
Supported SPIR-V Versions¶
The maximum SPIR-V version supported by a device is described by ze_device_module_properties_t.spirvVersionSupported.
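The comparison a loader might perform can be sketched as follows. This assumes that spirvVersionSupported uses the usual Level-Zero version packing (major version in the upper 16 bits, minor version in the lower 16 bits) and that a value of zero means SPIR-V is not supported; both points should be confirmed against the API headers. The helper names are hypothetical.

```python
def unpack_ze_version(value):
    """Split a packed version into (major, minor); packing is an assumption."""
    return value >> 16, value & 0xFFFF


def module_version_is_supported(module_version, spirv_version_supported):
    """module_version is a (major, minor) tuple read from the module header."""
    if spirv_version_supported == 0:
        # Assumption: zero indicates that the device does not accept SPIR-V.
        return False
    # Tuples compare element by element, so (1, 0) <= (1, 2) holds.
    return module_version <= unpack_ze_version(spirv_version_supported)
```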
Extended Instruction Sets¶
The OpenCL.std extended instruction set for OpenCL is supported.
Source Language Encoding¶
The source language version is purely informational and has no semantic meaning.
Numerical Type Formats¶
Floating-point types are represented and stored using IEEE-754 semantics. All integer formats are represented and stored using 2’s-complement format.
Supported Types¶
The following types are supported. Note that some types may require additional capabilities, and may not be supported by all environments.
Basic Scalar and Vector Types¶
OpTypeVoid is supported.
The following scalar types are supported:
OpTypeBool
OpTypeInt, with Width equal to 8, 16, 32, or 64, and with Signedness equal to zero, indicating no signedness semantics.
OpTypeFloat, with Width equal to 16, 32, or 64.
OpTypeVector vector types are supported. The vector Component Type may be any of the scalar types described above. Supported vector Component Counts are 2, 3, 4, 8, or 16.
OpTypeArray array types, OpTypeStruct struct types, OpTypeFunction function types, and OpTypePointer pointer types are supported.
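The scalar and vector rules in this section can be restated as a small predicate. This is only a sketch: the tuple representation of a type (`kind`, `width`, `signedness`) and the function names are hypothetical conveniences, not part of SPIR-V or of the Level-Zero API.

```python
SUPPORTED_INT_WIDTHS = {8, 16, 32, 64}
SUPPORTED_FLOAT_WIDTHS = {16, 32, 64}
SUPPORTED_VECTOR_COUNTS = {2, 3, 4, 8, 16}


def is_supported_scalar(kind, width=None, signedness=0):
    """kind is one of "bool", "int", or "float" (hypothetical encoding)."""
    if kind == "bool":
        return True
    if kind == "int":
        # Signedness must be zero: no signedness semantics.
        return width in SUPPORTED_INT_WIDTHS and signedness == 0
    if kind == "float":
        return width in SUPPORTED_FLOAT_WIDTHS
    return False


def is_supported_vector(component, count):
    """component is a (kind, width) or (kind, width, signedness) tuple."""
    return is_supported_scalar(*component) and count in SUPPORTED_VECTOR_COUNTS
```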
Kernels¶
An OpFunction in a SPIR-V module that is identified with OpEntryPoint defines a kernel that may be launched using host API interfaces.
Kernel Return Types¶
The Result Type for an OpFunction identified with OpEntryPoint must be OpTypeVoid.
Kernel Arguments¶
An OpFunctionParameter for an OpFunction that is identified with OpEntryPoint defines a kernel argument. Allowed types for kernel arguments are:
OpTypeInt
OpTypeFloat
OpTypeStruct
OpTypeVector
OpTypePointer
OpTypeSampler
OpTypeImage
For OpTypeInt parameters, supported Widths are 8, 16, 32, and 64, and must have no signedness semantics.
For OpTypeFloat parameters, supported Widths are 16 and 32.
For OpTypeStruct parameters, supported structure Member Types are:
OpTypeInt
OpTypeFloat
OpTypeStruct
OpTypeVector
OpTypePointer
For OpTypePointer parameters, supported Storage Classes are:
CrossWorkgroup
Workgroup
UniformConstant
Environments that support extensions or optional features may allow additional types in an entry point’s parameter list.
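A minimal sketch of the kernel argument rules for the base environment follows. Types are modelled as plain dictionaries, which is a hypothetical representation chosen for brevity. The sketch also assumes that the width and Storage Class limits apply to structure members as well as to top-level parameters, which the text above does not state explicitly.

```python
KERNEL_ARG_TYPES = {
    "OpTypeInt", "OpTypeFloat", "OpTypeStruct", "OpTypeVector",
    "OpTypePointer", "OpTypeSampler", "OpTypeImage",
}
STRUCT_MEMBER_TYPES = {
    "OpTypeInt", "OpTypeFloat", "OpTypeStruct", "OpTypeVector", "OpTypePointer",
}
POINTER_STORAGE_CLASSES = {"CrossWorkgroup", "Workgroup", "UniformConstant"}


def is_valid_kernel_arg(ty, inside_struct=False):
    """ty is a dict such as {"op": "OpTypeInt", "width": 32, "signedness": 0}."""
    op = ty["op"]
    allowed = STRUCT_MEMBER_TYPES if inside_struct else KERNEL_ARG_TYPES
    if op not in allowed:
        return False
    if op == "OpTypeInt":
        return ty["width"] in {8, 16, 32, 64} and ty.get("signedness", 0) == 0
    if op == "OpTypeFloat":
        return ty["width"] in {16, 32}
    if op == "OpTypeStruct":
        # Assumption: member rules apply recursively to nested structs.
        return all(is_valid_kernel_arg(m, True) for m in ty["members"])
    if op == "OpTypePointer":
        return ty["storage_class"] in POINTER_STORAGE_CLASSES
    # OpTypeVector, OpTypeSampler, OpTypeImage: no further rule is listed here.
    return True
```

An environment that supports extensions or optional features may accept more than this predicate does.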
Required Capabilities¶
SPIR-V 1.0¶
An environment that supports SPIR-V 1.0 must support SPIR-V 1.0 modules that declare the following capabilities:
Addresses
Float16Buffer
Int64
Int16
Int8
Kernel
Linkage
Vector16
GenericPointer
Groups
ImageBasic (for devices supporting ze_device_image_properties_t.supported)
Float16 (for devices supporting ZE_DEVICE_MODULE_FLAG_FP16)
Float64 (for devices supporting ZE_DEVICE_MODULE_FLAG_FP64)
Int64Atomics (for devices supporting ZE_DEVICE_MODULE_FLAG_INT64_ATOMICS)
If the ‘oneAPI’ environment supports the ImageBasic capability, then the following capabilities must also be supported:
LiteralSampler
Sampled1D
Image1D
SampledBuffer
ImageBuffer
ImageReadWrite
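The SPIR-V 1.0 lists above can be combined into a single set for a given device. In this sketch the boolean parameters are hypothetical stand-ins for the device queries named in the lists (image support and the FP16, FP64, and 64-bit atomics module flags); the function does not call any real API.

```python
BASE_CAPABILITIES = {
    "Addresses", "Float16Buffer", "Int64", "Int16", "Int8",
    "Kernel", "Linkage", "Vector16", "GenericPointer", "Groups",
}
IMAGE_CAPABILITIES = {
    "ImageBasic", "LiteralSampler", "Sampled1D", "Image1D",
    "SampledBuffer", "ImageBuffer", "ImageReadWrite",
}


def required_capabilities(images=False, fp16=False, fp64=False, int64_atomics=False):
    """Return the capabilities a SPIR-V 1.0 environment must accept."""
    caps = set(BASE_CAPABILITIES)
    if images:
        # ImageBasic brings the dependent image capabilities with it.
        caps |= IMAGE_CAPABILITIES
    if fp16:
        caps.add("Float16")
    if fp64:
        caps.add("Float64")
    if int64_atomics:
        caps.add("Int64Atomics")
    return caps
```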
SPIR-V 1.1¶
An environment supporting SPIR-V 1.1 must support SPIR-V 1.1 modules that declare the capabilities required for SPIR-V 1.0 modules, above.
SPIR-V 1.1 does not add any new required capabilities.
SPIR-V 1.2¶
An environment supporting SPIR-V 1.2 must support SPIR-V 1.2 modules that declare the capabilities required for SPIR-V 1.1 modules, above.
SPIR-V 1.2 does not add any new required capabilities.
Validation Rules¶
The following validation rules apply to SPIR-V modules executing in all ‘oneAPI’ Level-Zero environments:
The Execution Model declared in OpEntryPoint must be Kernel.
The Addressing Model declared in OpMemoryModel must be Physical64, indicating that device pointers are 64 bits wide.
The Memory Model declared in OpMemoryModel must be OpenCL.
For all OpTypeInt integer type-declaration instructions:
Signedness must be 0, indicating no signedness semantics.
For all OpTypeImage type-declaration instructions:
Sampled Type must be OpTypeVoid.
Sampled must be 0, indicating that the image usage will be known at run time, not at compile time.
MS must be 0, indicating single-sampled content.
Arrayed may only be set to 1, indicating arrayed content, when Dim is set to 1D or 2D.
Image Format must be Unknown, indicating that the image does not have a specified format.
The optional image Access Qualifier must be present.
The image write instruction OpImageWrite must not include any optional Image Operands.
The image read instructions OpImageRead and OpImageSampleExplicitLod must not include the optional Image Operand ConstOffset.
For all Atomic Instructions:
32-bit integer types are supported for the Result Type and/or type of Value. 64-bit integer types are optionally supported for the Result Type and/or type of Value for devices supporting ZE_DEVICE_MODULE_FLAG_INT64_ATOMICS.
The Pointer operand must be a pointer to the Function, Workgroup, CrossWorkgroup, or Generic Storage Classes.
Recursion is not supported. The static function call graph for an entry point must not contain cycles.
Whether irreducible control flow is legal is implementation defined.
For the instructions OpGroupAsyncCopy and OpGroupWaitEvents, Scope for Execution must be:
Workgroup
For all other instructions, Scope for Execution must be one of:
Workgroup
Subgroup
Scope for Memory must be one of:
CrossDevice
Device
Workgroup
Invocation
Subgroup
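The first three rules in the list above (Execution Model, Addressing Model, and Memory Model) can be checked by walking the instruction stream. The sketch below assumes the module has already been decoded into a list of 32-bit words. The opcode and enumerant values are those of the SPIR-V specification (OpMemoryModel = 14, OpEntryPoint = 15, Physical64 = 2, OpenCL = 2, Kernel = 6); the function name `check_models` is hypothetical, and the remaining rules are not covered.

```python
HEADER_WORDS = 5  # magic, version, generator, bound, schema
OP_MEMORY_MODEL = 14
OP_ENTRY_POINT = 15
ADDRESSING_PHYSICAL64 = 2
MEMORY_MODEL_OPENCL = 2
EXECUTION_MODEL_KERNEL = 6


def check_models(words):
    """Return a list of violation messages for the model-related rules."""
    errors = []
    i = HEADER_WORDS
    while i < len(words):
        # Each instruction starts with (word count << 16) | opcode.
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction at word %d" % i)
        if opcode == OP_MEMORY_MODEL and word_count >= 3:
            if words[i + 1] != ADDRESSING_PHYSICAL64:
                errors.append("Addressing Model must be Physical64")
            if words[i + 2] != MEMORY_MODEL_OPENCL:
                errors.append("Memory Model must be OpenCL")
        elif opcode == OP_ENTRY_POINT and word_count >= 2:
            if words[i + 1] != EXECUTION_MODEL_KERNEL:
                errors.append("Execution Model must be Kernel")
        i += word_count
    return errors
```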
Extensions¶
SPV_INTEL_subgroups¶
‘oneAPI’ Level-Zero API environments must accept SPIR-V modules that declare use of the SPV_INTEL_subgroups extension via OpExtension.
When use of the SPV_INTEL_subgroups extension is declared in the module via OpExtension, the environment must accept modules that declare the following SPIR-V capabilities:
SubgroupShuffleINTEL
SubgroupBufferBlockIOINTEL
SubgroupImageBlockIOINTEL
The environment must accept the following types for Data for the SubgroupShuffleINTEL instructions:
Scalars and OpTypeVectors with 2, 4, 8, or 16 Component Count components of the following Component Type types:
OpTypeFloat with a Width of 32 bits (float)
TBD: char types?
OpTypeInt with a Width of 16 bits and Signedness of 0 (short and ushort)
OpTypeInt with a Width of 32 bits and Signedness of 0 (int and uint)
Scalars of OpTypeInt with a Width of 64 bits and Signedness of 0 (long and ulong)
TBD: vectors of long types?
Additionally, if the Float16 capability is declared and supported:
Scalars of OpTypeFloat with a Width of 16 bits (half)
Additionally, if the Float64 capability is declared and supported:
Scalars of OpTypeFloat with a Width of 64 bits (double)
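The Data types that an environment must accept for the SubgroupShuffleINTEL instructions can be summarized as a predicate. This is a sketch only: the (`kind`, `width`, `count`) encoding and the function name are hypothetical, `count=1` stands for a scalar, and the two items marked TBD above are treated as not required.

```python
def must_accept_shuffle_data(kind, width, count=1, float16=False, float64=False):
    """Return True if the listed rules require this Data type to be accepted."""
    # Types that are required both as scalars and as 2, 4, 8, or 16 vectors.
    if (kind, width) in {("float", 32), ("int", 16), ("int", 32)}:
        return count in {1, 2, 4, 8, 16}
    # Every remaining required type is scalar only.
    if count != 1:
        return False
    if (kind, width) == ("int", 64):
        return True
    if (kind, width) == ("float", 16):
        return float16  # only when Float16 is declared and supported
    if (kind, width) == ("float", 64):
        return float64  # only when Float64 is declared and supported
    return False
```

A result of False means only that acceptance is not required by the list; an environment may still accept the type.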
The environment must accept the following types for Result and Data for the SubgroupBufferBlockIOINTEL and SubgroupImageBlockIOINTEL instructions:
Scalars and OpTypeVectors with 2, 4, or 8 Component Count components of the following Component Type types:
OpTypeInt with a Width of 32 bits and Signedness of 0 (int and uint)
OpTypeInt with a Width of 16 bits and Signedness of 0 (short and ushort)
For Ptr, valid Storage Classes are:
CrossWorkgroup (global)
For Image:
Dim must be 2D
Depth must be 0 (not a depth image)
Arrayed must be 0 (non-arrayed content)
MS must be 0 (single-sampled content)
For Coordinate, the following types are supported:
OpTypeVectors with two Component Count components of Component Type OpTypeInt with a Width of 32 bits and Signedness of 0 (int2)
Notes and Restrictions¶
The SubgroupShuffleINTEL instructions may be placed within non-uniform control flow and hence do not have to be encountered by all invocations in the subgroup. However, Data may only be shuffled among invocations that encounter the SubgroupShuffleINTEL instruction. Shuffling Data from an invocation that does not encounter the SubgroupShuffleINTEL instruction will produce undefined results.
There is no defined behavior for out-of-range shuffle indices for the SubgroupShuffleINTEL instructions.
The SubgroupBufferBlockIOINTEL and SubgroupImageBlockIOINTEL instructions are only guaranteed to work correctly if placed strictly within uniform control flow within the subgroup. This ensures that if any invocation executes it, all invocations will execute it. If placed elsewhere, behavior is undefined.
There is no defined out-of-range behavior for the SubgroupBufferBlockIOINTEL instructions.
The SubgroupImageBlockIOINTEL instructions do support bounds checking, however they bounds-check to the image width in units of uints, not in units of image elements. This means:
If the image has an Image Format size equal to the size of a uint (four bytes, for example Rgba8), the image will be correctly bounds-checked. In this case, out-of-bounds reads will return the edge image element (the equivalent of ClampToEdge), and out-of-bounds writes will be ignored.
If the image has an Image Format size less than the size of a uint (such as R8), the entire image is addressable, however bounds checking will occur too late. For this reason, extra care should be taken to avoid out-of-bounds reads and writes, since out-of-bounds reads may return invalid data and out-of-bounds writes may corrupt other images or buffers unpredictably.
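The two cases above can be illustrated with a little arithmetic. This sketch reflects one reading of the rule, namely that the check compares the x position against the image width as if every element were four bytes wide; actual hardware behavior may differ, and the function names are hypothetical.

```python
UINT_BYTES = 4  # size of a uint


def checked_row_bytes(width_in_elements):
    """Row extent the bounds check enforces, under the reading stated above."""
    return width_in_elements * UINT_BYTES


def actual_row_bytes(width_in_elements, element_bytes):
    """Row extent the image really occupies."""
    return width_in_elements * element_bytes


def unchecked_overrun_bytes(width_in_elements, element_bytes):
    """Bytes past the end of a row that the check would still let through."""
    return checked_row_bytes(width_in_elements) - actual_row_bytes(
        width_in_elements, element_bytes)
```

For a 64-element row, a four-byte format such as Rgba8 gives an overrun of 0 bytes, while a one-byte format such as R8 gives 192 bytes that pass the check without belonging to the row.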
The following restrictions apply to the SubgroupBufferBlockIOINTEL instructions:
The pointer Ptr must be 32-bit (4-byte) aligned for reads, and must be 128-bit (16-byte) aligned for writes.
The following restrictions apply to the SubgroupImageBlockIOINTEL instructions:
The behavior of the SubgroupImageBlockIOINTEL instructions is undefined for images with an element size greater than four bytes (such as Rgba32f).
The following restrictions apply to the OpSubgroupImageBlockWriteINTEL instruction:
Unlike the image block read instruction, which may read from any arbitrary byte offset, the x-component of the byte coordinate for the image block write instruction must be a multiple of four; in other words, the write must begin at a 32-bit boundary. There is no restriction on the y-component of the coordinate.
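The alignment restrictions above reduce to simple modulus checks, sketched here. The function names and the integer address and coordinate parameters are hypothetical; a real producer would apply the same checks to whatever pointer and coordinate values it generates.

```python
READ_ALIGNMENT_BYTES = 4    # 32-bit alignment for buffer block reads
WRITE_ALIGNMENT_BYTES = 16  # 128-bit alignment for buffer block writes


def buffer_block_ptr_ok(address, is_write):
    """Check Ptr alignment for the SubgroupBufferBlockIOINTEL instructions."""
    alignment = WRITE_ALIGNMENT_BYTES if is_write else READ_ALIGNMENT_BYTES
    return address % alignment == 0


def image_block_write_coord_ok(x_byte, y):
    """Check the byte coordinate for OpSubgroupImageBlockWriteINTEL."""
    # Only the x-component is restricted; any y value is allowed.
    return x_byte % 4 == 0
```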
Other Extensions to Consider:¶
Numerical Compliance¶
The ‘oneAPI’ Level-Zero environment will meet or exceed the numerical compliance requirements defined in the OpenCL SPIR-V Environment Specification. See: Numerical Compliance.
Image Addressing and Filtering¶
The ‘oneAPI’ Level-Zero environment image addressing and filtering behavior is compatible with the behavior defined in the OpenCL SPIR-V Environment Specification. See: Image Addressing and Filtering.