A set of functions that aid in oneDNN debugging and profiling.
◆ DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC
#define DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC 8u
◆ version_t
◆ status
Status values returned by the library functions.
Enumerator | Description
---|---
success | The operation was successful.
out_of_memory | The operation failed due to an out-of-memory condition.
invalid_arguments | The operation failed because of incorrect function arguments.
unimplemented | The operation failed because requested functionality is not implemented.
iterator_ends | Primitive iterator passed over last primitive descriptor.
runtime_error | Primitive or engine failed on execution.
not_required | Queried element is not required for given primitive.
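As an illustration of how calling code might turn these status values into diagnostics, here is a minimal sketch. The `sketch::status` enum is a hypothetical stand-in that only mirrors the enumerator names listed above; it is not the library's own type, and the helper name `describe` is likewise an assumption made for this example.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical stand-in that mirrors the enumerator names documented above.
// The real dnnl::status type is defined by the oneDNN headers.
enum class status {
    success,
    out_of_memory,
    invalid_arguments,
    unimplemented,
    iterator_ends,
    runtime_error,
    not_required
};

// Maps a status value to the description given in this reference.
inline std::string describe(status s) {
    switch (s) {
        case status::success: return "The operation was successful.";
        case status::out_of_memory: return "Out-of-memory condition.";
        case status::invalid_arguments: return "Incorrect function arguments.";
        case status::unimplemented: return "Functionality is not implemented.";
        case status::iterator_ends: return "Iterator passed the last primitive descriptor.";
        case status::runtime_error: return "Primitive or engine failed on execution.";
        case status::not_required: return "Queried element is not required.";
    }
    return "Unknown status.";
}

// Convenience check: only `success` counts as a successful call.
inline bool ok(status s) {
    return s == status::success;
}

} // namespace sketch
```

A caller would typically test `ok(...)` on the value returned by a service function and log `describe(...)` otherwise.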
◆ cpu_isa
CPU instruction set flags.
Enumerator | Description
---|---
all | Any ISA (except those listed as initial support)
sse41 | Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
avx | Intel Advanced Vector Extensions (Intel AVX)
avx2 | Intel Advanced Vector Extensions 2 (Intel AVX2)
avx512_mic | Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
avx512_mic_4ops | Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
avx512_core | Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_vnni | Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_bf16 | Intel AVX-512, Intel DL Boost, and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_amx | Intel AVX-512, Intel DL Boost, and bfloat16 support, plus Intel AMX with 8-bit integer and bfloat16 support (initial support)
avx2_vnni | Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.
◆ dnnl_cpu_isa_t
CPU instruction set flags.
Enumerator | Description
---|---
dnnl_cpu_isa_all | Any ISA (except those listed as initial support)
dnnl_cpu_isa_sse41 | Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
dnnl_cpu_isa_avx | Intel Advanced Vector Extensions (Intel AVX)
dnnl_cpu_isa_avx2 | Intel Advanced Vector Extensions 2 (Intel AVX2)
dnnl_cpu_isa_avx512_mic | Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
dnnl_cpu_isa_avx512_mic_4ops | Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
dnnl_cpu_isa_avx512_core | Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_vnni | Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_bf16 | Intel AVX-512, Intel DL Boost, and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_amx | Intel AVX-512, Intel DL Boost, and bfloat16 support, plus Intel AMX with 8-bit integer and bfloat16 support (initial support)
dnnl_cpu_isa_avx2_vnni | Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.
◆ dnnl_set_verbose()
Configures verbose output to stdout.
- Note
- Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.
- Parameters
-
level | Verbosity level:
- 0: no verbose output (default),
- 1: primitive information at execution,
- 2: primitive information at creation and execution.
- Returns
- dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.
◆ dnnl_set_jit_dump()
◆ dnnl_version()
Returns library version information.
- Returns
- Pointer to a constant structure containing
- major: major version number,
- minor: minor version number,
- patch: patch release number,
- hash: git commit hash.
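The four fields listed above are usually combined into a single human-readable string for logs or bug reports. The following is a minimal sketch of that formatting. The struct `sketch::version_info` is a hypothetical stand-in that carries only the documented fields, and `format_version` is a name chosen for this example; the library's own structure may contain more members.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical stand-in holding only the four fields documented above.
struct version_info {
    int major;        // major version number
    int minor;        // minor version number
    int patch;        // patch release number
    const char *hash; // git commit hash
};

// Formats "major.minor.patch (hash)". The hash part is skipped when it is
// missing or empty, and a null pointer yields "unknown".
inline std::string format_version(const version_info *v) {
    if (v == nullptr) return "unknown";
    std::string out = std::to_string(v->major) + "."
            + std::to_string(v->minor) + "."
            + std::to_string(v->patch);
    if (v->hash != nullptr && v->hash[0] != '\0') {
        out += " (" + std::string(v->hash) + ")";
    }
    return out;
}

} // namespace sketch
```

Because the function returns a pointer to a constant structure, the sketch takes a `const` pointer and never modifies or frees it.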
◆ dnnl_set_jit_profiling_flags()
dnnl_status_t DNNL_API dnnl_set_jit_profiling_flags(unsigned flags)
◆ dnnl_set_jit_profiling_jitdumpdir()
dnnl_status_t DNNL_API dnnl_set_jit_profiling_jitdumpdir(const char *dir)
Sets the JIT dump output path.
Applies only to Linux, and is used only when the profiling flags have the DNNL_JIT_PROFILE_LINUX_PERF bit set.
After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.
- See also
- Profiling oneDNN Performance
- Note
- This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. Passing NULL reverts the value to the default.
- The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise.
- dnnl_unimplemented/dnnl::status::unimplemented on Windows.
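To make the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX' concrete, the sketch below creates such a directory with the POSIX `mkdir` and `mkdtemp` calls. It only illustrates what the template expands to; it is not the library's implementation, and the names `ensure_dir` and `make_jitdump_dir` are assumptions made for this example. The base directory is expected to exist already.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>    // mkdtemp on glibc
#include <string>
#include <vector>
#include <sys/stat.h> // mkdir
#include <unistd.h>   // mkdtemp on some other POSIX systems

namespace sketch {

// Creates one directory level; an already existing entry counts as success.
inline bool ensure_dir(const std::string &path) {
    return ::mkdir(path.c_str(), 0755) == 0 || errno == EEXIST;
}

// Creates base/.debug/jit and then a unique dnnl.XXXXXX directory inside it.
// Returns the created path, or an empty string on any error.
inline std::string make_jitdump_dir(const std::string &base) {
    const char *parts[] = {"/.debug", "/jit"};
    std::string path = base;
    for (const char *part : parts) {
        path += part;
        if (!ensure_dir(path)) return std::string();
    }
    // mkdtemp rewrites the trailing XXXXXX in place, so it needs a
    // writable, NUL-terminated buffer.
    const std::string tmpl = path + "/dnnl.XXXXXX";
    std::vector<char> buf(tmpl.begin(), tmpl.end());
    buf.push_back('\0');
    if (::mkdtemp(buf.data()) == nullptr) return std::string();
    return std::string(buf.data());
}

} // namespace sketch
```

Returning an empty string on failure mirrors the documented behavior only loosely: the library disables JIT profiling when the directory cannot be accessed or created, and a caller of this sketch would have to make the equivalent decision itself.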
◆ dnnl_set_max_cpu_isa()
Sets the maximal ISA the library can dispatch to on the CPU.
See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.
This function has an effect only before the first JIT kernel is generated and returns an error afterwards.
This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with the dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.
- Note
- The ISAs are only partially ordered:
- SSE41 < AVX < AVX2,
- AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
- AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
- AVX2 < AVX2_VNNI.
- See also
- CPU Dispatcher Control for more details
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success on success, and dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time.
- dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).
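The naming rule for DNNL_MAX_CPU_ISA (upper case, dnnl_cpu_isa prefix removed) can be expressed as a small string transformation. The sketch below is an illustration only; the helper name `env_value_for` is an assumption made for this example and is not part of the library.

```cpp
#include <cassert>
#include <cctype>
#include <string>

namespace sketch {

// Derives the DNNL_MAX_CPU_ISA value from a C enumerator name by dropping
// the "dnnl_cpu_isa_" prefix (when present) and upper-casing the rest.
inline std::string env_value_for(const std::string &enumerator) {
    const std::string prefix = "dnnl_cpu_isa_";
    std::string name = enumerator;
    if (name.compare(0, prefix.size(), prefix) == 0) {
        name.erase(0, prefix.size());
    }
    for (char &c : name) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return name;
}

} // namespace sketch
```

For instance, `env_value_for("dnnl_cpu_isa_avx2")` yields the `AVX2` value used in the example above.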
◆ dnnl_get_effective_cpu_isa()
◆ set_verbose()
inline status dnnl::set_verbose(int level)
Configures verbose output to stdout.
- Note
- Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.
- Parameters
-
level | Verbosity level:
- 0: no verbose output (default),
- 1: primitive information at execution,
- 2: primitive information at creation and execution.
- Returns
- dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.
◆ version()
Returns library version information.
- Returns
- Pointer to a constant structure containing
- major: major version number,
- minor: minor version number,
- patch: patch release number,
- hash: git commit hash.
◆ set_jit_dump()
inline status dnnl::set_jit_dump(int enable)
◆ set_jit_profiling_flags()
inline status dnnl::set_jit_profiling_flags(unsigned flags)
◆ set_jit_profiling_jitdumpdir()
inline status dnnl::set_jit_profiling_jitdumpdir(const std::string &dir)
Sets the JIT dump output path.
Applies only to Linux, and is used only when the profiling flags have the DNNL_JIT_PROFILE_LINUX_PERF bit set.
After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.
- See also
- Profiling oneDNN Performance
- Note
- This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. Passing NULL reverts the value to the default.
- The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise.
- dnnl_unimplemented/dnnl::status::unimplemented on Windows.
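The condition that the profiling flags "have the DNNL_JIT_PROFILE_LINUX_PERF bit set" is a bit-mask test on the value passed to set_jit_profiling_flags(). The sketch below shows that test with local stand-in constants. Only the value 8u for the TSC flag comes from this page; the values used for the perfmap and jitdump bits, and the definition of the perf mask as their union, are assumptions that should be checked against the library headers. All names in `sketch` are hypothetical.

```cpp
#include <cassert>

namespace sketch {

// Documented on this page: DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC is 8u.
constexpr unsigned jitdump_use_tsc = 8u;

// Assumed values for the remaining bits; verify against the oneDNN headers.
constexpr unsigned linux_perfmap = 2u;
constexpr unsigned linux_jitdump = 4u;
// Assumed: the "Linux perf" mask combines the jitdump and perfmap bits.
constexpr unsigned linux_perf = linux_jitdump | linux_perfmap;

// True when every bit of `mask` is present in `flags`.
constexpr bool has_all(unsigned flags, unsigned mask) {
    return (flags & mask) == mask;
}

// Flags requesting Linux perf integration with TSC timestamps.
constexpr unsigned perf_with_tsc() {
    return linux_perf | jitdump_use_tsc;
}

} // namespace sketch
```

Under these assumptions, a flags value built by `perf_with_tsc()` satisfies the perf test, whereas the TSC bit on its own does not, so the jitdump directory setting would have no effect in the latter case.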
◆ set_max_cpu_isa()
Sets the maximal ISA the library can dispatch to on the CPU.
See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.
This function has an effect only before the first JIT kernel is generated and returns an error afterwards.
This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with the dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.
- Note
- The ISAs are only partially ordered:
- SSE41 < AVX < AVX2,
- AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
- AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
- AVX2 < AVX2_VNNI.
- See also
- CPU Dispatcher Control for more details
- Parameters
-
- Returns
- dnnl_success/dnnl::status::success on success, and dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time.
- dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).
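Because the ISAs are only partially ordered, two values are not always comparable: for example, AVX512_MIC and AVX512_CORE both sit above AVX2 but neither is above the other. The sketch below encodes the four chains from the note and derives the ordering by following them transitively. It is a model of the documented relation, not the dispatcher's code, and the names `chains`, `strictly_below`, and `comparable` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// The four chains from the note, each listed from lowest to highest.
inline const std::vector<std::vector<std::string>> &chains() {
    static const std::vector<std::vector<std::string>> c = {
        {"SSE41", "AVX", "AVX2"},
        {"AVX2", "AVX512_MIC", "AVX512_MIC_4OPS"},
        {"AVX2", "AVX512_CORE", "AVX512_CORE_VNNI", "AVX512_CORE_BF16",
                "AVX512_CORE_AMX"},
        {"AVX2", "AVX2_VNNI"},
    };
    return c;
}

// True when `lo` is strictly below `hi`, following chain links transitively.
// The relation has no cycles, so the recursion terminates.
inline bool strictly_below(const std::string &lo, const std::string &hi) {
    for (const auto &chain : chains()) {
        for (std::size_t i = 0; i + 1 < chain.size(); ++i) {
            if (chain[i] != lo) continue;
            const std::string &next = chain[i + 1];
            if (next == hi || strictly_below(next, hi)) return true;
        }
    }
    return false;
}

// Two ISAs are comparable when they are equal or one is below the other.
inline bool comparable(const std::string &a, const std::string &b) {
    return a == b || strictly_below(a, b) || strictly_below(b, a);
}

} // namespace sketch
```

The shared AVX2 element is what links the first chain to the other three, so SSE41 ends up below every AVX-512 variant even though no single chain lists both.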
◆ get_effective_cpu_isa()
inline cpu_isa dnnl::get_effective_cpu_isa()