A set of functions that aid in oneDNN debugging and profiling.

Classes
struct	dnnl_version_t
	Structure containing version information as per Semantic Versioning.

Macros
#define	DNNL_RUNTIME_NONE 0u
	No runtime (disabled)

#define	DNNL_RUNTIME_SEQ 1u
	Sequential runtime (CPU only)

#define	DNNL_RUNTIME_OMP 2u
	OpenMP runtime (CPU only)

#define	DNNL_RUNTIME_TBB 4u
	TBB runtime (CPU only)

#define	DNNL_RUNTIME_THREADPOOL 8u
	Threadpool runtime (CPU only)

#define	DNNL_RUNTIME_OCL 256u
	OpenCL runtime.

#define	DNNL_RUNTIME_SYCL 512u
	SYCL runtime.

#define	DNNL_RUNTIME_DPCPP DNNL_RUNTIME_SYCL
	DPC++ runtime.

#define	DNNL_JIT_PROFILE_NONE 0u
	Disable profiling completely.

#define	DNNL_JIT_PROFILE_VTUNE 1u
	Enable VTune Amplifier integration.

#define	DNNL_JIT_PROFILE_LINUX_PERFMAP 2u
	Enable Linux perf integration via perfmap files.

#define	DNNL_JIT_PROFILE_LINUX_JITDUMP 4u
	Enable Linux perf integration via jitdump files.

#define	DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC 8u
	Instruct Linux perf integration via jitdump files to use TSC.

#define	DNNL_JIT_PROFILE_LINUX_PERF (DNNL_JIT_PROFILE_LINUX_JITDUMP | DNNL_JIT_PROFILE_LINUX_PERFMAP)
	Enable Linux perf integration (both jitdump and perfmap)

Typedefs
using	dnnl::version_t = dnnl_version_t
	Structure containing version information as per Semantic Versioning.

Enumerations
enum	dnnl::status
	Status values returned by the library functions.

enum	dnnl::cpu_isa
	CPU instruction set flags.

enum	dnnl_cpu_isa_t
	CPU instruction set flags.

Functions
dnnl_status_t DNNL_API	dnnl_set_verbose (int level)
	Configures verbose output to stdout.

dnnl_status_t DNNL_API	dnnl_set_jit_dump (int enable)
	Configures dumping of JIT-generated code.

const dnnl_version_t DNNL_API *	dnnl_version (void)
	Returns library version information.

dnnl_status_t DNNL_API	dnnl_set_jit_profiling_flags (unsigned flags)
	Sets library profiling flags.

dnnl_status_t DNNL_API	dnnl_set_jit_profiling_jitdumpdir (const char *dir)
	Sets JIT dump output path.

dnnl_status_t DNNL_API	dnnl_set_max_cpu_isa (dnnl_cpu_isa_t isa)
	Sets the maximal ISA the library can dispatch to on the CPU.

dnnl_cpu_isa_t DNNL_API	dnnl_get_effective_cpu_isa (void)
	Gets the maximal ISA the library can dispatch to on the CPU.

status	dnnl::set_verbose (int level)
	Configures verbose output to stdout.

const version_t *	dnnl::version ()
	Returns library version information.

status	dnnl::set_jit_dump (int enable)
	Configures dumping of JIT-generated code.

status	dnnl::set_jit_profiling_flags (unsigned flags)
	Sets library profiling flags.

status	dnnl::set_jit_profiling_jitdumpdir (const std::string &dir)
	Sets JIT dump output path.

status	dnnl::set_max_cpu_isa (cpu_isa isa)
	Sets the maximal ISA the library can dispatch to on the CPU.

cpu_isa	dnnl::get_effective_cpu_isa ()
	Gets the maximal ISA the library can dispatch to on the CPU.

Detailed Description

A set of functions that aid in oneDNN debugging and profiling.

Macro Definition Documentation

◆ DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC

#define DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC 8u

Instruct Linux perf integration via jitdump files to use TSC.

DNNL_JIT_PROFILE_LINUX_JITDUMP must be set too for this to take effect.
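
The dependency between the two jitdump bits can be checked before the flags are handed to the library. The following is only a sketch: the macro values are copied from the list on this page so that it builds without the oneDNN headers, and tsc_bit_is_ineffective is a hypothetical helper name, not a oneDNN function.

```cpp
// Values copied from this page so the sketch builds without the oneDNN
// headers; real code should include the oneDNN header instead.
#ifndef DNNL_JIT_PROFILE_LINUX_JITDUMP
#define DNNL_JIT_PROFILE_LINUX_JITDUMP 4u
#define DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC 8u
#endif

// Hypothetical helper (not part of oneDNN): returns true when the USE_TSC
// bit is set without the JITDUMP bit, in which case USE_TSC has no effect.
inline bool tsc_bit_is_ineffective(unsigned flags) {
    const bool wants_tsc
            = (flags & DNNL_JIT_PROFILE_LINUX_JITDUMP_USE_TSC) != 0;
    const bool has_jitdump = (flags & DNNL_JIT_PROFILE_LINUX_JITDUMP) != 0;
    return wants_tsc && !has_jitdump;
}
```

A caller could run this check on the flags value just before passing it to dnnl_set_jit_profiling_flags() and warn when it returns true.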

Typedef Documentation

◆ version_t

using dnnl::version_t = dnnl_version_t

Structure containing version information as per Semantic Versioning.

Enumeration Type Documentation

◆ status

enum dnnl::status

strong

Status values returned by the library functions.

Enumerator
success	The operation was successful.
out_of_memory	The operation failed due to an out-of-memory condition.
invalid_arguments	The operation failed because of incorrect function arguments.
unimplemented	The operation failed because requested functionality is not implemented.
iterator_ends	Primitive iterator passed over last primitive descriptor.
runtime_error	Primitive or engine failed on execution.
not_required	Queried element is not required for given primitive.

◆ cpu_isa

enum dnnl::cpu_isa

strong

CPU instruction set flags.

Enumerator
all	Any ISA (excepting those listed as initial support)
sse41	Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
avx	Intel Advanced Vector Extensions (Intel AVX)
avx2	Intel Advanced Vector Extensions 2 (Intel AVX2)
avx512_mic	Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
avx512_mic_4ops	Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
avx512_core	Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_vnni	Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_bf16	Intel AVX-512, Intel DL Boost and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
avx512_core_amx	Intel AVX-512, Intel DL Boost and bfloat16 support and Intel AMX with 8-bit integer and bfloat16 support (initial support)
avx2_vnni	Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.

◆ dnnl_cpu_isa_t

enum dnnl_cpu_isa_t

CPU instruction set flags.

Enumerator
dnnl_cpu_isa_all	Any ISA (excepting those listed as initial support)
dnnl_cpu_isa_sse41	Intel Streaming SIMD Extensions 4.1 (Intel SSE4.1)
dnnl_cpu_isa_avx	Intel Advanced Vector Extensions (Intel AVX)
dnnl_cpu_isa_avx2	Intel Advanced Vector Extensions 2 (Intel AVX2)
dnnl_cpu_isa_avx512_mic	Intel Advanced Vector Extensions 512 (Intel AVX-512) subset for Intel Xeon Phi processors x200 Series.
dnnl_cpu_isa_avx512_mic_4ops	Intel AVX-512 subset for Intel Xeon Phi processors 7235, 7285, 7295 Series.
dnnl_cpu_isa_avx512_core	Intel AVX-512 subset for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_vnni	Intel AVX-512 and Intel Deep Learning Boost (Intel DL Boost) support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_bf16	Intel AVX-512, Intel DL Boost and bfloat16 support for Intel Xeon Scalable processor family and Intel Core processor family.
dnnl_cpu_isa_avx512_core_amx	Intel AVX-512, Intel DL Boost and bfloat16 support and Intel AMX with 8-bit integer and bfloat16 support (initial support)
dnnl_cpu_isa_avx2_vnni	Intel AVX2 and Intel Deep Learning Boost (Intel DL Boost) support.

Function Documentation

◆ dnnl_set_verbose()

dnnl_status_t DNNL_API dnnl_set_verbose ( int level )

Configures verbose output to stdout.

Note: Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.

Parameters

level

Verbosity level:

0: no verbose output (default),
1: primitive information at execution,
2: primitive information at creation and execution.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.

◆ dnnl_set_jit_dump()

dnnl_status_t DNNL_API dnnl_set_jit_dump ( int enable )

Configures dumping of JIT-generated code.

Note: This setting overrides the DNNL_JIT_DUMP environment variable.

Parameters

enable Flag value. Set to 0 to disable and set to 1 to enable.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the flag value is invalid, and dnnl_success/dnnl::status::success on success.

◆ dnnl_version()

const dnnl_version_t DNNL_API* dnnl_version ( void )

Returns library version information.

Returns

Pointer to a constant structure containing

major: major version number,
minor: minor version number,
patch: patch release number,
hash: git commit hash.
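
As an illustration of how the four fields might be consumed, the sketch below formats them into a single string. It uses a local stand-in struct (version_info, a hypothetical name) that mirrors only the fields listed above, so it compiles without the oneDNN headers; the real dnnl_version_t may contain further fields. The fallback text for a null hash is an assumption of this sketch.

```cpp
#include <string>

// Stand-in mirroring the four fields documented above. The real
// dnnl_version_t comes from the oneDNN headers.
struct version_info {
    int major;
    int minor;
    int patch;
    const char *hash;
};

// Hypothetical helper: formats the version as "major.minor.patch (hash)".
// A null hash is printed as "unknown" (an assumption of this sketch).
inline std::string format_version(const version_info &v) {
    return std::to_string(v.major) + "." + std::to_string(v.minor) + "."
            + std::to_string(v.patch) + " ("
            + (v.hash ? v.hash : "unknown") + ")";
}
```

With the real library, the same fields would be read through the pointer returned by dnnl_version().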

◆ dnnl_set_jit_profiling_flags()

dnnl_status_t DNNL_API dnnl_set_jit_profiling_flags ( unsigned flags )

Sets library profiling flags.

The flags define which profilers are supported.

Note: This setting overrides the DNNL_JIT_PROFILE environment variable.

See also: Profiling oneDNN Performance

Parameters

flags

Profiling flags that can contain the following bits:

DNNL_JIT_PROFILE_VTUNE – integration with VTune Amplifier (on by default)
DNNL_JIT_PROFILE_LINUX_JITDUMP – produce Linux-specific jit-pid.dump output (off by default). The location of the output is controlled via the JITDUMPDIR environment variable or via the dnnl_set_jit_profiling_jitdumpdir() function.
DNNL_JIT_PROFILE_LINUX_PERFMAP – produce Linux-specific perf-pid.map output (off by default). The output is always placed into /tmp.

Passing DNNL_JIT_PROFILE_NONE disables profiling completely.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the flags value is invalid, and dnnl_success/dnnl::status::success on success.
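
Because the flags are independent bits, a value can be decoded by testing each bit in turn. The sketch below does this for the three profiler bits listed above. The macro values are copied from this page so that the code builds without the oneDNN headers, and enabled_profilers together with the short names it returns are hypothetical, chosen only for illustration.

```cpp
#include <string>
#include <vector>

// Values copied from this page; real code should take them from the
// oneDNN headers rather than redefining them.
#ifndef DNNL_JIT_PROFILE_NONE
#define DNNL_JIT_PROFILE_NONE 0u
#define DNNL_JIT_PROFILE_VTUNE 1u
#define DNNL_JIT_PROFILE_LINUX_PERFMAP 2u
#define DNNL_JIT_PROFILE_LINUX_JITDUMP 4u
#endif

// Hypothetical helper: lists which profiler integrations a flags value
// requests, in ascending bit order.
inline std::vector<std::string> enabled_profilers(unsigned flags) {
    std::vector<std::string> names;
    if (flags & DNNL_JIT_PROFILE_VTUNE) names.push_back("vtune");
    if (flags & DNNL_JIT_PROFILE_LINUX_PERFMAP) names.push_back("perfmap");
    if (flags & DNNL_JIT_PROFILE_LINUX_JITDUMP) names.push_back("jitdump");
    return names;
}
```

Combining bits with | before the call, for example DNNL_JIT_PROFILE_LINUX_JITDUMP | DNNL_JIT_PROFILE_LINUX_PERFMAP, requests both Linux perf outputs at once.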

◆ dnnl_set_jit_profiling_jitdumpdir()

dnnl_status_t DNNL_API dnnl_set_jit_profiling_jitdumpdir ( const char * dir )

Sets JIT dump output path.

Applicable only on Linux, and used only when the profiling flags have DNNL_JIT_PROFILE_LINUX_PERF set.

After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.

See also: Profiling oneDNN Performance

Note: This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. Passing NULL reverts the value to the default. The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.

Parameters

dir	JIT dump output path.

Returns: dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise; dnnl_unimplemented/dnnl::status::unimplemented on Windows.
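
The lookup order described in the note (value set through this function, then JITDUMPDIR, then HOME) and the resulting mkdtemp template can be sketched as follows. This is not the library's implementation: jitdump_template is a hypothetical helper, null pointers stand for "not set", and the empty base used when nothing is set is an assumption made only so that the function always has a defined result.

```cpp
#include <string>

// Hypothetical helper, not a oneDNN function. Picks the base directory in
// the documented order (API value, then JITDUMPDIR, then HOME) and appends
// the documented template suffix. Null pointers mean "not set".
inline std::string jitdump_template(const char *api_dir,
        const char *env_jitdumpdir, const char *env_home) {
    const char *base = api_dir  ? api_dir
            : env_jitdumpdir    ? env_jitdumpdir
            : env_home          ? env_home
                                : ""; // assumption: nothing set at all
    return std::string(base) + "/.debug/jit/dnnl.XXXXXX";
}
```

In a real program the last two arguments would typically come from std::getenv("JITDUMPDIR") and std::getenv("HOME").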

◆ dnnl_set_max_cpu_isa()

dnnl_status_t DNNL_API dnnl_set_max_cpu_isa ( dnnl_cpu_isa_t isa )

Sets the maximal ISA the library can dispatch to on the CPU.

See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.

This function takes effect only before the first JIT kernel is generated and returns an error afterwards.

This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.
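
The naming rule for the environment variable can be written as a small string transformation. The sketch below is illustrative only: max_cpu_isa_env_value is a hypothetical helper that takes a C enumerator name as text, strips the dnnl_cpu_isa_ prefix if present, and upper-cases the remainder.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derives the DNNL_MAX_CPU_ISA value from the text of
// a C enumerator name, following the rule stated above.
inline std::string max_cpu_isa_env_value(const std::string &enumerator) {
    const std::string prefix = "dnnl_cpu_isa_";
    std::string name = enumerator;
    // Drop the prefix only when the name actually starts with it.
    if (name.compare(0, prefix.size(), prefix) == 0)
        name.erase(0, prefix.size());
    for (char &c : name)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return name;
}
```

For example, passing "dnnl_cpu_isa_avx2" yields "AVX2", which matches the DNNL_MAX_CPU_ISA=AVX2 setting shown above.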

Note

The ISAs are only partially ordered:

SSE41 < AVX < AVX2,
AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
AVX2 < AVX2_VNNI.

See also: CPU Dispatcher Control for more details

Parameters

isa	Maximal ISA the library should dispatch to. Pass dnnl_cpu_isa_all/dnnl::cpu_isa::all to remove ISA restrictions (except for ISAs with initial support in the library).

Returns: dnnl_success/dnnl::status::success on success; dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time; dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).
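
The partial order in the note can be modelled by giving each ISA at most one direct predecessor and walking that chain. The sketch below encodes only the four chains listed in the note; it is not the library's dispatch logic. The isa_node enum and the helpers parent_of, isa_leq and comparable are hypothetical names, kept separate from dnnl::cpu_isa so the code builds without the oneDNN headers.

```cpp
// Local stand-in for the ISA names used in the note; not dnnl::cpu_isa.
enum class isa_node {
    none, sse41, avx, avx2, avx512_mic, avx512_mic_4ops, avx512_core,
    avx512_core_vnni, avx512_core_bf16, avx512_core_amx, avx2_vnni
};

// Direct predecessor of each ISA according to the chains in the note.
inline isa_node parent_of(isa_node x) {
    switch (x) {
        case isa_node::avx: return isa_node::sse41;
        case isa_node::avx2: return isa_node::avx;
        case isa_node::avx512_mic: return isa_node::avx2;
        case isa_node::avx512_mic_4ops: return isa_node::avx512_mic;
        case isa_node::avx512_core: return isa_node::avx2;
        case isa_node::avx512_core_vnni: return isa_node::avx512_core;
        case isa_node::avx512_core_bf16: return isa_node::avx512_core_vnni;
        case isa_node::avx512_core_amx: return isa_node::avx512_core_bf16;
        case isa_node::avx2_vnni: return isa_node::avx2;
        default: return isa_node::none; // sse41 has no predecessor
    }
}

// True when a <= b in the partial order (a equals b or precedes it).
inline bool isa_leq(isa_node a, isa_node b) {
    for (isa_node cur = b; cur != isa_node::none; cur = parent_of(cur))
        if (cur == a) return true;
    return false;
}

// Two ISAs are comparable only if one lies on the other's chain.
inline bool comparable(isa_node a, isa_node b) {
    return isa_leq(a, b) || isa_leq(b, a);
}
```

Under this model AVX512_MIC and AVX512_CORE are not comparable, and neither are AVX2_VNNI and AVX512_CORE, which is what "only partially ordered" means here.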

◆ dnnl_get_effective_cpu_isa()

dnnl_cpu_isa_t DNNL_API dnnl_get_effective_cpu_isa ( void )

Gets the maximal ISA the library can dispatch to on the CPU.

See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values returned by the C and C++ API functions respectively.

See also: CPU Dispatcher Control for more details

Returns: dnnl_cpu_isa_t value reflecting the maximal ISA the library may dispatch to.

◆ set_verbose()

status dnnl::set_verbose ( int level )

inline

Configures verbose output to stdout.

Note: Enabling verbose output affects performance. This setting overrides the DNNL_VERBOSE environment variable.

Parameters

level

Verbosity level:

0: no verbose output (default),
1: primitive information at execution,
2: primitive information at creation and execution.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the level value is invalid, and dnnl_success/dnnl::status::success on success.

◆ version()

const version_t* dnnl::version ( )

inline

Returns library version information.

Returns

Pointer to a constant structure containing

major: major version number,
minor: minor version number,
patch: patch release number,
hash: git commit hash.

◆ set_jit_dump()

status dnnl::set_jit_dump ( int enable )

inline

Configures dumping of JIT-generated code.

Note: This setting overrides the DNNL_JIT_DUMP environment variable.

Parameters

enable Flag value. Set to 0 to disable and set to 1 to enable.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the flag value is invalid, and dnnl_success/dnnl::status::success on success.

◆ set_jit_profiling_flags()

status dnnl::set_jit_profiling_flags ( unsigned flags )

inline

Sets library profiling flags.

The flags define which profilers are supported.

Note: This setting overrides the DNNL_JIT_PROFILE environment variable.

See also: Profiling oneDNN Performance

Parameters

flags

Profiling flags that can contain the following bits:

DNNL_JIT_PROFILE_VTUNE – integration with VTune Amplifier (on by default)
DNNL_JIT_PROFILE_LINUX_JITDUMP – produce Linux-specific jit-pid.dump output (off by default). The location of the output is controlled via the JITDUMPDIR environment variable or via the dnnl_set_jit_profiling_jitdumpdir() function.
DNNL_JIT_PROFILE_LINUX_PERFMAP – produce Linux-specific perf-pid.map output (off by default). The output is always placed into /tmp.

Passing DNNL_JIT_PROFILE_NONE disables profiling completely.

Returns: dnnl_invalid_arguments/dnnl::status::invalid_arguments if the flags value is invalid, and dnnl_success/dnnl::status::success on success.

◆ set_jit_profiling_jitdumpdir()

status dnnl::set_jit_profiling_jitdumpdir ( const std::string & dir )

inline

Sets JIT dump output path.

Applicable only on Linux, and used only when the profiling flags have DNNL_JIT_PROFILE_LINUX_PERF set.

After the first JIT kernel is generated, the jitdump output is placed into a temporary directory created using the mkdtemp template 'dir/.debug/jit/dnnl.XXXXXX'.

See also: Profiling oneDNN Performance

Note: This setting overrides the JITDUMPDIR environment variable. If JITDUMPDIR is not set and this function is never called, the path defaults to HOME. Passing NULL reverts the value to the default. The directory is accessed only when the first JIT kernel is being created. JIT profiling is disabled if any error occurs while accessing or creating this directory.

Parameters

dir	JIT dump output path.

Returns: dnnl_success/dnnl::status::success if the output directory was set correctly and an error status otherwise; dnnl_unimplemented/dnnl::status::unimplemented on Windows.

◆ set_max_cpu_isa()

status dnnl::set_max_cpu_isa ( cpu_isa isa )

inline

Sets the maximal ISA the library can dispatch to on the CPU.

See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values accepted by the C and C++ API functions respectively.

This function takes effect only before the first JIT kernel is generated and returns an error afterwards.

This function overrides the DNNL_MAX_CPU_ISA environment variable. The environment variable can be set to the desired maximal ISA name in upper case and with dnnl_cpu_isa prefix removed. For example: DNNL_MAX_CPU_ISA=AVX2.

Note

The ISAs are only partially ordered:

SSE41 < AVX < AVX2,
AVX2 < AVX512_MIC < AVX512_MIC_4OPS,
AVX2 < AVX512_CORE < AVX512_CORE_VNNI < AVX512_CORE_BF16 < AVX512_CORE_AMX,
AVX2 < AVX2_VNNI.

See also: CPU Dispatcher Control for more details

Parameters

isa	Maximal ISA the library should dispatch to. Pass dnnl_cpu_isa_all/dnnl::cpu_isa::all to remove ISA restrictions (except for ISAs with initial support in the library).

Returns: dnnl_success/dnnl::status::success on success; dnnl_invalid_arguments/dnnl::status::invalid_arguments if the isa parameter is invalid or the ISA cannot be changed at this time; dnnl_unimplemented/dnnl::status::unimplemented if the feature was disabled at build time (see Build Options for more details).

◆ get_effective_cpu_isa()

cpu_isa dnnl::get_effective_cpu_isa ( )

inline

Gets the maximal ISA the library can dispatch to on the CPU.

See dnnl_cpu_isa_t and dnnl::cpu_isa for the list of the values returned by the C and C++ API functions respectively.

See also: CPU Dispatcher Control for more details

Returns: dnnl::cpu_isa value reflecting the maximal ISA the library may dispatch to.
