Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN)  1.0.4
Performance library for Deep Learning
Basic Concepts

Primitives

Intel(R) MKL-DNN is built around the notion of a primitive (mkldnn::primitive), a functor object that encapsulates a particular computation such as forward convolution, backward LSTM computations, or a data movement operation that changes the way data is laid out in memory. A single primitive can sometimes represent more complex, fused, computations such as a forward convolution followed by a ReLU.

The most basic difference between a primitive and a function is that a primitive can store state. For example, a convolution primitive stores operation parameters like tensor shapes and can pre-compute other secondary parameters like cache blocking. This also allows pre-generating code specifically tailored for the operation to be performed. The time it takes to perform the pre-computations can be amortized by reusing the same primitive to perform computations multiple times.

In addition to encapsulating the state, a primitive can also use a scratchpad, a temporary memory buffer, that it uses during computations. The scratchpad can be either tied to a particular primitive object (which makes that object non-thread safe), or provided as one of the parameters during execution.

Engines

Engine (mkldnn::engine) is an abstraction of a particular computational device. Currently, the only supported engine is CPU. Most primitives are created to perform computations on a particular engine. The only exception is reorder primitive, which can be used to transfer data between different engines.

Streams

Streams (mkldnn::stream) encapsulate execution context tied to a particular engine.

Memory Objects

Memory objects (mkldnn::memory) encapsulate engine-specific memory handles, tensor dimensions, data type, and memory format – the way tensor data is laid out. (Side note: Formally, primitives should also be referred to as primitive objects, but because the word 'primitive' is less overloaded than 'memory', we can omit the 'object' part without causing confusion.)

Memory objects are passed to primitives during execution via a special map that defines which tensor (source, destination or weights, or their gradients) each memory object represents.

Levels of Abstraction

Intel MKL-DNN has multiple levels abstractions for primitives and memory objects in order to expose maximum flexibility to its users.

On the logical level, the library provides the following abstractions:

Abstraction level Memory object Primitive objects
Logical description Memory descriptor Operation descriptor
Intermediate description N/A Primitive descriptor
Implementation Memory object Primitive

Creating Memory Objects and Primitives

Memory Objects

Memory objects are created from the memory descriptors. It is not possible to create a memory object from a memory descriptor that has memory format set to mkldnn::memory::format_tag::any.

There are two common ways for initializing memory descriptors:

Memory objects can be created with a user-provided handle (a void * on CPU), or without one, in which case the library will allocate storage space on its own.

Primitives

The sequence of actions to create a primitive is:

  1. Create an operation descriptor via, for example, mkldnn::convolution_forward::desc. The operation descriptor can contain memory descriptors with placeholder format_tag::any memory formats if the primitive supports it.
  2. Create a primitive descriptor based on the operation descriptor and an engine handle.
  3. Create a primitive based on the primitive descriptor obtained in step 2.