DNNL is built around the notion of a primitive (dnnl::primitive), a functor object that encapsulates a particular computation such as forward convolution, backward LSTM computations, or a data movement operation that changes the way data is laid out in memory. A single primitive can sometimes represent a more complex fused computation, such as a forward convolution followed by a ReLU.
The most basic difference between a primitive and a function is that a primitive can store state. For example, a convolution primitive stores operation parameters like tensor shapes and can pre-compute other secondary parameters like cache blocking. This also allows pre-generating code specifically tailored for the operation to be performed. The time it takes to perform the pre-computations can be amortized by reusing the same primitive to perform computations multiple times.
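The amortization argument can be illustrated with a small standalone sketch. The class below is a hypothetical stand-in and not dnnl::primitive: the name conv1d_sketch and all of its members are invented for illustration, and the only "pre-computation" it performs is validating the kernel and deriving the output length once, at construction.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch, NOT the DNNL API: a functor that fixes its
// operation parameters at construction and reuses them on every call.
class conv1d_sketch {
public:
    conv1d_sketch(std::size_t src_len, std::vector<float> kernel)
        : src_len_(src_len), kernel_(std::move(kernel)) {
        // One-time setup: validate parameters and derive the output length.
        assert(!kernel_.empty() && kernel_.size() <= src_len_);
        dst_len_ = src_len_ - kernel_.size() + 1;
    }

    std::size_t dst_len() const { return dst_len_; }

    // Each call reuses the stored state; nothing is re-derived here.
    std::vector<float> operator()(const std::vector<float> &src) const {
        assert(src.size() == src_len_);
        std::vector<float> dst(dst_len_, 0.f);
        for (std::size_t i = 0; i < dst_len_; ++i)
            for (std::size_t k = 0; k < kernel_.size(); ++k)
                dst[i] += src[i + k] * kernel_[k];
        return dst;
    }

private:
    std::size_t src_len_;
    std::vector<float> kernel_;
    std::size_t dst_len_ = 0;
};
```

Constructing conv1d_sketch once and calling it on many inputs of the same length pays the setup cost a single time, which is the reuse pattern described above.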
In addition to encapsulating the state, a primitive can also use a scratchpad: a temporary memory buffer that it uses during computations. The scratchpad can either be tied to a particular primitive object (which makes that object not thread-safe) or be provided as one of the parameters during execution.
An engine (dnnl::engine) is an abstraction of a particular computational device. Currently, the only supported engine is CPU. Most primitives are created to perform computations on a particular engine. The only exception is the reorder primitive, which can be used to transfer data between different engines.
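The thread-safety difference between the two scratchpad options can be sketched without the library. Everything below is hypothetical (the class sum_squares_sketch and its methods are not DNNL names); it only contrasts a buffer owned by the object with a buffer supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, NOT the DNNL API: computes the sum of squares of
// its input and needs a temporary buffer as large as the input.
class sum_squares_sketch {
public:
    explicit sum_squares_sketch(std::size_t n) : n_(n), own_scratchpad_(n) {}

    std::size_t scratchpad_size() const { return n_; }

    // Option 1: scratchpad tied to the object. Every call writes to the
    // same member buffer, so concurrent calls on one object would race.
    float execute(const std::vector<float> &src) {
        return run(src, own_scratchpad_.data());
    }

    // Option 2: scratchpad passed at execution time. The object itself is
    // not modified, so threads can share it if each brings its own buffer.
    float execute(const std::vector<float> &src, float *scratchpad) const {
        return run(src, scratchpad);
    }

private:
    float run(const std::vector<float> &src, float *tmp) const {
        assert(src.size() == n_);
        for (std::size_t i = 0; i < n_; ++i) tmp[i] = src[i] * src[i];
        float sum = 0.f;
        for (std::size_t i = 0; i < n_; ++i) sum += tmp[i];
        return sum;
    }

    std::size_t n_;
    std::vector<float> own_scratchpad_;
};
```

In this sketch the caller of the second overload is responsible for allocating at least scratchpad_size() elements, which is an assumption of the example and not a statement about how DNNL reports scratchpad requirements.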
Streams (dnnl::stream) encapsulate execution context tied to a particular engine.
Memory objects (dnnl::memory) encapsulate engine-specific memory handles, tensor dimensions, data type, and memory format – the way tensor data is laid out. (Side note: Formally, primitives should also be referred to as primitive objects, but because the word 'primitive' is less overloaded than 'memory', we can omit the 'object' part without causing confusion.)
Memory objects are passed to primitives during execution via a special map that defines which tensor (source, destination, weights, or their gradients) each memory object represents.
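The idea of an argument map can be shown with a minimal standalone sketch. The constants arg_src, arg_weights, and arg_dst, the tensor alias, and the function execute_scale are all hypothetical; the real library defines its own argument identifiers and memory type.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical argument tags, invented for this sketch.
constexpr int arg_src = 1;
constexpr int arg_weights = 2;
constexpr int arg_dst = 3;

using tensor = std::vector<float>;
// Maps "which role does this tensor play" to the tensor itself.
using arg_map = std::unordered_map<int, tensor *>;

// Element-wise scaling: dst[i] = src[i] * weights[i].
// The function learns the role of each tensor only from the map keys.
inline void execute_scale(const arg_map &args) {
    const tensor &src = *args.at(arg_src);
    const tensor &weights = *args.at(arg_weights);
    tensor &dst = *args.at(arg_dst);
    assert(src.size() == weights.size());
    dst.resize(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = src[i] * weights[i];
}
```

Because the roles are carried by keys and not by parameter position, the same execution entry point can serve operations with different sets of tensors.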
DNNL has multiple levels of abstractions for primitives and memory objects in order to expose maximum flexibility to its users.
On the logical level, the library provides the following abstractions:
Abstraction level | Memory objects | Primitive objects |
---|---|---|
Logical description | Memory descriptor | Operation descriptor |
Intermediate description | N/A | Primitive descriptor |
Implementation | Memory object | Primitive |
Memory objects are created from the memory descriptors. It is not possible to create a memory object from a memory descriptor that has memory format set to dnnl::memory::format_tag::any.
There are two common ways to initialize memory descriptors:
Memory objects can be created with a user-provided handle (a void * on CPU), or without one, in which case the library will allocate storage space on its own.
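The ownership difference between the two creation paths can be sketched as follows. This is a hypothetical illustration and not dnnl::memory: the class buffer_sketch, its constructors, and owns_storage() are invented names, and the element type is fixed to float for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical sketch, NOT the DNNL API: a buffer that either allocates
// its own storage or wraps a handle supplied by the user.
class buffer_sketch {
public:
    // No handle given: the object allocates zero-initialized storage
    // and releases it when destroyed.
    explicit buffer_sketch(std::size_t n)
        : owned_(new float[n]()), handle_(owned_.get()), size_(n) {}

    // User-provided handle: the caller keeps ownership and must keep the
    // storage alive for as long as this object is used.
    buffer_sketch(std::size_t n, void *handle)
        : handle_(static_cast<float *>(handle)), size_(n) {}

    float *data() const { return handle_; }
    std::size_t size() const { return size_; }
    bool owns_storage() const { return owned_ != nullptr; }

private:
    std::unique_ptr<float[]> owned_; // empty when the handle is user-provided
    float *handle_;
    std::size_t size_;
};
```

Writes through data() on a wrapped handle land directly in the user's array, so no copy is needed to hand existing data to the object.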
The sequence of actions to create a primitive is: