Pooling

API Reference

General

The pooling primitive performs forward or backward max or average pooling operations on 1D, 2D, or 3D spatial data.

Forward

The pooling operation is defined by the following formulas. We show the formulas only for the 2D spatial case; they generalize straightforwardly to lower and higher dimensions. Variable names follow the standard Naming Conventions.

Max pooling:

\[\dst(n, c, oh, ow) = \max\limits_{kh, kw} \left( \src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L) \right)\]

Average pooling:

\[\dst(n, c, oh, ow) = \frac{1}{DENOM} \sum\limits_{kh, kw} \src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L)\]

Output spatial dimensions are computed in the same way as for the Convolution primitive.
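For reference, under oneDNN's dilation convention (where \(DH = 0\) denotes no dilation), the output height would be computed as

\[OH = \left\lfloor \frac{IH - \left((KH - 1) \cdot (DH + 1) + 1\right) + PH_L + PH_R}{SH} \right\rfloor + 1\]

with \(OW\) (and \(OD\) in the 3D case) defined analogously.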

Average pooling supports two algorithms that differ in the denominator \(DENOM\):

  • dnnl_pooling_avg_include_padding, where \(DENOM = KH \cdot KW\);

  • dnnl_pooling_avg_exclude_padding, where \(DENOM\) equals the size of the overlap between the averaging window and the image.

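As an illustration of the formulas above, here is a hypothetical pure-Python reference sketch (not oneDNN's implementation). It follows the document's variable names, assumes symmetric padding (\(PH_R = PH_L\), \(PW_R = PW_L\)) for simplicity, and supports both averaging variants:

```python
def pool2d(src, KH, KW, SH, SW, PH_L, PW_L, DH=0, DW=0,
           mode="max", exclude_padding=False):
    """Reference 2D pooling. src: nested list [N][C][IH][IW].

    Returns dst as a nested list [N][C][OH][OW]. DH/DW follow the
    oneDNN convention where 0 means no dilation.
    """
    N, C = len(src), len(src[0])
    IH, IW = len(src[0][0]), len(src[0][0][0])
    # Output dims via the convolution-style formula, assuming PH_R = PH_L.
    OH = (IH + 2 * PH_L - ((KH - 1) * (DH + 1) + 1)) // SH + 1
    OW = (IW + 2 * PW_L - ((KW - 1) * (DW + 1) + 1)) // SW + 1
    dst = [[[[0.0] * OW for _ in range(OH)] for _ in range(C)]
           for _ in range(N)]
    for n in range(N):
        for c in range(C):
            for oh in range(OH):
                for ow in range(OW):
                    # Collect window elements that fall inside the image.
                    vals = []
                    for kh in range(KH):
                        for kw in range(KW):
                            ih = oh * SH + kh * (DH + 1) - PH_L
                            iw = ow * SW + kw * (DW + 1) - PW_L
                            if 0 <= ih < IH and 0 <= iw < IW:
                                vals.append(src[n][c][ih][iw])
                    if mode == "max":
                        dst[n][c][oh][ow] = max(vals)
                    else:
                        # include_padding: DENOM = KH*KW;
                        # exclude_padding: DENOM = window/image overlap.
                        denom = len(vals) if exclude_padding else KH * KW
                        dst[n][c][oh][ow] = sum(vals) / denom
    return dst
```

For a 2x2 input with a 2x2 kernel, unit stride, and one-pixel padding, the corner output averages a single valid element, so the two averaging algorithms divide by 1 and by 4, respectively.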

Difference Between Forward Training and Forward Inference

Max pooling requires a workspace for the dnnl_forward_training propagation kind and does not require it for dnnl_forward_inference (see the General Notes below).

Backward

The backward propagation computes \(\diffsrc(n, c, h, w)\) based on \(\diffdst(n, c, h, w)\) and, in the case of max pooling, the workspace.
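The max pooling case can be sketched as follows: each \(\diffdst\) value is routed to the source location where the forward maximum was found, which is the role the opaque workspace plays in oneDNN. This hypothetical pure-Python sketch recomputes the argmax instead of reading a workspace, and omits padding and dilation for brevity:

```python
def max_pool2d_backward(src, diff_dst, KH, KW, SH, SW):
    """Reference backward max pooling (no padding/dilation).

    src: [N][C][IH][IW] forward input; diff_dst: [N][C][OH][OW].
    Returns diff_src with the same shape as src.
    """
    N, C = len(src), len(src[0])
    IH, IW = len(src[0][0]), len(src[0][0][0])
    OH, OW = len(diff_dst[0][0]), len(diff_dst[0][0][0])
    diff_src = [[[[0.0] * IW for _ in range(IH)] for _ in range(C)]
                for _ in range(N)]
    for n in range(N):
        for c in range(C):
            for oh in range(OH):
                for ow in range(OW):
                    # Recompute the argmax of the window; oneDNN instead
                    # reads it from the workspace saved on the forward pass.
                    best = None
                    for kh in range(KH):
                        for kw in range(KW):
                            ih, iw = oh * SH + kh, ow * SW + kw
                            if (best is None or
                                    src[n][c][ih][iw] > src[n][c][best[0]][best[1]]):
                                best = (ih, iw)
                    # Scatter the gradient to the argmax location.
                    diff_src[n][c][best[0]][best[1]] += diff_dst[n][c][oh][ow]
    return diff_src
```

Scattering each gradient to a single saved index is also what makes backward max pooling usable for up-sampling, as mentioned in the General Notes below.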

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.

| Primitive input/output | Execution argument index |
|---|---|
| \(\src\) | DNNL_ARG_SRC |
| \(\dst\) | DNNL_ARG_DST |
| workspace | DNNL_ARG_WORKSPACE |
| \(\diffsrc\) | DNNL_ARG_DIFF_SRC |
| \(\diffdst\) | DNNL_ARG_DIFF_DST |
| \(\text{binary post-op}\) | DNNL_ARG_ATTR_MULTIPLE_POST_OP(binary_post_op_position) \| DNNL_ARG_SRC_1 |

Implementation Details

General Notes

  1. During training, max pooling requires a workspace on forward (dnnl_forward_training) and backward passes to save indices where a maximum was found. The workspace format is opaque, and the indices cannot be restored from it. However, one can use backward pooling to perform up-sampling (used in some detection topologies). The workspace can be created via workspace_desc() from the pooling primitive descriptor.

  2. A user can use the memory format tag dnnl_format_tag_any for the dst memory descriptor when creating pooling forward propagation. The library will derive the appropriate format from the src memory descriptor, which must therefore be fully defined. Similarly, a user can use dnnl_format_tag_any for the diff_src memory descriptor when creating pooling backward propagation.

Data Type Support

The pooling primitive supports the following combinations of data types:

| Propagation | Source | Destination | Accumulation data type (used for average pooling only) |
|---|---|---|---|
| forward / backward | f32 | f32 | f32 |
| forward / backward | bf16 | bf16 | bf16 |
| forward | f16 | f16 | f16 |
| forward | s8 | s8 | s32 |
| forward | u8 | u8 | s32 |
| forward | s32 | s32 | s32 |
| forward inference | s8 | u8 | s32 |
| forward inference | u8 | s8 | s32 |
| forward inference | s8 | f16 | f16 |
| forward inference | u8 | f16 | f16 |
| forward inference | f16 | s8 | f16 |
| forward inference | f16 | u8 | f16 |
| forward inference | s8 | f32 | f32 |
| forward inference | u8 | f32 | f32 |
| forward inference | f32 | s8 | f32 |
| forward inference | f32 | u8 | f32 |

Warning

There might be hardware- and/or implementation-specific restrictions. Check the Implementation Limitations section below.

Data Representation

Source, Destination, and Their Gradients

Like other CNN primitives, the pooling primitive expects data to be an \(N \times C \times W\) tensor for the 1D spatial case, an \(N \times C \times H \times W\) tensor for the 2D spatial case, and an \(N \times C \times D \times H \times W\) tensor for the 3D spatial case.

The pooling primitive is optimized for the following memory formats:

| Spatial | Logical tensor | Data type | Implementations optimized for memory formats |
|---|---|---|---|
| 1D | NCW | f32 | dnnl_ncw (dnnl_abc), dnnl_nwc (dnnl_acb), *optimized* |
| 1D | NCW | s32, s8, u8 | dnnl_nwc (dnnl_acb), *optimized* |
| 2D | NCHW | f32 | dnnl_nchw (dnnl_abcd), dnnl_nhwc (dnnl_acdb), *optimized* |
| 2D | NCHW | s32, s8, u8 | dnnl_nhwc (dnnl_acdb), *optimized* |
| 3D | NCDHW | f32 | dnnl_ncdhw (dnnl_abcde), dnnl_ndhwc (dnnl_acdeb), *optimized* |
| 3D | NCDHW | s32, s8, u8 | dnnl_ndhwc (dnnl_acdeb), *optimized* |

Here *optimized* means the format that comes out of any preceding compute-intensive primitive.

Post-ops and Attributes

| Propagation | Type | Operation | Description | Restrictions |
|---|---|---|---|---|
| Forward | Post-op | Binary | Applies a Binary operation to the result | General binary post-op restrictions |

Implementation Limitations

  1. Refer to Data Types for limitations related to data type support.

  2. CPU

    • Different data types of source and destination in forward inference are not supported.

Performance Tips

N/A

Examples

pooling_example_cpp - CPU/GPU

This C++ API example demonstrates how to create and execute a Pooling primitive in forward training propagation mode.