LogSoftmax

The logsoftmax primitive applies softmax along a particular axis of data with arbitrary dimensions and then takes the natural logarithm of the result. All other axes are treated as independent (batch).

In general form, the operation is defined by the following formulas. Variable names follow the standard Conventions.

Forward

The two forms below are mathematically equivalent; the second one is used because it is more numerically stable:

\[\begin{split}\begin{array}{rcl} \dst(\overline{ou}, c, \overline{in}) & = & \ln\left({\frac { e^{\src(\overline{ou}, c, \overline{in}) - \nu(\overline{ou}, \overline{in})} } { \sum\limits_{ic} e^{\src(\overline{ou}, ic, \overline{in}) - \nu(\overline{ou}, \overline{in})} }}\right) \\ & = & \left(\src(\overline{ou}, c, \overline{in}) - \nu(\overline{ou}, \overline{in})\right) - \ln\left( \sum\limits_{ic} e^{\src(\overline{ou}, ic, \overline{in}) - \nu(\overline{ou}, \overline{in})} \right), \end{array}\end{split}\]

where

  • \(c\) is the axis over which the logsoftmax is computed,

  • \(\overline{ou}\) is the outermost index (to the left of the logsoftmax axis),

  • \(\overline{in}\) is the innermost index (to the right of the logsoftmax axis), and

  • \(\nu\) is used to produce more accurate results and is defined as:

\[\nu(\overline{ou}, \overline{in}) = \max\limits_{ic} \src(\overline{ou}, ic, \overline{in})\]
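The second form of the formula can be sketched as a plain reference loop. This is an illustration only, not the library's implementation: the function name `logsoftmax_ref` is hypothetical, and the sketch assumes a dense row-major tensor in which all dimensions to the left of the axis collapse into `outer` and all dimensions to the right collapse into `inner`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of the library API).
// Computes log-softmax over `axis` of a dense row-major tensor.
std::vector<float> logsoftmax_ref(const std::vector<float> &src,
        const std::vector<std::size_t> &dims, std::size_t axis) {
    assert(axis < dims.size());
    std::size_t outer = 1, inner = 1;
    for (std::size_t d = 0; d < axis; ++d) outer *= dims[d];
    for (std::size_t d = axis + 1; d < dims.size(); ++d) inner *= dims[d];
    const std::size_t channels = dims[axis];
    assert(channels > 0 && src.size() == outer * channels * inner);

    std::vector<float> dst(src.size());
    for (std::size_t ou = 0; ou < outer; ++ou) {
        for (std::size_t in = 0; in < inner; ++in) {
            // Flat offset of element (ou, c, in).
            auto off = [&](std::size_t c) {
                return (ou * channels + c) * inner + in;
            };
            // nu(ou, in): maximum of src along the axis.
            float nu = src[off(0)];
            for (std::size_t c = 1; c < channels; ++c)
                nu = std::max(nu, src[off(c)]);
            // Sum of shifted exponentials; every term is at most 1.
            float sum = 0.0f;
            for (std::size_t c = 0; c < channels; ++c)
                sum += std::exp(src[off(c)] - nu);
            const float log_sum = std::log(sum);
            for (std::size_t c = 0; c < channels; ++c)
                dst[off(c)] = (src[off(c)] - nu) - log_sum;
        }
    }
    return dst;
}
```

Because \(\nu\) is subtracted before exponentiation, inputs as large as 1000 do not overflow in this sketch, whereas a direct evaluation of \(e^{1000}\) in f32 would.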

Difference Between Forward Training and Forward Inference

There is no difference between the forward_training and forward_inference propagation kinds.

Backward

The backward propagation computes \(\diffsrc(\overline{ou}, c, \overline{in})\) based on \(\diffdst(\overline{ou}, c, \overline{in})\) and \(\dst(\overline{ou}, c, \overline{in})\).
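The text above does not spell out the backward formula. Differentiating the forward definition gives the standard log-softmax gradient, \(\diffsrc = \diffdst - e^{\dst} \cdot \sum_{ic} \diffdst\), with the sum taken along the logsoftmax axis. The sketch below implements that expression; it is an assumption about the math, not a description of the library's implementation, and the name `logsoftmax_bwd_ref` and the `(outer, channels, inner)` view of the tensor are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of the library API).
// diff_src = diff_dst - exp(dst) * sum_over_axis(diff_dst),
// for a dense row-major tensor viewed as (outer, channels, inner).
std::vector<float> logsoftmax_bwd_ref(const std::vector<float> &diff_dst,
        const std::vector<float> &dst, std::size_t outer,
        std::size_t channels, std::size_t inner) {
    assert(diff_dst.size() == outer * channels * inner);
    assert(dst.size() == diff_dst.size());

    std::vector<float> diff_src(diff_dst.size());
    for (std::size_t ou = 0; ou < outer; ++ou) {
        for (std::size_t in = 0; in < inner; ++in) {
            const std::size_t base = ou * channels * inner + in;
            // Sum of the incoming gradient along the axis.
            float sum = 0.0f;
            for (std::size_t c = 0; c < channels; ++c)
                sum += diff_dst[base + c * inner];
            // exp(dst) recovers the softmax probabilities.
            for (std::size_t c = 0; c < channels; ++c) {
                const std::size_t idx = base + c * inner;
                diff_src[idx] = diff_dst[idx] - std::exp(dst[idx]) * sum;
            }
        }
    }
    return diff_src;
}
```

Only \(\dst\) and \(\diffdst\) are needed, which matches the inputs listed for backward propagation; \(\src\) itself is not required.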

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.

Primitive input/output    Execution argument index

\(\src\)                  DNNL_ARG_SRC

\(\dst\)                  DNNL_ARG_DST

\(\diffsrc\)              DNNL_ARG_DIFF_SRC

\(\diffdst\)              DNNL_ARG_DIFF_DST

Operation Details

Both forward and backward propagation support in-place operations, meaning that src can be used as input and output for forward propagation, and diff_dst can be used as input and output for backward propagation. In case of in-place operation, the original data will be overwritten.
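To illustrate why in-place forward execution can be safe, here is a minimal sketch in which the destination pointer may alias the source pointer. It is not the library's code: the function name `logsoftmax_fwd` and the `(outer, channels, inner)` view are hypothetical. The point is the ordering: both reductions over a slice finish before any element of that slice is written, and each write depends only on the element it overwrites.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical reference helper (not part of the library API).
// Forward log-softmax over the middle axis of an (outer, channels, inner)
// view. `dst` may point to the same buffer as `src`.
void logsoftmax_fwd(const float *src, float *dst, std::size_t outer,
        std::size_t channels, std::size_t inner) {
    assert(src != nullptr && dst != nullptr && channels > 0);
    for (std::size_t ou = 0; ou < outer; ++ou) {
        for (std::size_t in = 0; in < inner; ++in) {
            const std::size_t base = ou * channels * inner + in;
            // Both reductions read the slice before anything is written.
            float nu = src[base];
            for (std::size_t c = 1; c < channels; ++c)
                nu = std::max(nu, src[base + c * inner]);
            float sum = 0.0f;
            for (std::size_t c = 0; c < channels; ++c)
                sum += std::exp(src[base + c * inner] - nu);
            const float log_sum = std::log(sum);
            // Each write reads only the element it replaces.
            for (std::size_t c = 0; c < channels; ++c)
                dst[base + c * inner] = (src[base + c * inner] - nu) - log_sum;
        }
    }
}
```

Calling `logsoftmax_fwd(buf, buf, ...)` overwrites `buf` with the result, so the original source values are no longer available afterwards.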

Post-ops and Attributes

The logsoftmax primitive does not support any post-ops or attributes.

Data Type Support

The logsoftmax primitive supports the following combinations of data types.

Note

Here we abbreviate data type names for readability. For example, dnnl::memory::data_type::f32 is abbreviated to f32.

Propagation           Source / Destination

forward / backward    bf16, f32

Data Representation

Source, Destination, and Their Gradients

The logsoftmax primitive works with arbitrary data tensors. There is no special meaning associated with any logical dimensions. However, the logsoftmax axis is typically referred to as channels (hence in formulas we use \(c\)).

API

struct dnnl::logsoftmax_forward : public dnnl::primitive

Logsoftmax forward propagation primitive.

Public Functions

logsoftmax_forward()

Default constructor. Produces an empty object.

logsoftmax_forward(const primitive_desc &pd)

Constructs a logsoftmax forward propagation primitive.

Parameters

pd – Primitive descriptor for a logsoftmax forward propagation primitive.

struct desc

Descriptor for a logsoftmax forward propagation primitive.

Public Functions

desc()

Default constructor. Produces an empty object.

desc(prop_kind aprop_kind, const memory::desc &data_desc, int logsoftmax_axis)

Constructs a descriptor for a logsoftmax forward propagation primitive.

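Parameters
  • aprop_kind – Propagation kind. Possible values are dnnl::prop_kind::forward_training and dnnl::prop_kind::forward_inference.

  • data_desc – Source and destination memory descriptor.

  • logsoftmax_axis – Axis over which softmax is computed.

struct primitive_desc : public dnnl::primitive_desc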

Primitive descriptor for a logsoftmax forward propagation primitive.

Public Functions

primitive_desc()

Default constructor. Produces an empty object.

primitive_desc(const desc &adesc, const engine &aengine, bool allow_empty = false)

Constructs a primitive descriptor for a logsoftmax forward propagation primitive.

Parameters
  • adesc – Descriptor for a logsoftmax forward propagation primitive.

  • aengine – Engine to use.

  • allow_empty – A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

primitive_desc(const desc &adesc, const primitive_attr &attr, const engine &aengine, bool allow_empty = false)

Constructs a primitive descriptor for a logsoftmax forward propagation primitive.

Parameters
  • adesc – Descriptor for a logsoftmax forward propagation primitive.

  • attr – Primitive attributes to use.

  • aengine – Engine to use.

  • allow_empty – A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

memory::desc src_desc() const

Returns a source memory descriptor.

Returns

Source memory descriptor, or a zero memory descriptor if the primitive does not have a source parameter.

memory::desc dst_desc() const

Returns a destination memory descriptor.

Returns

Destination memory descriptor, or a zero memory descriptor if the primitive does not have a destination parameter.

struct dnnl::logsoftmax_backward : public dnnl::primitive

Logsoftmax backward propagation primitive.

Public Functions

logsoftmax_backward()

Default constructor. Produces an empty object.

logsoftmax_backward(const primitive_desc &pd)

Constructs a logsoftmax backward propagation primitive.

Parameters

pd – Primitive descriptor for a logsoftmax backward propagation primitive.

struct desc

Descriptor for a logsoftmax backward propagation primitive.

Public Functions

desc()

Default constructor. Produces an empty object.

desc(const memory::desc &diff_data_desc, const memory::desc &data_desc, int logsoftmax_axis)

Constructs a descriptor for a logsoftmax backward propagation primitive.

Parameters
  • diff_data_desc – Diff source and diff destination memory descriptors.

  • data_desc – Destination memory descriptor.

  • logsoftmax_axis – Axis over which softmax is computed.

struct primitive_desc : public dnnl::primitive_desc

Primitive descriptor for a logsoftmax backward propagation primitive.

Public Functions

primitive_desc()

Default constructor. Produces an empty object.

primitive_desc(const desc &adesc, const engine &aengine, const logsoftmax_forward::primitive_desc &hint_fwd_pd, bool allow_empty = false)

Constructs a primitive descriptor for a logsoftmax backward propagation primitive.

Parameters
  • adesc – Descriptor for a logsoftmax backward propagation primitive.

  • aengine – Engine to use.

  • hint_fwd_pd – Primitive descriptor for a logsoftmax forward propagation primitive. It is used as a hint for deciding which memory format to use.

  • allow_empty – A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

primitive_desc(const desc &adesc, const primitive_attr &attr, const engine &aengine, const logsoftmax_forward::primitive_desc &hint_fwd_pd, bool allow_empty = false)

Constructs a primitive descriptor for a logsoftmax backward propagation primitive.

Parameters
  • adesc – Descriptor for a logsoftmax backward propagation primitive.

  • attr – Primitive attributes to use.

  • aengine – Engine to use.

  • hint_fwd_pd – Primitive descriptor for a logsoftmax forward propagation primitive. It is used as a hint for deciding which memory format to use.

  • allow_empty – A flag signifying whether construction is allowed to fail without throwing an exception. In this case an empty object will be produced. This flag is optional and defaults to false.

memory::desc dst_desc() const

Returns a destination memory descriptor.

Returns

Destination memory descriptor, or a zero memory descriptor if the primitive does not have a destination parameter.

memory::desc diff_src_desc() const

Returns a diff source memory descriptor.

Returns

Diff source memory descriptor, or a zero memory descriptor if the primitive does not have a diff source parameter.

memory::desc diff_dst_desc() const

Returns a diff destination memory descriptor.

Returns

Diff destination memory descriptor, or a zero memory descriptor if the primitive does not have a diff destination parameter.