The binary primitive computes the result of a binary elementwise operation between tensors source 0 and source 1 (the variable names follow the standard Naming Conventions):
\[ \dst(\overline{x}) = \src_0(\overline{x}) \mathbin{op} \src_1(\overline{x}), \]
where \(op\) is one of addition, subtraction, multiplication, division, greater than or equal to, greater than, less than or equal to, less than, equal to, not equal to, get maximum value, and get minimum value.
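
The formula above can be illustrated with a small reference loop in plain C++. This is only a sketch and does not use the library API: the `BinaryOp` enum and the `apply_op` and `binary_ref` helpers are hypothetical names introduced for illustration, both sources are assumed to have identical shapes, and comparison operations are assumed to produce 1.0 or 0.0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical enum for this sketch; these are not the library's algorithm kinds.
enum class BinaryOp { Add, Sub, Mul, Div, Ge, Gt, Le, Lt, Eq, Ne, Max, Min };

// Applies one operation to a single pair of values.
// Assumption: comparisons yield 1.0f (true) or 0.0f (false).
inline float apply_op(BinaryOp op, float a, float b) {
    switch (op) {
        case BinaryOp::Add: return a + b;
        case BinaryOp::Sub: return a - b;
        case BinaryOp::Mul: return a * b;
        case BinaryOp::Div: return a / b;
        case BinaryOp::Ge:  return a >= b ? 1.0f : 0.0f;
        case BinaryOp::Gt:  return a >  b ? 1.0f : 0.0f;
        case BinaryOp::Le:  return a <= b ? 1.0f : 0.0f;
        case BinaryOp::Lt:  return a <  b ? 1.0f : 0.0f;
        case BinaryOp::Eq:  return a == b ? 1.0f : 0.0f;
        case BinaryOp::Ne:  return a != b ? 1.0f : 0.0f;
        case BinaryOp::Max: return std::max(a, b);
        case BinaryOp::Min: return std::min(a, b);
    }
    return 0.0f;  // not reached for valid enum values
}

// dst(x) = src0(x) op src1(x) for every point x; shapes are assumed identical.
std::vector<float> binary_ref(BinaryOp op,
                              const std::vector<float>& src0,
                              const std::vector<float>& src1) {
    assert(src0.size() == src1.size());
    std::vector<float> dst(src0.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = apply_op(op, src0[i], src1[i]);
    }
    return dst;
}
```

For example, `binary_ref(BinaryOp::Max, a, b)` returns the pointwise maximum of `a` and `b`.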
The binary primitive does not have a notion of forward or backward propagation.
When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.
Primitive input/output | Execution argument index |
---|---|
\(\src_0\) | DNNL_ARG_SRC_0 |
\(\src_1\) | DNNL_ARG_SRC_1 |
\(\dst\) | DNNL_ARG_DST |
\(\text{binary post-op}\) | DNNL_ARG_ATTR_MULTIPLE_POST_OP(binary_post_op_position) \| DNNL_ARG_SRC_1 |
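
Broadcasting follows the pattern {N,1}x{C,1}x{D,1}x{H,1}x{W,1}:{N,1}x{C,1}x{D,1}x{H,1}x{W,1} -> NxCxDxHxW, where each dimension of either source is equal to the corresponding destination dimension or to 1. It is consistent with PyTorch broadcast semantics.

The following attributes are supported: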
Type | Operation | Description | Restrictions |
---|---|---|---|
Attribute | Scales | Scales the corresponding input tensor by the given scale factor(s). | Only one scale per tensor is supported. Input tensors only. |
Post-op | Sum | Adds the operation result to the destination tensor instead of overwriting it. | |
Post-op | Eltwise | Applies an Eltwise operation to the result. | |
Post-op | Binary | Applies a Binary operation to the result. | General binary post-op restrictions |
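
The effect of the attributes in the table can be sketched with a plain C++ reference loop for the addition case. This is an illustration under stated assumptions, not the library API: the `BinaryAttr` struct and the `binary_add_with_attr` function are hypothetical, one scale per input tensor is applied before the operation, the sum post-op is assumed to use a factor of 1, ReLU stands in for the Eltwise post-op, and the sum-then-eltwise order is fixed here only for simplicity.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical attribute bundle for this sketch (not a library type).
struct BinaryAttr {
    float scale_src0 = 1.0f;  // single scale applied to source 0
    float scale_src1 = 1.0f;  // single scale applied to source 1
    bool sum = false;         // accumulate into dst instead of overwriting it
    bool relu = false;        // stand-in for an Eltwise post-op
};

// Computes dst = post_ops(scale0 * src0 + scale1 * src1), element by element.
// All three tensors are assumed to have the same number of elements.
void binary_add_with_attr(const std::vector<float>& src0,
                          const std::vector<float>& src1,
                          std::vector<float>& dst,
                          const BinaryAttr& attr) {
    for (std::size_t i = 0; i < dst.size(); ++i) {
        float r = attr.scale_src0 * src0[i] + attr.scale_src1 * src1[i];
        if (attr.sum) r += dst[i];             // Sum: add previous dst contents
        if (attr.relu) r = std::max(r, 0.0f);  // Eltwise: applied to the result
        dst[i] = r;
    }
}
```

With the sum flag set, the previous contents of `dst` contribute to the output, so the destination must hold valid data before the call.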
The source and destination tensors may have f32, bf16, f16, or s8/u8 data types. The binary primitive supports the following combinations of data types:
Source 0 / 1 | Destination |
---|---|
bf16 | bf16 |
s8, u8, f16, f32 | s8, u8, f16, f32 |
The binary primitive works with arbitrary data tensors. There is no special meaning associated with any of the tensor dimensions.
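
Because no dimension carries special meaning, broadcasting can be expressed for any rank by giving a size-1 dimension a stride of zero. The following plain C++ sketch illustrates this for addition. It does not use the library API: `Dims`, `broadcast_dims`, `broadcast_strides`, and `broadcast_add` are hypothetical names, tensors are assumed to be dense and row-major, and both sources are assumed to have the same number of dimensions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Dims = std::vector<std::size_t>;

// Destination dims: per dimension, sizes must match or one of them must be 1.
Dims broadcast_dims(const Dims& a, const Dims& b) {
    if (a.size() != b.size()) throw std::invalid_argument("rank mismatch");
    Dims out(a.size());
    for (std::size_t d = 0; d < a.size(); ++d) {
        if (a[d] == b[d] || b[d] == 1) out[d] = a[d];
        else if (a[d] == 1) out[d] = b[d];
        else throw std::invalid_argument("incompatible dimension");
    }
    return out;
}

// Row-major strides of a dense tensor; broadcast dimensions get stride 0.
Dims broadcast_strides(const Dims& dims, const Dims& out) {
    Dims strides(dims.size());
    std::size_t running = 1;
    for (std::size_t d = dims.size(); d-- > 0;) {
        strides[d] = (dims[d] == 1 && out[d] != 1) ? 0 : running;
        running *= dims[d];
    }
    return strides;
}

// dst = src0 + src1 with broadcasting on either source, for any rank.
std::vector<float> broadcast_add(const std::vector<float>& src0, const Dims& dims0,
                                 const std::vector<float>& src1, const Dims& dims1) {
    const Dims out = broadcast_dims(dims0, dims1);
    const Dims s0 = broadcast_strides(dims0, out);
    const Dims s1 = broadcast_strides(dims1, out);
    std::size_t total = 1;
    for (std::size_t n : out) total *= n;
    std::vector<float> dst(total);
    Dims idx(out.size(), 0);  // multi-dimensional index into dst
    for (std::size_t i = 0; i < total; ++i) {
        std::size_t off0 = 0, off1 = 0;
        for (std::size_t d = 0; d < out.size(); ++d) {
            off0 += idx[d] * s0[d];
            off1 += idx[d] * s1[d];
        }
        dst[i] = src0[off0] + src1[off1];
        // Advance idx like an odometer, last dimension fastest.
        for (std::size_t d = out.size(); d-- > 0;) {
            if (++idx[d] < out[d]) break;
            idx[d] = 0;
        }
    }
    return dst;
}
```

The same loop handles 2D, 4D, or 5D inputs without change, since the dimensions are only ever treated as generic extents.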
For an f32 destination type, the source 0 and source 1 tensors must have the f32 data type.

Engine | Name | Comments |
---|---|---|
CPU/GPU | Binary Primitive Example | This C++ API example demonstrates how to create and execute a Binary primitive. Key optimizations included in this example: |
CPU/GPU | Bnorm u8 by binary post-ops example | Bnorm u8 via binary postops example. |