einsum
paddle.einsum(equation, *operands)
The current version of this API should be used in dynamic graph mode only.
Einsum provides a tensor operation API based on the Einstein summation convention (also called Einstein notation). It takes one or more tensors as input and produces one tensor as output.
Einsum can perform a variety of tensor operations. A few examples follow, with a minimal sketch after this list.

For a single operand
- trace
- diagonal
- transpose
- sum

For two operands
- dot
- outer
- broadcasting and elementwise multiply
- matrix multiply
- batched matrix multiply

For many operands
- broadcasting multiply
- chained matrix multiply
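A minimal sketch of a few of these operations (the shapes below are illustrative choices, not part of the API):

>>> import paddle
>>> x = paddle.rand([3, 4])
>>> y = paddle.rand([4, 5])
>>> # transpose: same result as paddle.transpose(x, [1, 0])
>>> print(paddle.einsum('ij->ji', x).shape)
[4, 3]
>>> # sum over all dimensions: same result as paddle.sum(x)
>>> print(paddle.einsum('ij->', x).shape)
[]
>>> # matrix multiply: same result as paddle.matmul(x, y)
>>> print(paddle.einsum('ij,jk->ik', x, y).shape)
[3, 5]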
The summation notation
The tensor dimensions are labeled using uncased English letters; e.g., ijk denotes a three-dimensional tensor whose dimensions are labeled i, j, and k.
The equation is comma-separated into terms, each being a distinct input’s dimension label string.
Ellipsis … enables broadcasting by automatically converting the unlabeled dimensions into broadcasting dimensions.
Labels that occur once are free labels; duplicates are dummy labels. Dummy-labeled dimensions are reduced and removed from the output.
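For instance, in the equation ‘ij,jk’ below, i and k each occur once and are free, while j occurs twice and is a dummy label that is summed out (shapes are illustrative):

>>> import paddle
>>> x = paddle.rand([3, 4])
>>> y = paddle.rand([4, 5])
>>> # j (size 4) is reduced away; only the free labels i and k remain
>>> print(paddle.einsum('ij,jk', x, y).shape)
[3, 5]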
Output labels can be explicitly specified on the right-hand side of -> or omitted. In the latter case, the output labels are inferred from the input labels.
Inference of output labels
- The broadcasting label …, if present, is put in the leftmost position.
- Free labels are reordered alphabetically and placed after ….
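A minimal sketch of the inference rule (shapes are illustrative): the free labels of ‘…ij, …jk’ are i and k, so the inferred output is ‘…ik’, equivalent to spelling it out explicitly.

>>> import paddle
>>> A = paddle.rand([2, 3, 4])
>>> B = paddle.rand([2, 4, 5])
>>> # inferred output '...ik': broadcast dims leftmost, free labels alphabetical
>>> print(paddle.einsum('...ij,...jk', A, B).shape)
[2, 3, 5]
>>> print(paddle.einsum('...ij,...jk->...ik', A, B).shape)
[2, 3, 5]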
On explicit output labels
- If broadcasting is enabled, then … must be present in the output.
- The output labels can be empty, indicating a scalar output: the sum over the original output.
- Non-input labels are invalid.
- Duplicate labels are invalid.
- Any dummy label that is present in the output is promoted to a free label.
- Any free label that is not present in the output is lowered to a dummy label.
Examples
- ‘…ij, …jk’, where i and k are free labels and j is a dummy label. The inferred output label string is ‘…ik’.
- ‘ij -> i’, where i is a free label and j is a dummy label.
- ‘…ij, …jk -> …ijk’, where i, j and k are all free labels.
- ‘…ij, …jk -> ij’, an invalid equation, since … is not present in the output.
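A minimal sketch of the promotion and scalar-output rules (shapes are illustrative):

>>> import paddle
>>> A = paddle.rand([2, 3, 4])
>>> B = paddle.rand([2, 4, 5])
>>> # naming the dummy label j in the output promotes it to a free label,
>>> # so no summation over j takes place
>>> print(paddle.einsum('...ij,...jk->...ijk', A, B).shape)
[2, 3, 4, 5]
>>> # an empty output label string sums everything into a scalar
>>> print(paddle.einsum('ij->', paddle.rand([3, 4])).shape)
[]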
The summation rule
The summation procedure can be outlined as follows, although the actual steps taken may vary significantly due to implementation-specific optimizations (a worked sketch follows the steps).
Step 1: preparation for broadcasting, that is, transposing and unsqueezing the input operands so that each resulting dimension is identically labeled across all the input operands.
Step 2: broadcasting multiply all the resulting operands from step 1.
Step 3: reducing the dummy-labeled dimensions.
Step 4: transposing the result tensor to match the output labels.
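To make the outline concrete, here is an unoptimized, step-by-step sketch of ‘ij,jk->ik’ using only elementary operations; an actual implementation will fuse and reorder these steps:

>>> import paddle
>>> x = paddle.rand([3, 4])
>>> y = paddle.rand([4, 5])
>>> # Step 1: unsqueeze so both operands carry dimensions labeled (i, j, k)
>>> x1 = x.unsqueeze(2)   # shape [3, 4, 1]
>>> y1 = y.unsqueeze(0)   # shape [1, 4, 5]
>>> # Step 2: broadcasting multiply
>>> prod = x1 * y1        # shape [3, 4, 5]
>>> # Step 3: reduce the dummy-labeled dimension j
>>> out = prod.sum(axis=1)
>>> # Step 4: (i, k) already matches the output labels, so no transpose needed
>>> print(bool(paddle.allclose(out, paddle.einsum('ij,jk->ik', x, y))))
True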
On trace and diagonal
The trace and diagonal are planned but not yet implemented features.
Parameters
- equation (str) – The summation terms, using the Einstein summation notation.
- operands (list|Tensor) – The input tensors over which to compute the Einstein summation. The number of operands should equal the number of input terms in the equation.

Returns
result (Tensor), the result tensor.
Examples

>>> import paddle
>>> paddle.seed(102)
>>> x = paddle.rand([4])
>>> y = paddle.rand([5])

>>> # sum
>>> print(paddle.einsum('i->', x))
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
       1.81225157)

>>> # dot
>>> print(paddle.einsum('i,i->', x, x))
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
       1.13530672)

>>> # outer
>>> print(paddle.einsum("i,j->ij", x, y))
Tensor(shape=[4, 5], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[0.26443148, 0.05962684, 0.25360870, 0.21900642, 0.56994802],
        [0.20955276, 0.04725220, 0.20097610, 0.17355499, 0.45166403],
        [0.35836059, 0.08080698, 0.34369346, 0.29680005, 0.77240014],
        [0.00484230, 0.00109189, 0.00464411, 0.00401047, 0.01043695]])

>>> A = paddle.rand([2, 3, 2])
>>> B = paddle.rand([2, 2, 3])

>>> # transpose
>>> print(paddle.einsum('ijk->kji', A))
Tensor(shape=[2, 3, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.50882483, 0.56067896],
         [0.84598064, 0.36310029],
         [0.55289471, 0.33273944]],
        [[0.04836850, 0.73811269],
         [0.29769155, 0.28137168],
         [0.84636718, 0.67521429]]])

>>> # batch matrix multiplication
>>> print(paddle.einsum('ijk, ikl->ijl', A, B))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.36321065, 0.42009076, 0.40849245],
         [0.74353045, 0.79189068, 0.81345987],
         [0.90488225, 0.79786193, 0.93451476]],
        [[0.12680580, 1.06945944, 0.79821426],
         [0.07774551, 0.55068684, 0.44512171],
         [0.08053084, 0.80583858, 0.56031936]]])

>>> # Ellipsis transpose
>>> print(paddle.einsum('...jk->...kj', A))
Tensor(shape=[2, 2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.50882483, 0.84598064, 0.55289471],
         [0.04836850, 0.29769155, 0.84636718]],
        [[0.56067896, 0.36310029, 0.33273944],
         [0.73811269, 0.28137168, 0.67521429]]])

>>> # Ellipsis batch matrix multiplication
>>> print(paddle.einsum('...jk, ...kl->...jl', A, B))
Tensor(shape=[2, 3, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[[0.36321065, 0.42009076, 0.40849245],
         [0.74353045, 0.79189068, 0.81345987],
         [0.90488225, 0.79786193, 0.93451476]],
        [[0.12680580, 1.06945944, 0.79821426],
         [0.07774551, 0.55068684, 0.44512171],
         [0.08053084, 0.80583858, 0.56031936]]])