Surprisingly, the Russian-speaking segment of the Internet has almost no material explaining the Einstein summation convention in plain language. It is no less surprising that there is even less material explaining how the einsum function works. In English there is a fairly detailed answer about einsum on Stack Overflow, while in Russian there are only a few sites offering a clumsy translation of that answer. I want to fix this lack of materials, and I invite everyone who is interested to read on!
Discussing the Einstein Convention
First of all, I would like to note that the Einstein convention is most often used in tensor analysis and its applications, so further on the article will make several references to tensors.
When you first start working with tensors, you may be confused that, in addition to the usual subscripts, superscripts are also used, which at first glance can be mistaken for exponentiation. Example:
"a with superscript i" will be written as $a^i$, and "a squared with superscript i" will be written as $(a^i)^2$. It may be confusing and uncomfortable at first, but you get used to it over time.
Convention: from here on, I will call objects of the form $a_i$ or $A_{ij} x_j$ terms.
What is the Einstein convention about?
The Einstein convention is designed to reduce the number of summation signs in an expression. There are three simple rules that determine whether an expression in Einstein notation is written correctly.
Rule # 1: Summation is carried out over all indices that are repeated twice in one term.
Example: Consider an expression like this:

$$\sum_{i} a_i x^i$$

Using the Einstein convention, this expression can be rewritten like this:

$$a_i x^i$$

Thus, we get rid of the sum sign and simply write a single term. Note that in this term the index i is repeated twice, which means, in accordance with the first rule, that the summation is carried out over the index i, or, more precisely, over all possible values that this index takes.
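To make the rule concrete, here is a minimal Python sketch of what the term $a_i x^i$ unfolds into (the arrays a and x are made-up example data):

import numpy as np

a = np.array([1, 2, 3])
x = np.array([4, 5, 6])

# The repeated index i means: sum a[i] * x[i] over every value of i.
s = 0
for i in range(len(a)):
    s += a[i] * x[i]

print(s)              # 32
print(np.sum(a * x))  # 32, the same sum written with NumPy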
Another example: matrix-vector multiplication. Let $A$ be a matrix of size $M \times N$ and $v$ a vector of length $N$. In Einstein notation their product is written as:

$$b_i = A_{ij} v_j$$

Here the index $i$ occurs once in the term on the right, while the index $j$ occurs twice, so the summation runs over $j$.

This example shows that indices come in two kinds:
1. Free indices, which occur in a term exactly once.
2. Dummy (summation) indices, which occur in a term exactly twice.

The summation is carried out over the dummy indices, while the free indices determine the shape of the result: here the only free index is $i$, so the result is a vector with components $b_i$. Written out with the summation sign, the same expression reads $b_i = \sum_{j=1}^{N} A_{ij} v_j$.

In Python, this computation corresponds to the following loops:
for i in range(M):
    for j in range(N):
        b[i] += A[i, j] * v[j]
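For completeness, a runnable version of the same loops (A and v here are arbitrary example data, so any M and N would do):

import numpy as np

M, N = 3, 4
A = np.arange(M * N).reshape(M, N)  # example M x N matrix
v = np.arange(N)                    # example vector of length N

b = np.zeros(M)
for i in range(M):
    for j in range(N):
        b[i] += A[i, j] * v[j]

print(b)      # [14. 38. 62.]
print(A @ v)  # [14 38 62], the same values via the built-in product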
Rule # 2: An index may occur in one term at most twice.

In other words, every index of a term is either free (occurs once) or dummy (occurs twice); no third option is allowed. Examples:

- $a_i x_i$ — the index i is a dummy index, i.e., it is summed over;
- $A_{ij} v_j$ — the index i is free, and j is a dummy index;
- $x_i y_j$ — the indices i, j are both free;
- $A_{ij} x_i y_j$ — the indices i, j are both dummy;
- $a_i b_i c_i$ — an invalid term (the index i occurs three times).

Why forbid an index from occurring more than twice? Because the notation would become ambiguous. Looking at the last term, where the index i occurs 3 times, we cannot tell which pair of factors is being contracted and whether i should also remain free in the result (there are several incompatible readings), so such expressions are simply not considered well-formed.
Rule # 3: In an equality, the free indices of the left-hand side and of the right-hand side must coincide.

Examples:

- $b_i = A_{ij} v_j$ — a correct equality: the index i is free on both sides, and j is summed over;
- $b_k = A_{ij} v_j$ — an incorrect equality. On the left the free index is k, while on the right it is i; nothing in the notation says how k and i are related, so the two sides cannot be matched component by component. If we rename the free index on the right and write $b_k = A_{kj} v_j$, the equality becomes correct: now k is the free index on both sides. Note that renaming a free index changes the meaning of an expression, while a dummy index can be renamed freely: $A_{kj} v_j$ and $A_{km} v_m$ denote exactly the same sum.
To practice all three rules at once, let us classify a few expressions:

- $t = a_i x_i$ — the summation runs over i, there are no free indices, and the result is a scalar;
- $b_i = A_{ij} x_j$ — the summation runs over j, the free index is i, and it coincides on both sides; the result is a vector;
- $C_{ij} = A_{ik} B_{kj}$ — the summation runs over k, the free indices are i, j, and the result is a matrix.
Let us sum up:

- the summation is carried out over every index that is repeated twice in a term; such indices are called dummy;
- indices that occur once are free, and it is they that determine the shape of the result;
- an index may not occur in a term more than twice, and the free indices on the two sides of an equality must coincide.

An expression in Einstein notation thus compactly encodes both the loops over the free indices and the sums over the dummy indices. As we are about to see, this is exactly the information that the einsum function receives as its first argument.

Congratulations, now you know what the Einstein convention is about!
Understanding the einsum function
The einsum function is implemented in the major Python libraries for working with multidimensional arrays (NumPy, TensorFlow, PyTorch). Once you understand how it works in one of them (the interfaces are nearly identical), you will be able to use einsum everywhere, so below I will stick to NumPy. einsum allows you to express a whole family of array operations (summations, products, transpositions, contractions) in a single call, and its argument string is essentially an expression in Einstein notation.
Example: let A be a matrix of size 3×5 and B a matrix of size 5×2; we want to compute their product M, a matrix of size 3×2. In index notation this is $M_{ij} = A_{ik} B_{kj}$. Without einsum, we would write three nested loops:
M = np.zeros((3, 2))
for i in range(3):
    for j in range(2):
        for k in range(5):
            M[i, j] += A[i, k] * B[k, j]
With einsum, the same computation takes a single line:
M = np.einsum("ik,kj->ij", A, B)
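A quick sanity check (not part of the original example, with random inputs) confirms that the loop version, the einsum version, and NumPy's built-in matrix product all agree:

import numpy as np

A = np.random.rand(3, 5)
B = np.random.rand(5, 2)

M_loops = np.zeros((3, 2))
for i in range(3):
    for j in range(2):
        for k in range(5):
            M_loops[i, j] += A[i, k] * B[k, j]

M_einsum = np.einsum("ik,kj->ij", A, B)

print(np.allclose(M_loops, M_einsum))  # True
print(np.allclose(M_einsum, A @ B))    # True: this is ordinary matrix multiplication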
Let us take this call apart. The first argument of einsum is the subscripts string; after it come the arrays to operate on. The string follows the pattern:

"{indices of the 1st array},{indices of the 2nd array}->{indices of the result}"

The einsum string obeys rules that should already look familiar:

- the index groups to the left of the arrow label the axes of the input arrays, one letter per axis;
- an index that is repeated among the inputs and absent after the arrow is summed over;
- the indices after the arrow must occur among the input indices (compare rule # 3), and their order sets the order of the axes of the result;
In other words, einsum reads the string exactly the way we read an expression in Einstein notation: it iterates over the free indices and sums over the dummy ones. If an operation can be written as a formula in Einstein notation, it can be computed with a single call to einsum.
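One way to feel how much the part after the arrow controls is to keep the inputs fixed and vary only the output indices; a small illustration with made-up matrices:

import numpy as np

A = np.random.rand(3, 5)
B = np.random.rand(5, 2)

print(np.einsum("ik,kj->ij", A, B).shape)  # (3, 2): the matrix product
print(np.einsum("ik,kj->ji", A, B).shape)  # (2, 3): the transposed product
print(np.einsum("ik,kj->", A, B))          # a scalar: i and j are summed away as well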
Now let us walk through examples of using einsum, from the simplest operations to operations on tensors.
1. Sum of the elements of a vector:
vector = np.array([1, 2, 3, 4, 5])
result = np.einsum("i->", vector)
print(result)
Output
15
2. Sum of all the elements of a matrix:
matrix = np.array([[1, 2], [3, 4], [5, 6]])
result = np.einsum("ij->", matrix)
print(result)
Output
21
3. Column sums of a matrix (the row index i is summed away):
matrix = np.array([[1, 2], [3, 4], [5, 6]])
result = np.einsum("ij->j", matrix)
print(result)
Output
[9, 12]
4. Row sums of a matrix (the column index j is summed away):
matrix = np.array([[1, 2], [3, 4], [5, 6]])
result = np.einsum("ij->i", matrix)
print(result)
Output
[3, 7, 11]
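The four summation examples above are one-to-one equivalents of np.sum with different axis arguments; a quick check:

import numpy as np

matrix = np.array([[1, 2], [3, 4], [5, 6]])

print(np.einsum("ij->", matrix) == np.sum(matrix))                     # True
print(np.array_equal(np.einsum("ij->j", matrix), matrix.sum(axis=0)))  # True
print(np.array_equal(np.einsum("ij->i", matrix), matrix.sum(axis=1)))  # True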
5. Matrix transposition (note that nothing is summed here: the same indices simply change their order after the arrow):
matrix = np.array([[1, 2], [3, 4], [5, 6]])
result = np.einsum("ij->ji", matrix)
print(result)
Output
[[1, 3, 5], [2, 4, 6]]
6. Matrix-vector multiplication:
matrix = np.array([[1, 2], [3, 4], [5, 6]])
vector = np.array([[1, 2]])
result = np.einsum("ij,kj->ik", matrix, vector)
print(result)
Note that the vector here is given as a two-dimensional array of shape (1, 2), i.e., as a row vector, which is why its axes are labeled with the two indices kj and the result has shape (3, 1). einsum sums over the shared index j, so the matrix is effectively multiplied by the transpose of the row; a variant with an ordinary one-dimensional vector is shown right after the output below.
Output
[[5], [11], [17]]
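If the vector is stored as an ordinary one-dimensional array instead, the same product needs only one shared index:

import numpy as np

matrix = np.array([[1, 2], [3, 4], [5, 6]])
vector = np.array([1, 2])  # one-dimensional this time

result = np.einsum("ij,j->i", matrix, vector)
print(result)  # [ 5 11 17]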
7. Matrix multiplication:
matrix1 = np.array([[1, 2], [3, 4], [5, 6]])
matrix2 = np.array([[1, 0], [0, 1]])
result = np.einsum("ik,kj->ij", matrix1, matrix2)
print(result)
Output
[[1, 2], [3, 4], [5, 6]]
8. Inner (dot) product of vectors:
vector1 = np.array([[1, 2, 3]])
vector2 = np.array([[1, 1, 1]])
result = np.einsum("ik,jk->", vector1, vector2)
print(result)
Output
6
9. Trace of a matrix:
matrix1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
result = np.einsum("ii->", matrix1)
print(result)
Output
15
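The subscript ii walks along the main diagonal, so the result matches np.trace; writing ii->i instead keeps the diagonal itself:

import numpy as np

matrix1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(np.einsum("ii->", matrix1) == np.trace(matrix1))  # True
print(np.einsum("ii->i", matrix1))                      # [1 5 9], the diagonal itself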
10. Elementwise (Hadamard) product of matrices:
matrix1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix2 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
result = np.einsum("ij,ij->ij", matrix1, matrix2)
print(result)
Here the same pair of indices appears both before and after the arrow, so nothing is summed at all: einsum simply multiplies the arrays element by element. An equivalent loop implementation makes this explicit:
result = np.zeros(matrix1.shape, dtype="int32")
for i in range(result.shape[0]):
    for j in range(result.shape[1]):
        result[i, j] += matrix1[i, j] * matrix2[i, j]
print(result)
Output
[[1, 0, 0], [0, 5, 0], [0, 0, 9]]
11. Outer (tensor) product of vectors:
vector1 = np.array([1, 2, 3])
vector2 = np.array([1, 0, 0])
result = np.einsum("i,j->ij", vector1, vector2)
print(result)
Output
[[1, 0, 0], [2, 0, 0], [3, 0, 0]]
12. Permutation of the axes of a tensor:
A = np.array([[[0, 1], [1, 2], [2, 3]], [[1, 2], [2, 3], [3, 4]], [[2, 3], [3, 4], [4, 5]]])
result = np.einsum("ijk->jki", A)
print(result)
Output
[[[0, 1, 2], [1, 2, 3]], [[1, 2, 3], [2, 3, 4]], [[2, 3, 4], [3, 4, 5]]]
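Since ijk->jki is a pure relabeling of axes, it is equivalent to np.transpose with the matching axes argument:

import numpy as np

A = np.array([[[0, 1], [1, 2], [2, 3]],
              [[1, 2], [2, 3], [3, 4]],
              [[2, 3], [3, 4], [4, 5]]])

print(np.array_equal(np.einsum("ijk->jki", A),
                     np.transpose(A, (1, 2, 0))))  # True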
13. Contraction of a tensor with a matrix:
A = np.array([[[0, 1], [1, 2], [2, 3]], [[1, 2], [2, 3], [3, 4]], [[2, 3], [3, 4], [4, 5]]])
U = np.array([[1, 2], [2, 3]])
result = np.einsum("ijk,nk->ijn", A, U)
print(result)
Output
[[[2, 3], [5, 8], [8, 13]], [[5, 8], [8, 13], [11, 18]], [[8, 13], [11, 18], [14, 23]]]
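The same contraction can be written with np.tensordot by pairing the last axis of A with the last axis of U, though the einsum string arguably states the intent more directly:

import numpy as np

A = np.array([[[0, 1], [1, 2], [2, 3]],
              [[1, 2], [2, 3], [3, 4]],
              [[2, 3], [3, 4], [4, 5]]])
U = np.array([[1, 2], [2, 3]])

via_einsum = np.einsum("ijk,nk->ijn", A, U)
via_tensordot = np.tensordot(A, U, axes=([2], [1]))  # contract A's axis 2 with U's axis 1

print(np.array_equal(via_einsum, via_tensordot))  # True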
These are the basic operations that can be written with einsum. As you can see, many familiar NumPy functions (np.dot, np.outer, np.tensordot, np.transpose, np.sum, etc.) are expressible through einsum. Nobody forces you to replace them all with einsum calls, of course, but for complicated operations with several arguments einsum is often both the shortest and the most readable option.
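As a closing illustration, here is a small sketch (with random example data) of how some of the functions just mentioned translate into einsum calls:

import numpy as np

a = np.random.rand(4)
b = np.random.rand(4)
A = np.random.rand(3, 4)
B = np.random.rand(4, 5)

print(np.allclose(np.einsum("i,i->", a, b), np.dot(a, b)))      # inner product
print(np.allclose(np.einsum("i,j->ij", a, b), np.outer(a, b)))  # outer product
print(np.allclose(np.einsum("ik,kj->ij", A, B), A @ B))         # matrix product
print(np.allclose(np.einsum("ij->ji", A), np.transpose(A)))     # transposition
print(np.allclose(np.einsum("ij->", A), np.sum(A)))             # sum of all elements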
Further reading:
- Einstein summation convention (basic part)
- Einstein summation convention (advanced part)