Authors of the article: S. B. Pshenichnikov, Ph.D., and A. S. Valkov, Ph.D.
Algebra and language (writing) are two different tools of knowledge. Combining them may lead to new methods of machine understanding. To determine the meaning of something (to understand it) is to calculate how a part relates to the whole. Modern search algorithms already face the task of recognizing meaning, and Google's tensor processors perform the matrix multiplications (convolutions) that an algebraic approach requires. Yet semantic analysis still relies mainly on statistical methods. In algebra it would seem strange to use statistics when looking for, say, divisibility criteria for numbers. An algebraic apparatus is also useful for interpreting the results of computations when recognizing the meaning of a text.
A text is understood here as a sequence of signs of an arbitrary nature: for example, natural-language texts, musical texts, genetic sequences of biopolymers, and codes (code tables as relations between signs). In musical texts written on a one-line staff (the "string" staff), the signs are notes, clefs, accidentals, and dynamics and tempo markings. In genetic texts, the sign-words are triplets. So far, sign systems for taste and smell exist only in natural form (as specimens, like a zoo). For touch there is Braille, a tactile code of raised dots. The hub of sign systems is semiotics [1], which consists of three branches: semantics, syntactics and pragmatics.
An example of a language text:
A set is an object that is a set of objects. A polynomial is a set of monomial objects that are a set of multiplier objects. (1)
To turn a text into a mathematical object, it must be coordinatized correctly, that is, each sign must be given coordinates. The example text can first be lemmatized (brought to normal form): nominative singular for nouns; nominative singular masculine for adjectives; the imperfective infinitive for verbs, participles and gerunds. If morphological forms matter for the task, lemmatization can be skipped:
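(set)_{1,1} (is)_{2,2} (object)_{3,3} (be)_{4,4} (set)_{5,1} (object)_{6,3} (".")_{7,7} (polynomial)_{8,8} (is)_{9,2} (set)_{10,1} (object)_{11,3} (monomial)_{12,12} (be)_{13,4} (set)_{14,1} (object)_{15,3} (multiplier)_{16,16} (".")_{17,7} (2)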
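The index pairs in (2) follow a rule that can be read off the pattern itself: the first index is the position of the sign in the text, and the second is the position at which that sign first occurred. A minimal sketch in Python follows. The function name `coordinatize` and the English token strings are illustrative assumptions, not part of the original article; "is" and "be" stand in for two lemmas that are distinct in the source language.

```python
def coordinatize(tokens):
    """Return (token, i, j) triples for a lemmatized token list.

    i is the 1-based position of the token in the text,
    j is the 1-based position of its first occurrence.
    """
    first_seen = {}
    triples = []
    for i, tok in enumerate(tokens, start=1):
        # setdefault stores i only the first time tok is met
        j = first_seen.setdefault(tok, i)
        triples.append((tok, i, j))
    return triples


# Assumed English glosses for the lemmatized example (1); "." is a sign too.
tokens = [
    "set", "is", "object", "be", "set", "object", ".",
    "polynomial", "is", "set", "object", "monomial",
    "be", "set", "object", "multiplier", ".",
]

coords = coordinatize(tokens)
index_pairs = [(i, j) for _, i, j in coords]
print(index_pairs)
```

A repeated sign keeps its own position as the first index but inherits the second index from its first occurrence, which is why the fifth sign is indexed 5,1 and the sixth 6,3.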
(2) . () , . – . . , . , (2). «» – ()1,1. ( ) . . . (2) - - 5,1: ()5,1. , – . , , . , . (2) ( ) 1 3. (...)5,5, (...)6,6 . ()5,1 ()6,3.
. ( – – ), . – - ( , ). , – . - «» . . – .
– . – , . , (), . (2) . (2):
(set)_{1,1} (is)_{2,2} (object)_{3,3} (be)_{4,4} (".")_{7,7} (polynomial)_{8,8} (monomial)_{12,12} (multiplier)_{16,16} (3)
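Formula (3) retains exactly those signs of (2) whose two indices coincide, that is, the first occurrences. A small sketch of that selection in Python, under the same assumptions as before: the token strings are illustrative English glosses, and the name `build_dictionary` is hypothetical.

```python
def build_dictionary(tokens):
    """Return (token, i, i) for the first occurrence of every distinct token.

    Positions are 1-based; the result is ordered by first occurrence.
    """
    first_seen = {}
    for i, tok in enumerate(tokens, start=1):
        first_seen.setdefault(tok, i)  # keep only the first position
    # dicts preserve insertion order, so entries come out in text order
    return [(tok, i, i) for tok, i in first_seen.items()]


# Assumed English glosses for the lemmatized example text
tokens = [
    "set", "is", "object", "be", "set", "object", ".",
    "polynomial", "is", "set", "object", "monomial",
    "be", "set", "object", "multiplier", ".",
]

dictionary = build_dictionary(tokens)
dictionary_indices = [i for _, i, _ in dictionary]
print(dictionary_indices)
```

The resulting positions are the diagonal index pairs that appear in (3).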
(1) → (2) , , (...)i,j. , . - ·– («» «»), A N. . 24 . ( ) ( ):
, . . () ( ), . - , .
, . . Ei,j ( ) – , i j , . , n=2:
E1,2, E2,1 E1,1, E2,2 ( ). ( ), . , E1,1E2,1=0, E2,1E1,2=E2,2. . , . [2] .
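The two products quoted above, E1,1E2,1=0 and E2,1E1,2=E2,2, are instances of the standard multiplication rule for matrix units: E_{i,j}E_{k,l} equals E_{i,l} when j = k and is zero otherwise. The following sketch checks this for n=2 in plain Python. The helper names `unit` and `matmul` are illustrative and not taken from the article.

```python
def unit(i, j, n):
    """Matrix unit E_{i,j}: n x n, with 1 in row i, column j (1-based), 0 elsewhere."""
    return [[1 if (r == i and c == j) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]


def matmul(a, b):
    """Ordinary product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


n = 2
zero = [[0] * n for _ in range(n)]
E11, E12 = unit(1, 1, n), unit(1, 2, n)
E21, E22 = unit(2, 1, n), unit(2, 2, n)

# Inner indices differ (1 and 2): the product vanishes.
product_a = matmul(E11, E21)
# Inner indices agree (1 and 1): the outer indices 2,2 survive.
product_b = matmul(E21, E12)

print(product_a == zero, product_b == E22)
```

The products do not commute in general: with the factors swapped, E1,2E2,1 gives E1,1 rather than E2,2.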
(2) P ( ):
() (2) (5) , P - ( ). () (2) . (2) (5) «» ( -) «-».
(5) ( ) . . .
, (3) :
DR – P . P DR imax×imax, imax – () . P DR , . . DR . .
:
(5) , (2). , (, ), .
, F1, F2, …,Fk , . Fi () Fj (), Fij () , Fi=FijFj. . .
( ). (4) . n2 2(n-1) , (n2 – 2n – 2) – ( ).
– ( ), DR ( ).
– DR ( ), ( ).
, , , (). , , , .
. . .
. F1, F2, …,Fk () Fm , F1, F2, …,Fk Fm .
, . , . . . , . Fm. , Fm, Fm. , , . . . , , . , .
() , , , , .
– P , .
— . : (, , , ); (, , , , – ); ; , ; - (, , , , ).
– () . – -. - - , . ( ), .
(5) :
E – . , (5) (7). (DR+E5,1+E6,3) - . – (DR+E5,1+E6,3) () , ( ).
() (7) :
( ) E5,1 E6,3. E1,1 («») E3,3 («»), . , . . , . , ( ) .
. – (). – (). , – «». : , , . , - «» . – , , «».
DR (6) -. - n- ( -, ). , .
.
() .
– , , , DR. .
- . (, ).
– , . , (8) – ( ) « » « ».
, , , , - , , , ( ). – E8,8+E12,12+E16,16 (7) – ()8,8... ()12,12...()16,16 – F2 F1. () « » « ».
, . , , , ( ), , , .
. – – . – .
Restructuring requires an algebraic structuring of the language corpus in order to compile the above dictionaries of the corpus. To that end, the ideals and residue classes of the matrix ring P_{txt} of the corpus of matrix texts must first be constructed and studied.
A more rigorous and general description of the algebra of a text is given in [3].