Lattice-Based Machine Learning Mathematics

This is the third article in a series describing a machine learning system based on lattice theory, called the "VKF-system". It uses a structural (lattice-theoretic) approach to represent training examples and their fragments, which are considered as the causes of the target property. The system computes these fragments as similarities between subsets of the training examples. There is an algebraic theory of such representations, called Formal Concept Analysis (FCA).

However, the described system uses probabilistic algorithms to eliminate the disadvantages of the unrestricted approach. Details below ...

FCA applications

Introduction

We will start by demonstrating our approach as applied to a school problem.

, .

, : ( ) .

, , ( ).

— .

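Each example is described by five binary features, labeled (A), (B), (C), (D), and (E). In the table below, the first column gives the value of the target property (1 or 0 for a training example, ? for the example to be classified), and the remaining columns give the feature values.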

target	A	B	C	D	E
1	1	0	1	1	1
1	0	1	0	0	1
0	1	0	1	1	0
0	0	1	0	1	0
?	1	0	1	0	1

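The similarity (the set of common features) of the two positive training examples, rows 1 and 2 of the table, yields the candidate

$\langle\{1,2\}, \{E\}\rangle,$

where the first component lists the parent examples (here by row number) and the second is their common fragment.

The fragment $\{E\}$ is contained in the description $\{A,C,E\}$ of the test example, so this candidate predicts that the test example has the target property. No negative training example contains $E$, so the candidate has no counter-examples.

The similarity of the two negative training examples, rows 3 and 4 of the table, yields the competing candidate

$\langle\{3,4\}, \{D\}\rangle,$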

, .

.., .., .. (.). . 2: , M.: URSS, 2020, 238 . ISBN 978-5-382-01977-2

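The candidate built from the negative examples has a "counter-example": its fragment $\{D\}$ is contained in the description $\{A,C,D,E\}$ of the positive example in row 1 of the table.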
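The reasoning in this example can be replayed mechanically. The sketch below is only an illustration: the feature sets are read off the table, the integer keys are row numbers, and the names `similarity` and `counter_examples` are hypothetical helpers, not part of the VKF-system.

```python
# Feature sets read off the example table; the integer keys are row numbers.
POSITIVE = {1: {"A", "C", "D", "E"}, 2: {"B", "E"}}
NEGATIVE = {3: {"A", "C", "D"}, 4: {"B", "D"}}
TEST = {"A", "C", "E"}


def similarity(examples):
    """Return the features shared by all the given examples."""
    return set.intersection(*examples.values())


def counter_examples(fragment, opposite):
    """Rows of the opposite class whose description contains the fragment."""
    return sorted(row for row, features in opposite.items() if fragment <= features)


pos_fragment = similarity(POSITIVE)                      # common part of rows 1 and 2
neg_fragment = similarity(NEGATIVE)                      # common part of rows 3 and 4
pos_blocked = counter_examples(pos_fragment, NEGATIVE)   # negative rows containing it
neg_blocked = counter_examples(neg_fragment, POSITIVE)   # positive rows containing it

# The test row is predicted positive if an unblocked positive fragment fits it.
predicted = pos_fragment <= TEST and not pos_blocked

print(pos_fragment, pos_blocked)
print(neg_fragment, neg_blocked)
print(predicted)
```

Representing each row as a set of feature labels makes the similarity a plain set intersection and the "is a part of" test a subset check.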

1. Formal Concept Analysis

- . , " - ". . ( ) , .

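A formal context is a triple $(G,M,I)$, where $G$ and $M$ are sets and $I \subseteq G \times M$ is a binary relation. The elements of $G$ and $M$ are called objects and attributes, respectively. As usual, we write $gIm$ instead of $\langle g,m\rangle \in I$, meaning that object $g$ has attribute $m$.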

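For $A\subseteq G$ and $B\subseteq M$ define

$A' = \{ m \in M \mid \forall g \in A\ (gIm) \}, \\ B' = \{ g \in G \mid \forall m \in B\ (gIm) \};$

so $A'$ is the set of attributes common to all objects in $A$, and $B'$ is the set of objects that have all attributes in $B$. The maps $(\cdot)': 2^G \rightarrow 2^M$ and $(\cdot)':2^M\rightarrow 2^G$ are called the derivation operators of the context $(G,M,I)$; together they form a Galois connection.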

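A formal concept of the context $(G,M,I)$ is a pair $\langle A,B \rangle$, where $A\subseteq G$, $B\subseteq M$, $A'=B$, and $B'=A$. The set $A$ is called the extent of the concept $\langle A,B \rangle$, and $B$ is called its intent. The set of all concepts of the context $(G,M,I)$ is denoted by $L(G,M,I)$.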
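As an illustration of these definitions, here is a minimal Python sketch. The tiny context is an assumption made for the example (it reuses the four training rows of the introductory table under the hypothetical object names `g1` to `g4`), and the function names are not taken from the VKF-system.

```python
# A small context: each object maps to the set of attributes it has.
# Object names g1..g4 are hypothetical labels for the four training rows.
CONTEXT = {
    "g1": {"A", "C", "D", "E"},
    "g2": {"B", "E"},
    "g3": {"A", "C", "D"},
    "g4": {"B", "D"},
}
ATTRIBUTES = set("ABCDE")


def common_attributes(objects):
    """A': attributes shared by every object in the given set."""
    return {m for m in ATTRIBUTES if all(m in CONTEXT[g] for g in objects)}


def common_objects(attributes):
    """B': objects that have every attribute in the given set."""
    return {g for g, row in CONTEXT.items() if set(attributes) <= row}


def is_concept(extent, intent):
    """Check the two conditions A' = B and B' = A."""
    return (common_attributes(extent) == set(intent)
            and common_objects(intent) == set(extent))


print(is_concept({"g1", "g2"}, {"E"}))   # extent and intent determine each other
print(is_concept({"g1"}, {"A"}))         # {g1}' is larger than {A}
```

Note that `common_attributes(set())` returns all of $M$, because a universally quantified condition over the empty set holds trivially; this is what makes $\langle M', M\rangle$ a concept.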

It is well known that $L(G,M,I)$ forms a lattice with the operations

$\langle A_{1},B_{1}\rangle\vee\langle A_{2},B_{2}\rangle= \langle(A_{1}\cup A_{2})'',B_{1}\cap B_{2}\rangle, \\ \langle A_{1},B_{1}\rangle\wedge\langle A_{2},B_{2}\rangle= \langle A_{1}\cap A_{2},(B_{1}\cup B_{2})''\rangle.$

For $\langle A,B \rangle\in L(G,M,I)$, $g\in G$, and $m\in M$, define

$CbO(\langle A,B\rangle,g) = \langle(A\cup\{g\})'',B\cap\{g\}'\rangle,\\ CbO(\langle A,B\rangle,m) = \langle A\cap\{m\}',(B\cup\{m\})''\rangle.$

These operations are called CbO after the well-known Close-by-One (CbO) algorithm, which generates all elements of $L(G,M,I)$.
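The two CbO operations translate almost literally into code. The following sketch assumes a small hypothetical context stored as a dictionary of attribute sets; `up` and `down` stand for the two maps $(\cdot)'$, and all names are illustrative only.

```python
# Hypothetical context: object name -> set of attributes.
CONTEXT = {
    "g1": {"A", "C", "D", "E"},
    "g2": {"B", "E"},
    "g3": {"A", "C", "D"},
    "g4": {"B", "D"},
}
ATTRIBUTES = set("ABCDE")


def up(objects):
    """A': attributes common to all given objects."""
    return {m for m in ATTRIBUTES if all(m in CONTEXT[g] for g in objects)}


def down(attributes):
    """B': objects having all given attributes."""
    return {g for g, row in CONTEXT.items() if attributes <= row}


def cbo_object(extent, intent, g):
    """CbO(<A,B>, g) = <(A u {g})'', B n {g}'>."""
    return down(up(extent | {g})), intent & CONTEXT[g]


def cbo_attribute(extent, intent, m):
    """CbO(<A,B>, m) = <A n {m}', (B u {m})''>."""
    return extent & down({m}), up(down(intent | {m}))


bottom = (down(ATTRIBUTES), set(ATTRIBUTES))   # <M', M>
top = (set(CONTEXT), up(CONTEXT))              # <G, G'>

step1 = cbo_object(*bottom, "g2")     # climb from the bottom by adding g2
step2 = cbo_object(*step1, "g1")      # then add g1
step3 = cbo_attribute(*top, "D")      # descend from the top by adding D
print(step1, step2, step3)
```

Adding an object moves a concept upward in the lattice and adding an attribute moves it downward, which is why the same pair of operations can drive a walk over $L(G,M,I)$ in both directions.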

Monotonicity of the CbO operations

Let $(G,M,I)$ be a context, $\langle A_{1},B_{1}\rangle, \langle A_{2},B_{2}\rangle\in L(G,M,I)$, $g\in G$, and $m\in M$. Then

$\langle A_{1},B_{1}\rangle\leq \langle A_{2},B_{2}\rangle\Rightarrow CbO(\langle A_{1},B_{1}\rangle,g)\leq CbO(\langle A_{2},B_{2}\rangle,g), \\ \langle A_{1},B_{1}\rangle\leq\langle A_{2},B_{2}\rangle\Rightarrow CbO(\langle A_{1},B_{1}\rangle,m)\leq CbO(\langle A_{2},B_{2}\rangle,m).$

2. Problems of FCA

, :

( ) .
(NP-).
.
'' , .

The first problem is illustrated by the following context:

$G\backslash M$	$m_{1}$	$m_{2}$	$\ldots$	$m_{n}$
$g_{1}$	0	1	$\ldots$	1
$g_{2}$	1	0	$\ldots$	1
$\vdots$	$\vdots$	$\vdots$	$\ddots$	$\vdots$
$g_{n}$	1	1	$\ldots$	0

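It is easy to check that every pair $\langle G\setminus \{g_{i_{1}},\ldots,g_{i_{k}}\},\{m_{i_{1}},\ldots,m_{i_{k}}\}\rangle$ is a concept of this context. Hence the context has $2^n$ concepts.

For example, for $n=32$ the context itself occupies 128 bytes ($32\times 32$ bits), whereas its $2^n$ concepts require $2^{37}$ bits, i.e. 16 gigabytes!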
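The exponential growth can be checked by brute force for small $n$. The sketch below builds the context from the table (object $g_i$ has every attribute except $m_i$), closes every attribute subset, and counts the distinct intents. The memory figures at the end rest on an assumption that is not stated explicitly above: one bit per table cell, and one $n$-bit intent per stored concept.

```python
from itertools import combinations


def count_concepts(n):
    """Count concepts of the n x n context where g_i lacks only m_i."""
    rows = {g: {m for m in range(n) if m != g} for g in range(n)}
    intents = set()
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            # B': objects having all chosen attributes.
            extent = [g for g in range(n) if chosen <= rows[g]]
            # B'': attributes common to all those objects.
            closure = frozenset(
                m for m in range(n) if all(m in rows[g] for g in extent)
            )
            intents.add(closure)
    return len(intents)


for n in range(1, 6):
    print(n, count_concepts(n))

# Memory estimate for n = 32 (assumed encoding: 1 bit per cell, n bits per intent).
n = 32
context_bytes = n * n // 8
concept_bits = 2 ** n * n
concept_gib = concept_bits // 8 // 2 ** 30
print(context_bytes, concept_bits == 2 ** 37, concept_gib)
```

The enumeration itself is exponential, so it is only meant as a sanity check for very small $n$.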

2 . .. (- ).

3 4 . , "" -, . — , , "" -

$1-e^{-a}-a\cdot{e^{-a}}\cdot\left[1-e^{-c\cdot\sqrt{a}}\right],$

( ... ) $p=\sqrt{a/n}\to 0$ , - $m=c\cdot\sqrt{n}\to\infty$ , $n\to\infty$ .

, $1-e^{-a}-a\cdot{e^{-a}}$ , , $a$ >1.

3. Probabilistic algorithms

- . ( - ).

, , , .

(, , , ). .

, , (-).

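input:  context (G,M,I), external function CbO( , )
result: random concept <A,B>
X = G U M;
A = M'; B = M;   // <A,B> starts at the least concept
C = G; D = G';   // <C,D> starts at the greatest concept
while (A != C || B != D) {
        x = random element of X;
        <A,B> = CbO(<A,B>,x);
        <C,D> = CbO(<C,D>,x);
}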
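The pseudocode above can be turned into a short runnable sketch. Everything concrete here is an assumption made for illustration: the four-object context, the tagging of elements of $X$ as `("obj", g)` or `("attr", m)` so that objects and attributes cannot be confused, and the use of a seeded `random.Random` for reproducibility.

```python
import random

# Hypothetical context: object name -> set of attributes.
CTX = {
    "g1": {"A", "C", "D", "E"},
    "g2": {"B", "E"},
    "g3": {"A", "C", "D"},
    "g4": {"B", "D"},
}
ATTRS = set("ABCDE")


def up(objects):
    """A': attributes common to all given objects."""
    return frozenset(m for m in ATTRS if all(m in CTX[g] for g in objects))


def down(attributes):
    """B': objects having all given attributes."""
    return frozenset(g for g, row in CTX.items() if attributes <= row)


def cbo(concept, x):
    """Apply the CbO operation for a tagged element x of X = G u M."""
    extent, intent = concept
    kind, name = x
    if kind == "obj":
        return down(up(extent | {name})), intent & CTX[name]
    return extent & down({name}), up(down(intent | {name}))


def random_concept(rng):
    """Run both chains with the same random choices until they meet."""
    xs = [("obj", g) for g in sorted(CTX)] + [("attr", m) for m in sorted(ATTRS)]
    low = (down(ATTRS), frozenset(ATTRS))    # <M', M>
    high = (frozenset(CTX), up(CTX))         # <G, G'>
    while low != high:
        x = rng.choice(xs)
        low, high = cbo(low, x), cbo(high, x)
    return low


extent, intent = random_concept(random.Random(0))
print(sorted(extent), sorted(intent))
```

Because both chains receive the same element $x$ at every step, the monotonicity of the CbO operations keeps the lower chain below the upper one, so the loop can only stop at a genuine concept.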

. , ( )

$\frac{(n+k)^2}{2k\cdot n}=2+\frac{(n-k)^2}{2k\cdot n} \geq 2$

where $n$ is the number of attributes and $k$ is the number of objects (training examples).

, .. .

4. Machine learning based on the VKF-method

, , .

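Let $(G,M,I)$ be the training context, and let $O$ be the set of counter-examples (obstacles).

Let $T$ be the set of test examples.

Let $\langle A,B\rangle\in L(G,M,I)$ be a candidate. An element $o\in O$ is an obstacle for $\langle A,B\rangle$ if $B\subseteq \{o\}'$; a candidate with no obstacle $o\in O$ is accepted as a VKF-hypothesis.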

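input:  N - number of VKF-hypotheses to generate
result: S - random sample of VKF-hypotheses
i = 0; S = {};
while (i<N) {
        generate random candidate <A,B> for (G,M,I);
        hasObstacle = false;
        for (o in O) {
            if (B is a part of {o}') hasObstacle = true;
        }
        if (hasObstacle == false) {
                S = S U {<A,B>};
                i = i+1;
        }
}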
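A runnable sketch of this induction loop is given below. Two things in it are assumptions rather than the system's actual behavior: the candidate generator `random_candidate` is a simple stand-in (it closes a random subset of the training rows instead of running the coupled chains), and the `max_trials` guard is added so that the loop stops even if every candidate meets an obstacle. The data are the rows of the introductory table.

```python
import random

# Training rows (positive examples) and obstacles (negative examples).
POSITIVES = {1: {"A", "C", "D", "E"}, 2: {"B", "E"}}
OBSTACLES = {3: {"A", "C", "D"}, 4: {"B", "D"}}


def random_candidate(rng):
    """Stand-in generator (an assumption): close a random subset of rows."""
    size = rng.randint(1, len(POSITIVES))
    parents = rng.sample(sorted(POSITIVES), size)
    fragment = frozenset(set.intersection(*(POSITIVES[g] for g in parents)))
    extent = frozenset(g for g, row in POSITIVES.items() if fragment <= row)
    return extent, fragment


def has_obstacle(fragment):
    """True if the fragment is a part of some obstacle's description."""
    return any(fragment <= row for row in OBSTACLES.values())


def induce(n_hypotheses, rng, max_trials=10_000):
    """Collect candidates without obstacles until n_hypotheses are accepted."""
    accepted, hypotheses = 0, set()
    for _ in range(max_trials):
        if accepted >= n_hypotheses:
            break
        extent, fragment = random_candidate(rng)
        if not has_obstacle(fragment):
            hypotheses.add((extent, fragment))   # set union, as in S = S U {<A,B>}
            accepted += 1
    return hypotheses


S = induce(20, random.Random(1))
for extent, fragment in sorted(S, key=lambda h: sorted(h[1])):
    print(sorted(extent), sorted(fragment))
```

As in the pseudocode, `S` is a set, so a hypothesis drawn several times is stored once while the counter still advances.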

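The condition $B\subseteq\{o\}'$ means that the fragment $B$ of the candidate $\langle{A,B}\rangle$ is a part of the description of the counter-example $o$.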

, -.

(, "--") , - .

input:  T - set of test examples whose target property is to be predicted
input:  S - random sample of VKF-hypotheses
for (x in T) {
        target(x) = false;
        for (<A,B> in S) {
            if (B is a part of {x}') target(x) = true;
        }
}
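The prediction loop is a direct subset test. In the sketch below, the sample of hypotheses and the test examples are hypothetical values chosen for illustration (the single hypothesis and the first test row come from the introductory table), and `predict` is an illustrative name.

```python
def predict(test_examples, hypotheses):
    """Mark a test example positive if some hypothesis fragment is a part of it."""
    return {
        name: any(fragment <= features for _, fragment in hypotheses)
        for name, features in test_examples.items()
    }


# Hypothetical inputs: one hypothesis <{1,2},{E}> and two test descriptions.
S = [(frozenset({1, 2}), frozenset({"E"}))]
T = {"t1": {"A", "C", "E"}, "t2": {"B", "D"}}

targets = predict(T, S)
print(targets)
```

Using `any` stops scanning the hypotheses for an example as soon as one matching fragment is found, which the original double loop does not do but which yields the same result.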

, - .

An object $x$ is called $\varepsilon$-important if the probability of the VKF-hypotheses $\langle A,B\rangle$ with $B\subseteq\{x\}'$ exceeds $\varepsilon$.

$N$ , .

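For $n=\left|{M}\right|$, any $\varepsilon>0$, and any $1>\delta>0$, a random sample $S$ of VKF-hypotheses of size

$N\geq{\frac{2\cdot(n+1)-2\cdot\log_{2}{\delta} }{\varepsilon}}$

has, with probability $>{1-\delta}$, the property that every $\varepsilon$-important object $x$ contains the fragment of some VKF-hypothesis $\langle A,B\rangle\in{S}$, i.e. $B\subseteq\{x\}'$.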
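To get a feeling for the bound, it can be evaluated numerically. The helper below simply rounds the right-hand side of the inequality up to an integer; the function name and the sample parameter values are chosen for illustration only.

```python
import math


def hypotheses_needed(n_attributes, eps, delta):
    """Smallest integer N with N >= (2(n+1) - 2*log2(delta)) / eps."""
    if not (eps > 0 and 0 < delta < 1):
        raise ValueError("require eps > 0 and 0 < delta < 1")
    bound = (2 * (n_attributes + 1) - 2 * math.log2(delta)) / eps
    return math.ceil(bound)


print(hypotheses_needed(5, 0.5, 0.25))    # log2(0.25) = -2, so (12 + 4) / 0.5
print(hypotheses_needed(5, 0.1, 0.01))
print(hypotheses_needed(100, 0.05, 0.01))
```

The bound grows linearly in the number of attributes $n$ and in $1/\varepsilon$, but only logarithmically in $1/\delta$, so demanding higher confidence is cheap compared with lowering $\varepsilon$.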

. .. . .. .

, . "-" . .. .

Discrete features will again require some FCA technique. Continuous features will require logistic regression, entropy principles for dividing ranges into subintervals, and a convex hull representation of the intervals whose similarity is calculated.

The author is pleased to have this opportunity to thank his colleagues and students for their support and incentives.
