This is the fourth article in a series (links to the first, second, and third articles) devoted to a machine learning system based on lattice theory, called the "VKF-system". The program uses algorithms based on Markov chains to generate the causes of the target property by computing a random subset of similarities between groups of training objects. This article describes how objects are represented as bit strings, so that similarities can be computed by bitwise multiplication of the corresponding representations. Objects with discrete features require a technique from Formal Concept Analysis. The case of objects with continuous features uses logistic regression, a partition of the range of values into subintervals by means of information theory, and a representation corresponding to the convex hull of the compared intervals.
1 Discrete features
, , - . , ""/"". 'null' ( '_' ), () .
. . , .
( , ), () .
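For a finite lattice $\langle L, \wedge, \vee \rangle$, let $G$ (objects) be the set of all its $\wedge$-irreducible elements and $M$ (attributes) the set of all its $\vee$-irreducible elements. With the relation $gIm \Leftrightarrow g \geq m$, the context $(G, M, I)$ generates the lattice $L(G, M, I)$, which is isomorphic to $\langle L, \wedge, \vee \rangle$.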
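An element $x \in L$ of a lattice $\langle L, \wedge, \vee \rangle$ is called $\vee$-irreducible if $x \neq \emptyset$ and, for all $y, z \in L$, $y < x$ and $z < x$ imply $y \vee z < x$.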
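An element $x \in L$ of a lattice $\langle L, \wedge, \vee \rangle$ is called $\wedge$-irreducible if $x \neq T$ and, for all $y, z \in L$, $x < y$ and $x < z$ imply $x < y \wedge z$.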
β§- , , β¨- , .
( . (L,L,β₯))
G\M | h | i | j | k |
---|---|---|---|---|
a | 1 | 1 | 1 | 0 |
b | 0 | 1 | 1 | 1 |
c | 1 | 1 | 0 | 0 |
d | 1 | 0 | 1 | 0 |
f | 0 | 1 | 0 | 1 |
g | 0 | 0 | 1 | 1 |
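As a minimal sketch of the bitwise multiplication mentioned in the introduction, the rows of the table above can be stored as bit strings and the similarity of several objects computed as their bitwise AND. The names `ROWS`, `similarity`, and `attributes` are hypothetical and are not part of the VKF-system API.

```python
# Rows of the example table, one bit per attribute in the order h, i, j, k.
ATTRS = "hijk"
ROWS = {
    "a": "1110",
    "b": "0111",
    "c": "1100",
    "d": "1010",
    "f": "0101",
    "g": "0011",
}


def similarity(*names: str) -> int:
    """Bitwise AND (bitwise multiplication) of the rows of the given objects."""
    result = (1 << len(ATTRS)) - 1  # start from the all-ones string
    for name in names:
        result &= int(ROWS[name], 2)
    return result


def attributes(value: int) -> set:
    """Attributes whose bit is set in an encoded similarity."""
    bits = format(value, f"0{len(ATTRS)}b")
    return {attr for attr, bit in zip(ATTRS, bits) if bit == "1"}


if __name__ == "__main__":
    print(attributes(similarity("a", "b")))       # common attributes of a and b
    print(attributes(similarity("a", "b", "c")))  # common attributes of a, b, c
```

Storing each row as an integer makes the similarity of any group of objects a chain of `&` operations, whatever the number of attributes.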
, .
, 121 , 24 !
, :
- .
- β₯ , ( β¨- ).
- (β¨-) .
- .
CPython-: 'vkfencoder' vkfencoder.XMLImport 'vkf' vkf.FCA. β : vkf.FCA MariaDB, vkfencoder.XMLImport XML .
2 Continuous features
. C4.5 .
, .
, , , . .
2.1
, . .
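Let $E = G \cup O$ be the disjoint union of two sets of training objects, $G$ and $O$. An interval $[a, b) \subseteq \mathbb{R}$ of values of a continuous feature $V: G \to \mathbb{R}$ determines three subsets:

$$G[a,b) = \{g \in G : a \le V(g) < b\}, \quad O[a,b) = \{g \in O : a \le V(g) < b\}, \quad E[a,b) = \{g \in E : a \le V(g) < b\}.$$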
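The entropy of an interval $[a, b) \subseteq \mathbb{R}$ of values of a continuous feature $V: G \to \mathbb{R}$ is

$$\mathrm{ent}[a,b) = -\frac{|G[a,b)|}{|E[a,b)|} \cdot \log_2\left(\frac{|G[a,b)|}{|E[a,b)|}\right) - \frac{|O[a,b)|}{|E[a,b)|} \cdot \log_2\left(\frac{|O[a,b)|}{|E[a,b)|}\right).$$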
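For $a < r < b$, the mean information of the partition of the interval $[a, b) \subseteq \mathbb{R}$ of values of a continuous feature $V: G \to \mathbb{R}$ at the point $r$ is

$$\mathrm{inf}[a,r,b) = \frac{|E[a,r)|}{|E[a,b)|} \cdot \mathrm{ent}[a,r) + \frac{|E[r,b)|}{|E[a,b)|} \cdot \mathrm{ent}[r,b).$$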
β V=r .
V:GβR a=min{V} v0, vl+1 , b=max{V}. {v1<β¦<vl} .
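The quantities $\mathrm{ent}[a,b)$ and $\mathrm{inf}[a,r,b)$ can be sketched in a few lines. The selection rule in `best_threshold` (take the candidate $r$ with minimal $\mathrm{inf}[a,r,b)$, as in C4.5-style discretization) is an assumption made for illustration; the function names and the toy data are hypothetical.

```python
from math import log2


def count(values, a, b):
    """Number of feature values falling into the half-open interval [a, b)."""
    return sum(1 for v in values if a <= v < b)


def ent(g_vals, o_vals, a, b):
    """Entropy of [a, b); g_vals and o_vals hold V(g) for g in G and in O."""
    g, o = count(g_vals, a, b), count(o_vals, a, b)
    e = g + o
    result = 0.0
    for part in (g, o):
        if part:  # the term 0 * log2(0) is taken to be 0
            p = part / e
            result -= p * log2(p)
    return result


def inf(g_vals, o_vals, a, r, b):
    """Mean information of splitting [a, b) into [a, r) and [r, b)."""
    e_ab = count(g_vals, a, b) + count(o_vals, a, b)
    if e_ab == 0:
        return 0.0
    e_ar = count(g_vals, a, r) + count(o_vals, a, r)
    e_rb = e_ab - e_ar
    return (e_ar / e_ab * ent(g_vals, o_vals, a, r)
            + e_rb / e_ab * ent(g_vals, o_vals, r, b))


def best_threshold(g_vals, o_vals, a, b):
    """Assumed rule: the observed value r in (a, b) with minimal inf[a, r, b)."""
    candidates = sorted({v for v in g_vals + o_vals if a < v < b})
    if not candidates:
        return None
    return min(candidates, key=lambda r: inf(g_vals, o_vals, a, r, b))


if __name__ == "__main__":
    G = [1.0, 2.0, 3.0]  # toy values of V on the first class
    O = [6.0, 7.0, 8.0]  # toy values of V on the second class
    print(ent(G, O, 1.0, 9.0), best_threshold(G, O, 1.0, 9.0))
```

Applying the same function recursively to the two subintervals would give a sequence of thresholds; the stopping rule is not fixed by this sketch.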
2.2
2l, l β . ()
$$\delta_i^V(g) = 1 \Leftrightarrow V(g) \ge v_i, \qquad \sigma_i^V(g) = 1 \Leftrightarrow V(g) < v_i,$$

where $1 \le i \le l$.
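The bit string $\delta_1^V(g) \ldots \delta_l^V(g)\,\sigma_1^V(g) \ldots \sigma_l^V(g)$ represents the value of the continuous feature $V$ on the object $g \in E$.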
, β .
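Let the string $\delta_1^{(1)} \ldots \delta_l^{(1)}\,\sigma_1^{(1)} \ldots \sigma_l^{(1)}$ represent $v_i \le V(A_1) < v_j$, and let $\delta_1^{(2)} \ldots \delta_l^{(2)}\,\sigma_1^{(2)} \ldots \sigma_l^{(2)}$ represent $v_n \le V(A_2) < v_m$. Then the bitwise product

$$(\delta_1^{(1)} \cdot \delta_1^{(2)}) \ldots (\delta_l^{(1)} \cdot \delta_l^{(2)})\,(\sigma_1^{(1)} \cdot \sigma_1^{(2)}) \ldots (\sigma_l^{(1)} \cdot \sigma_l^{(2)})$$

represents

$$\min\{v_i, v_n\} \le V((A_1 \cup A_2)'') < \max\{v_j, v_m\}.$$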
The string $0 \ldots 00 \ldots 0$ corresponds to $\min\{V\} \le V((A_1 \cup A_2)'') \le \max\{V\}$.
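The encoding and its convex-hull property can be checked with a short sketch. It assumes the thresholds $v_1 < \ldots < v_l$ are already known; the helper names `encode`, `product`, and `decode`, and the sample thresholds, are hypothetical.

```python
def encode(value, thresholds):
    """Bit string delta_1..delta_l sigma_1..sigma_l for a feature value."""
    delta = [int(value >= v) for v in thresholds]  # delta_i = 1 iff V >= v_i
    sigma = [int(value < v) for v in thresholds]   # sigma_i = 1 iff V < v_i
    return delta + sigma


def product(x, y):
    """Bitwise multiplication of two encoded strings."""
    return [p & q for p, q in zip(x, y)]


def decode(bits, thresholds, lo, hi):
    """Interval (lower, upper) described by an encoded string.

    lo and hi stand for min{V} and max{V}; they are returned when no
    delta bit (respectively no sigma bit) is set.
    """
    l = len(thresholds)
    delta, sigma = bits[:l], bits[l:]
    lower = max((v for v, d in zip(thresholds, delta) if d), default=lo)
    upper = min((v for v, s in zip(thresholds, sigma) if s), default=hi)
    return lower, upper


if __name__ == "__main__":
    thresholds = [2.0, 4.0, 6.0]
    x = encode(3.0, thresholds)  # value in [2, 4)
    y = encode(5.0, thresholds)  # value in [4, 6)
    print(decode(product(x, y), thresholds, 0.0, 10.0))  # hull of the two
```

The product keeps only the lower bounds shared by both strings and only the upper bounds shared by both, which is why it describes the convex hull of the two intervals.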
2.3
. ( 1). . , .
pi1β¨...β¨pik pi1+...+pik>Ο 0<Ο<1.
,
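A classifier is a map $c: \mathbb{R}^d \to \{0,1\}$, where $\mathbb{R}^d$ is the space of objects (described by $d$ features) and $\{0,1\}$ is the set of labels.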
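Consider a random pair $\langle \vec{X}, K \rangle \in \mathbb{R}^d \times \{0,1\}$ with the joint density

$$p_{\vec{X},K}(\vec{x},k) = p_{\vec{X}}(\vec{x}) \cdot p_{K \mid \vec{X}}(k \mid \vec{x}),$$

where $p_{\vec{X}}(\vec{x})$ is the marginal density of the objects and $p_{K \mid \vec{X}}(k \mid \vec{x})$ is the conditional distribution of the labels, i.e. for every $\vec{x} \in \mathbb{R}^d$

$$p_{K \mid \vec{X}}(k \mid \vec{x}) = P\{K = k \mid \vec{X} = \vec{x}\}.$$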
The error probability of a classifier $c: \mathbb{R}^d \to \{0,1\}$ is

$$R(c) = P\{c(\vec{X}) \neq K\}.$$
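The Bayes classifier $b: \mathbb{R}^d \to \{0,1\}$ with respect to $p_{K \mid \vec{X}}(k \mid \vec{x})$ is defined by

$$b(\vec{x}) = 1 \Leftrightarrow p_{K \mid \vec{X}}(1 \mid \vec{x}) > \frac{1}{2} > p_{K \mid \vec{X}}(0 \mid \vec{x}).$$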
The classifier $b$ is optimal:

$$\forall c: \mathbb{R}^d \to \{0,1\} \; \left[ R(b) = P\{b(\vec{X}) \neq K\} \le R(c) \right].$$
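$$p_{K \mid \vec{X}}(1 \mid \vec{x}) = \frac{p_{\vec{X} \mid K}(\vec{x} \mid 1) \cdot P\{K=1\}}{p_{\vec{X} \mid K}(\vec{x} \mid 1) \cdot P\{K=1\} + p_{\vec{X} \mid K}(\vec{x} \mid 0) \cdot P\{K=0\}} = \frac{1}{1 + \dfrac{p_{\vec{X} \mid K}(\vec{x} \mid 0) \cdot P\{K=0\}}{p_{\vec{X} \mid K}(\vec{x} \mid 1) \cdot P\{K=1\}}} = \frac{1}{1 + \exp\{-a(\vec{x})\}} = \sigma(a(\vec{x})),$$

where $a(\vec{x}) = \log \dfrac{p_{\vec{X} \mid K}(\vec{x} \mid 1) \cdot P\{K=1\}}{p_{\vec{X} \mid K}(\vec{x} \mid 0) \cdot P\{K=0\}}$ and $\sigma(y) = \dfrac{1}{1 + \exp\{-y\}}$ is the logistic function (sigmoid).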
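The identity $p_{K \mid \vec{X}}(1 \mid \vec{x}) = \sigma(a(\vec{x}))$ can be checked numerically. The sketch below assumes, purely for illustration, one-dimensional Gaussian class-conditional densities; the densities, the prior, and the function names are hypothetical.

```python
from math import exp, log, pi, sqrt


def gauss(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2.0 * pi))


def sigmoid(y):
    return 1.0 / (1.0 + exp(-y))


def posterior(x, prior1, dens1, dens0):
    """P{K=1 | X=x} computed directly by Bayes' rule."""
    num1 = dens1(x) * prior1
    num0 = dens0(x) * (1.0 - prior1)
    return num1 / (num1 + num0)


def log_odds(x, prior1, dens1, dens0):
    """a(x) = log of the ratio of the two numerators above."""
    return log(dens1(x) * prior1 / (dens0(x) * (1.0 - prior1)))


def bayes(x, prior1, dens1, dens0):
    """Bayes classifier: label 1 iff the posterior of class 1 exceeds 1/2."""
    return int(posterior(x, prior1, dens1, dens0) > 0.5)


if __name__ == "__main__":
    d1 = lambda x: gauss(x, 2.0, 1.0)  # assumed density of class 1
    d0 = lambda x: gauss(x, 0.0, 1.0)  # assumed density of class 0
    for x in (-1.0, 0.5, 1.5, 3.0):
        print(x, posterior(x, 0.5, d1, d0), sigmoid(log_odds(x, 0.5, d1, d0)))
```

Both columns of the printout agree up to rounding, which is the content of the derivation above.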
2.4
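Logistic regression approximates $a(\vec{x}) = \log \dfrac{p_{\vec{X} \mid K}(\vec{x} \mid 1) \cdot P\{K=1\}}{p_{\vec{X} \mid K}(\vec{x} \mid 0) \cdot P\{K=0\}}$ by a linear combination $\vec{w}^T \cdot \varphi(\vec{x})$ of basis functions $\varphi_i: \mathbb{R}^d \to \mathbb{R}$ $(i = 1, \ldots, m)$ with weights $\vec{w} \in \mathbb{R}^m$.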
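For a training sample $\langle \vec{x}_1, k_1 \rangle, \ldots, \langle \vec{x}_n, k_n \rangle$, set $t_j = 2k_j - 1$. The log-likelihood is

$$\log\{p(t_1, \ldots, t_n \mid \vec{x}_1, \ldots, \vec{x}_n, \vec{w})\} = -\sum_{j=1}^{n} \log\left[1 + \exp\left\{-t_j \sum_{i=1}^{m} w_i \varphi_i(\vec{x}_j)\right\}\right].$$

Hence, logistic regression reduces to the maximization problem

$$L(w_1, \ldots, w_m) = -\sum_{j=1}^{n} \log\left[1 + \exp\left\{-t_j \sum_{i=1}^{m} w_i \varphi_i(\vec{x}_j)\right\}\right] \to \max.$$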
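The Newton-Raphson method gives the iteration

$$\vec{w}_{t+1} = \vec{w}_t - \left(\nabla_{\vec{w}^T} \nabla_{\vec{w}} L(\vec{w}_t)\right)^{-1} \cdot \nabla_{\vec{w}} L(\vec{w}_t).$$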
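With $s_j = \dfrac{1}{1 + \exp\{t_j \cdot (\vec{w}^T \cdot \Phi(\vec{x}_j))\}}$ and $\Phi$ the matrix whose rows are $\Phi(\vec{x}_j)^T$, we have

$$\nabla L(\vec{w}) = \Phi^T \mathrm{diag}(t_1, \ldots, t_n)\,\vec{s}, \qquad \nabla\nabla L(\vec{w}) = -\Phi^T R\,\Phi,$$

where $R = \mathrm{diag}(s_1(1-s_1), s_2(1-s_2), \ldots, s_n(1-s_n))$ is the diagonal matrix with the elements $s_1(1-s_1), s_2(1-s_2), \ldots, s_n(1-s_n)$, and $\mathrm{diag}(t_1, \ldots, t_n)\,\vec{s}$ is the vector with the coordinates $t_1 s_1, t_2 s_2, \ldots, t_n s_n$. Therefore

$$\vec{w}_{t+1} = \vec{w}_t + (\Phi^T R\,\Phi)^{-1} \Phi^T \mathrm{diag}(t)\,\vec{s} = (\Phi^T R\,\Phi)^{-1} \Phi^T R\,\vec{z},$$

where $\vec{z} = \Phi \vec{w}_t + R^{-1} \mathrm{diag}(t_1, \ldots, t_n)\,\vec{s}$ is recomputed at every iteration.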
, - -
$$\vec{w}_{t+1} = (\Phi^T R\,\Phi + \lambda \cdot I)^{-1} \cdot (\Phi^T R\,\vec{z}).$$
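The regularized iteration above can be sketched in pure Python. This is an illustrative implementation under stated assumptions, not the VKF-system code: the function names, the fixed number of steps, the value of `lam`, and the toy data are hypothetical. The product $R\vec{z}$ is computed as $R\,\Phi\vec{w}_t + \mathrm{diag}(t)\,\vec{s}$, which avoids dividing by the entries of $R$.

```python
from math import exp


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def scores(phi, t, w):
    """s_j = 1 / (1 + exp(t_j * w^T phi_j)) and the linear parts a_j."""
    a = [sum(wi * f for wi, f in zip(w, row)) for row in phi]
    s = [1.0 / (1.0 + exp(tj * aj)) for tj, aj in zip(t, a)]
    return a, s


def irls_ridge(phi, t, lam=1.0, steps=25):
    """Iterate w <- (Phi^T R Phi + lam*I)^{-1} Phi^T R z from w = 0."""
    n, m = len(phi), len(phi[0])
    w = [0.0] * m
    for _ in range(steps):
        a, s = scores(phi, t, w)
        r = [sj * (1.0 - sj) for sj in s]
        # R z = R Phi w + diag(t) s
        rz = [rj * aj + tj * sj for rj, aj, tj, sj in zip(r, a, t, s)]
        lhs = [[sum(r[j] * phi[j][p] * phi[j][q] for j in range(n))
                + (lam if p == q else 0.0) for q in range(m)]
               for p in range(m)]
        rhs = [sum(phi[j][p] * rz[j] for j in range(n)) for p in range(m)]
        w = solve(lhs, rhs)
    return w


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ks = [0, 0, 0, 1, 1, 1]
    phi = [[1.0, x] for x in xs]        # basis functions: constant and x
    t = [2 * k - 1 for k in ks]         # t_j = 2 k_j - 1
    print(irls_ridge(phi, t))
```

At a fixed point of this iteration, $\lambda \vec{w} = \Phi^T \mathrm{diag}(t)\,\vec{s}$, so the term $\lambda \cdot I$ both keeps the matrix invertible and keeps the weights bounded when the two classes are linearly separable, as in the toy data.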
"-" : 1 .
, . :
- Vk ,
$$R^2 = 1 - \exp\{2(L(w_0, \ldots, w_{k-1}) - L(w_0, \ldots, w_{k-1}, w_k))/n\} \ge \sigma$$
Vk ,
$$1 - \frac{L(w_0, \ldots, w_{k-1}, w_k)}{L(w_0, \ldots, w_{k-1})} \ge \sigma$$
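Both statistics above depend only on the two maximized log-likelihoods and, for the first one, on the sample size $n$. The sketch below assumes that a feature $V_k$ is accepted when the chosen statistic reaches the threshold $\sigma$; the function names and the numbers in the example are hypothetical.

```python
from math import exp


def r2_exponential(ll_without, ll_with, n):
    """1 - exp{2 (L(w_0..w_{k-1}) - L(w_0..w_k)) / n}."""
    return 1.0 - exp(2.0 * (ll_without - ll_with) / n)


def r2_ratio(ll_without, ll_with):
    """1 - L(w_0..w_k) / L(w_0..w_{k-1})."""
    return 1.0 - ll_with / ll_without


def accept_feature(statistic, sigma):
    """Assumed decision rule: keep V_k when the statistic is at least sigma."""
    return statistic >= sigma


if __name__ == "__main__":
    # Illustrative log-likelihoods (both negative, the larger model fits better).
    ll_without, ll_with, n = -70.0, -60.0, 100
    print(r2_exponential(ll_without, ll_with, n))
    print(r2_ratio(ll_without, ll_with))
```

Since adding a weight cannot decrease the maximized log-likelihood, both statistics are non-negative for nested models, and larger values indicate a larger gain from the feature $V_k$.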
"-" Wine Quality ( . ). . ( >7), .
( 2.3) "" "". ( ) , 0 1. " " "" .
The situation with the pair ("pH", "alcohol") was radically different: the "alcohol" weight was positive, while the "pH" weight was negative. An obvious logical transformation nevertheless yielded the implication ("pH" → "alcohol").
The author would like to thank his colleagues and students for their support and incentives.