For three years I worked in Serbia as an iOS evangelist. Along the way there were two specialized projects and one machine learning project.
If you are interested, welcome to the world of Hidden Markov Models (HMM).
Formulation of the problem
Consider an Austrian bank. It has many clients, and each client has an account with the bank. During the year a client spends money from the account: shopping, paying utility bills, and so on. Each withdrawal of money from the account is called a transaction. We are given a sequence of transactions over a certain period (say, a year). The task is to train a model so that it classifies each new transaction as valid or suspicious and issues a warning in the latter case. The solution must use a Hidden Markov Model.
Introduction to HMM
Suppose I get coronavirus every year for 10 days in a row. The rest of the year I am as healthy as a bull.
Let's represent the year as an array of 365 characters, where h means healthy and l means sick.
days{365} = {hhhhhhhhhhllllllllllhhhhhhhhhhhhhhhhhhhhhhhh...hhhhh}
Question: what is the probability that I am sick today?
Answer: 10/365, or about 3 percent.
, , 15 HMM. - .
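Question: if I am sick today, what is the probability that I will still be sick tomorrow?
Since the illness lasts 10 days, the answer is about 90 percent sick and 10 percent healthy.
And if I am healthy today?
Then tomorrow I am sick with probability 0.3 percent and healthy with probability 99.7 percent.
That gives 4 numbers, which form a 2 by 2 matrix. Every value lies between 0 and 1, and each row sums to 1.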
| | healthy tomorrow | sick tomorrow |
|---|---|---|
| healthy today | 0.997 | 0.003 |
| sick today | 0.10 | 0.90 |
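For example, if I am healthy today, then tomorrow I stay healthy with probability 0.997 and fall sick with probability 0.003.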
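The four numbers 0.997, 0.003, 0.10 and 0.90 can be checked by counting day-to-day transitions in the 365-character sequence. This is a minimal sketch, assuming the year looks like the array above (10 healthy days, 10 sick days, then 345 healthy days). The exact position of the illness is an assumption, and `transition_matrix` is a hypothetical helper name.

```python
from collections import Counter

# Assumed layout of the year: 10 healthy days, 10 sick days, 345 healthy days
days = "h" * 10 + "l" * 10 + "h" * 345


def transition_matrix(seq, states="hl"):
    """Estimate P(tomorrow = b | today = a) by counting adjacent pairs."""
    pair_counts = Counter(zip(seq, seq[1:]))
    matrix = {}
    for a in states:
        row_total = sum(pair_counts[(a, b)] for b in states)
        matrix[a] = {b: pair_counts[(a, b)] / row_total for b in states}
    return matrix


A = transition_matrix(days)
for a in "hl":
    print(a, {b: round(p, 3) for b, p in A[a].items()})
# h {'h': 0.997, 'l': 0.003}
# l {'h': 0.1, 'l': 0.9}
```

Counting pairs rather than single days is what turns the flat "3 percent" answer into a matrix: the probability of being sick tomorrow depends strongly on today's state.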
27.10.2020 00:00 GAZPROMNEFT AZS 219 2507,43 118 753,95 28.10.2020 / 298380
26.10.2020 14:45 SPAR 77 319,73 121 261,38 27.10.2020 / 220146
26.10.2020 14:38 ATM 60006475 4800,00 121 581,11 26.10.2020 / 213074
25.10.2020 17:52 EUROSPAR 18 970,02 126 381,11 26.10.2020 / 259110
25.10.2020 00:00 Tinkoff Card2Card 20000,00 127 351,13 26.10.2020 / 253237
22.10.2020 14:22 SBOL 4276 7000,00 147 351,13 22.10.2020 / 276951
22.10.2020 12:18 STOLOVAYA 185,00 154 351,13 23.10.2020 / 279502
21.10.2020 16:46 MEGAFON R9290499831 500,00 154 536,13 21.10.2020 / 224592
21.10.2020 14:17 SPAR 77 987,03 155 036,13 22.10.2020 / 219015
21.10.2020 13:42 PYATEROCHKA 646 289,93 156 023,16 22.10.2020 / 294539
21.10.2020 00:00 MEBEL 75,00 156 313,09 22.10.2020 / 279935
19.10.2020 14:54 SPAR 77 552,92 132 044,80 20.10.2020 / 208987
19.10.2020 00:00 MOBILE FEE 60,00 132 597,72 20.10.2020 / -
16.10.2020 14:19 SPAR 77 579,39 132 657,72 17.10.2020 / 229627
12.10.2020 13:33 STOLOVAYA 185,00 133 237,11 13.10.2020 / 261374
12.10.2020 00:00 OOO MASTERHOST 1000,00 133 422,11 13.10.2020 / 268065
11.10.2020 12:09 SPAR 77 782,87 134 422,11 12.10.2020 / 275816
10.10.2020 14:52 SBOL 400,00 135 204,98 10.10.2020 / 276925
09.10.2020 13:29 SBOL 5484* 1000,00 135 604,98 09.10.2020 / 229184
09.10.2020 11:55 MAGNIT MK KRYUCHYA 274,00 136 604,98 10.10.2020 / 209914
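import re
import pandas as pd

def readtrans():
    # Read the raw bank statement as one string
    with open("assets/trans.txt", "r") as file:
        text = file.read()
    # Capture the integer part of every number written with a decimal comma
    pattern = r'(\d{2,5}),\d\d'
    result = re.findall(pattern, text)
    # Each line holds the amount followed by the balance: keep every other match
    return list(map(int, result[0::2]))

data = readtrans()
t = list(range(len(data)))
df = pd.DataFrame({'number': t, 'amount': data})
# Bar chart of transaction amounts in the order they appear in the statement
ax1 = df.plot.bar(x='number', y='amount', rot=0, width=1.5)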
![](https://habrastorage.org/getpro/habr/upload_files/2a5/52e/9a0/2a552e9a04970c55a2ced821c25b1e1a.png)
Let's label small transactions (under 10$) as l, transactions over 100$ as h, and everything in between as m.
print(observations[:20])
trans[] = ['m', 'm', 'm', 'l', 'm', 'm', 'h', 'm', 'l', 'l', 'm', 'l', 'l', 'l', 'l', 'l', 'l', 'm', 'l', 'l']
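The labelling step (l for amounts under 10$, h for amounts over 100$, m otherwise) can be sketched as below. The function names are hypothetical, the sample amounts are made up, and the treatment of amounts exactly equal to a threshold is an assumption. The sketch also assumes the amounts are already expressed in the same currency as the thresholds.

```python
def to_symbol(amount, low=10.0, high=100.0):
    """Map a transaction amount to 'l' (low), 'm' (medium) or 'h' (high)."""
    if amount < low:
        return 'l'
    if amount > high:
        return 'h'
    # Assumption: amounts exactly on a threshold count as medium
    return 'm'


def to_observations(amounts, low=10.0, high=100.0):
    """Turn a list of amounts into the symbol sequence the model is trained on."""
    return [to_symbol(a, low, high) for a in amounts]


sample = [3.5, 42.0, 250.0, 10.0, 100.0]  # made-up amounts, not from the statement
print(to_observations(sample))  # ['l', 'm', 'h', 'm', 'm']
```

Keeping the thresholds as parameters makes it easy to try other cut-offs without touching the rest of the pipeline.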
This gives a 3 by 3 matrix, because there are 3 possible symbols = {l,m,h}
[[0.5 0.3 0.2]
[0.6 0.3 0.1]
[0.7 0.3 0.0]]
- , 0.7 , 0.3 - .
, . - . - .
- ?! - . , . (), , . .
, , . - , , , , ...
, . , ?! . , 4-6 . . . -. . , 300 .
So a 5 by 5 transition matrix (5 rows by 5 columns) has 20 free parameters.
[[a1 a2 a3 a4 a5]
[b1 b2 b3 b4 b5]
[c1 c2 c3 c4 c5]
[x1 x2 x3 x4 x5]
[y1 y2 y3 y4 y5]]
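Why 20 and not 25 (the number of cells)? Each row must sum to 1, so one value in each of the 5 rows is determined by the other four.
The second matrix (the emission matrix) is 5 by 3.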
? , a ( )
l-, m-, h-.
[0.96 0.04 0.0]
100 . .
So the two matrices have 20 and 10 free parameters respectively.
That is 20+10 = 30 parameters to fit!
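The counts 20 and 10 follow from the fact that every row of a probability matrix sums to 1, so the last entry of each row is fixed by the others. Here is a small sketch of that arithmetic, assuming 5 hidden states and 3 observation symbols (l, m, h). The function name is made up for illustration, and the initial state distribution is left out of the count.

```python
def free_parameters(n_states, n_symbols):
    """Count independent entries in the transition and emission matrices.

    Each row sums to 1, so one entry per row is determined by the rest.
    """
    transition = n_states * (n_states - 1)   # 5 x 5 matrix, 4 free values per row
    emission = n_states * (n_symbols - 1)    # 5 x 3 matrix, 2 free values per row
    return transition, emission


a, b = free_parameters(5, 3)
print(a, b, a + b)  # 20 10 30
```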
hmm, - , . , 15-20 , .
![](https://habrastorage.org/getpro/habr/upload_files/520/888/348/520888348421bc77e1d47831670ccee4.png)
Accord C#
using Accord.MachineLearning;
using Accord.Statistics.Models.Markov;
using Accord.Statistics.Models.Markov.Learning;
using Accord.Statistics.Models.Markov.Topology;
using Comtrade.FMS.Common;
, ( ) . -. , run- )) . 2010 .
I will give the one line of code in which the learning method is encapsulated.
var teacher = new BaumWelchLearning(hmm);
You will understand the details of the Baum-Welch method by reading the relevant literature and getting your head around statistical processes.
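Once a model has been trained, one common way to flag a suspicious sequence is to compute its log-likelihood under the model and compare it with a threshold. The sketch below is a plain Python version of the scaled forward algorithm. It is not the Accord API and not necessarily what the project did. The two-state model and all of its probabilities are made-up illustrative values, apart from the emission row [0.96 0.04 0.0] mentioned earlier, and the symbols are encoded as l=0, m=1, h=2.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model)."""
    n = len(start)
    # Initialisation: probability of starting in state i and emitting obs[0]
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: move one step through the hidden states, then emit obs[t]
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


# Hypothetical 2-state model over the symbols l=0, m=1, h=2
start = [0.5, 0.5]
trans = [[0.9, 0.1],
         [0.3, 0.7]]
emit = [[0.96, 0.04, 0.0],
        [0.2, 0.5, 0.3]]

usual = [0, 0, 0, 0]      # l l l l
unusual = [2, 2, 2, 2]    # h h h h
print(log_likelihood(usual, start, trans, emit) > log_likelihood(unusual, start, trans, emit))  # True
```

A sequence whose log-likelihood falls well below that of typical history for the same client would be a candidate for a warning. Where to put the threshold is a separate tuning question.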
I wish you success and a good career in banking IT structures!