Hello! We - scientists at ITMO's Machine Learning laboratory and the Core ML team on VKontakte - are doing joint research. One of the important tasks of VK is the automatic classification of posts: it is necessary not only to generate thematic feeds, but also to identify unwanted content. Assessors are involved for such processing of records. At the same time, the cost of their work can be significantly reduced using such a machine learning paradigm as active learning.
It is about its application for the classification of multimodal data that will be discussed in this article. We will tell you about the general principles and methods of active learning, the peculiarities of their application to the task, as well as the insights obtained during the research.
Introduction
โ machine learning, . , , , .
, (, Amazon Mechanical Turk, .) . โ reCAPTCHA, , , , โ Google Street View. โ .
Amazon DALC (Deep Active Learning from targeted Crowds). , . Monte Carlo Dropout ( ). โ noisy annotation. , ยซ , ยป, .
Amazon . : / . , , . , : , . .
โ ! , . pool-based sampling.
. 1. pool-based
. , , ( ). : , .
, โ . (. โ query). , . ( , ) .
, , .
, โ . ( ). โ250 . . () 50 โ โ :
- , (. embedding), ;
- .
, (. . 2).
. 2 โ
ML โ . , .
. , . , , , . , , early stopping. , .
. residual , highway , (. encoder). , (. fusion): , .
โ , . -.
, โ , . , .
. , (. 3):
. 3.
. , . , , . , ( + ) โ .
, . 3, :
. 4.
, , . , รณ , , .
, : ? :
- ;
- ;
- .
. : maximum likelihood , - . :
โ ( -), โ , .
Pool-based sampling
โ , . pool-based sampling :
- - .
- .
- , , .
- .
- ( ).
- 3โ5 (, ).
, 3โ6 โ .
, , :
, . , : . , , , . . , 2โ000.
. , . ( ). , , . , . 20 .
. , . โ , . 100 200.
, , , .
โ1: batch size
baseline , ( ) (. 5).
. 5. baseline- .
random state. .
. ยซยป , , .
, (. batch size). 512 โ - (50). , batch size . . :
- upsample, ;
- , .
batch size: (1).
โ batch size, โ .
โโ (. 6).
. 6. batch size (passive ) (passive + flexible )
: c . , , batch size . .
.
Uncertainty
โ uncertainty sampling. , , .
:
1. (. Least confident sampling)
, :
โ , โ , โ , โ , .
. , . , . .
. , : {0,5; 0,49; 0,01}, โ {0,49; 0,255; 0,255}. , (0,49) , (0,5). , รณ : . , .
2. (. Margin sampling)
, , , :
โ , โ .
, . , . , , MNIST ( ) โ , . .
3. (. Entropy sampling)
:
โ - .
, , . :
- , , ;
- , .
(. 7).
. 7. uncertainty sampling ( โ , โ , โ )
, least confident entropy sampling , . margin sampling .
, , : MNIST. , , entropy sampling , . , .
. , โ , โ . , .
BALD
, , โ BALD sampling (Bayesian Active Learning by Disagreement). .
, query-by-committee (QBC). โ . uncertainty sampling. , . QBC Monte Carlo Dropout, .
, , โ . dropout . dropout , ( ). , dropout- (. 8). Monte Carlo Dropout (MC Dropout) . , . ( dropout) Mutual Information (MI). MI , , โ , . .
. 8. MC Dropout BALD
, QBC MC Dropout uncertainty sampling. , (. 9).
. 9. uncertainty sampling ( QBC ) ( โ , โ , โ )
BALD. , Mutual Information :
โ , โ .
(5) , โ . , , . BALD . 10.
. 10. BALD
, , .
query-by-committee BALD , . , uncertainty sampling. , โ , โ , โ , โ , .
BALD tf.keras, . PyTorch, dropout , batch normalization , .
โ2: batch normalization
batch normalization. batch normalization โ , . , , , , batch normalization. , . , . BALD. (. 11).
. 11. batch normalization BALD
, , .
batch normalization, . , .
Learning loss
. , . , .
, . โ . , . learning loss, . , (. 12).
. 12. Learning loss
learning loss . .
. , . ยซยป learning loss: , , . ideal learning loss (. 13).
. 13. ideal learning loss
, learning loss.
, . , , - , . :
- (2000 ), ;
- 10000 ( );
- ;
- ;
- 100 ;
- , , 1;
- .
, , . , ( margin sampling).
1.
p-value | ||
---|---|---|
loss | -0,2518 | 0,0115 |
margin | 0,2461 | 0,0136 |
, margin sampling โ , , , . c .
: ?
, , (. 14).
. 14. ideal learning loss ideal learning loss
, MNIST :
2. MNIST
p-value | ||
---|---|---|
loss | 0,2140 | 0,0326 |
0,2040 | 0,0418 |
ideal learning loss , (. 15).
. 15. MNIST ideal learning loss. โ ideal learning loss, โ
, , , , . .
learning loss , uncertainty sampling: , โ , โ . , , . , .
, . . , margin sampling โ . 16.
. 16. ( ) , margin sampling
: ( โ margin sampling), โ , , . โ25 . . 25% โ .
, . , , .
, , . , :
- batch size;
- , , โ , batch normalization.