Good day, everyone! The annual international AI Contest, organized by Sberbank together with Russian and foreign partners as part of the Artificial Intelligence Journey conference, has just ended. This year's tasks were Digital Peter (recognition of the manuscripts of Peter I), NoFloodWithAI (floods on the Amur River), and AI 4 Humanities (ruGPT-3). About 1000 people from 43 countries took part in the competition this time.
Our team worked on the "Digital Peter: recognition of manuscripts of Peter I" task and won first place. I would like to tell you what we did while solving it, who's the boss here, and what tricks and techniques we used. There is a lot of information, and there will be plenty of jargon for readers who are new to the subject. This is not a tutorial and I will not go into great detail, but I will be happy to answer questions in the comments.
You can look at the dream team
Plan
Description of the task
Data format, available resources and limitations
500 , , , , .
1.
, ( OOF), . ( ), ( ), , +90, -90 . (Resnet34 ) . , .
2.
, CTCLoss Attention. CTCLoss , Attention . CTCLoss, , Attention . .
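For readers new to CTC, here is a minimal sketch of how the per-frame output of a CTCLoss-trained head is usually turned into text (greedy, or best-path, decoding). It is an illustration only, not our competition code: the function name `ctc_greedy_decode`, the toy alphabet, and the choice of index 0 for the blank symbol are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumption: the CTC blank symbol sits at index 0


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-time-step score lists, one score per class.
    alphabet: list mapping class index -> character (blank slot included).
    """
    # 1. Pick the most likely class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of repeated indices into a single index.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # 3. Drop blanks; they only separate genuine repeated characters.
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


if __name__ == "__main__":
    toy_alphabet = ["-", "a", "b"]  # "-" is a placeholder for blank
    toy_frames = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank (separates the two a's)
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.2, 0.7],    # b
        [0.0, 0.1, 0.9],    # b (repeat, collapsed)
        [0.6, 0.2, 0.2],    # blank
    ]
    print(ctc_greedy_decode(toy_frames, toy_alphabet))
```

The blank between the two `a` frames is what lets CTC output a doubled letter; without it the repeats would collapse into one character.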
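Bs is the batch size; (w, h, c) are the image width, height, and number of channels. Hidden size is the size of the LSTM hidden state. Dict Size is the size of the character dictionary. Dense is the Keras name for the layer that PyTorch calls Linear.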
3.
The augmentations used: ToGray, CLAHE, Rotate, CutOut.
CutOut . , HandWrittenBlots, , , . , ( ) . CutOut , HandWrittenBlots . Augmixations. .
P.S. CutOut , .
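As a reminder of what CutOut does, here is a minimal pure-Python sketch: it overwrites a few random rectangles of the image with a constant value. This is an illustration under assumptions, not the augmentation code from our pipeline; the function name `cutout`, its parameters, and the list-of-lists image format are hypothetical and chosen only to keep the example dependency-free.

```python
import random


def cutout(image, num_holes=2, hole_h=2, hole_w=3, fill=0, rng=None):
    """Return a copy of `image` with `num_holes` random rectangles filled.

    image: 2D list of pixel values (rows of equal length).
    Assumes hole_h <= image height and hole_w <= image width.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input stays untouched
    for _ in range(num_holes):
        # Pick the top-left corner so the hole fits inside the image.
        top = rng.randrange(0, height - hole_h + 1)
        left = rng.randrange(0, width - hole_w + 1)
        for y in range(top, top + hole_h):
            for x in range(left, left + hole_w):
                out[y][x] = fill
    return out


if __name__ == "__main__":
    white_line = [[255] * 16 for _ in range(8)]
    augmented = cutout(white_line, rng=random.Random(0))
    for row in augmented:
        print("".join("#" if px == 0 else "." for px in row))
```

Passing a seeded `random.Random` makes the augmentation reproducible, which helps when comparing runs.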
4. CharMasks
, , CTC Loss. , , , ( , ). ( Action Labeling ).
. , , . , . XVII-XVIII (, ). , , .
, , , , . . , . . (Multi Word Expression) ( ) .
, , .. , . - :
5. Spell correction using XLMRoberta
, ( , ). NLP. XLMRoberta XVII-XVIII .., I. :
1. OCR ( ) ( ) ( + softmax), 3 (//blank ..) ;
2. : 3-4 , - .. //blank, , . zero-shot learning, , . OCR ('': 'p', '': 'o', '': 'e', '': 'c', '': 'a', '': 'x', '': 'u', '': ‘k’);
3. OCR step by step (!), ;
4. : ( 0 12), 50% padding ( ), 10% . ( ). XLMRoberta outputhiddenstates - NER, ;
5. GPU , TPU Colab
6. Ensemble + Spell Correction Thresholds
, , , CTCLoss, . . , . N "" . , , . . , , , ., +- .
Other Backbones. (EfficientNet, [SE, ECA]ResNet[xt], Mobilenet ), Resnet34.
Augmentations. Albumentations (Brightness, Gamma, Blur ), , .
TTA (Test-Time Augmentations). , holdout , public test - . , holdout.
Classic Blending. , , , , , .
(). , ! :)
P.S. ( , public):
| CER: 2.531 | WER: 13.5 | ACC: 62.107 | TIME: 32s |
submission .
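For those who want to compute the same kind of numbers on their own holdout, here is a minimal sketch of CER, WER, and string accuracy. It assumes the usual definitions (total edit distance divided by total reference length, in percent, and the share of exactly matched lines); the organizers' scoring script may differ in details, and the names `levenshtein` and `evaluate` are made up for this example.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def evaluate(references, hypotheses):
    """Return CER, WER and exact-match accuracy, all in percent."""
    pairs = list(zip(references, hypotheses))
    char_errors = sum(levenshtein(r, h) for r, h in pairs)
    char_total = sum(len(r) for r, _ in pairs)
    word_errors = sum(levenshtein(r.split(), h.split()) for r, h in pairs)
    word_total = sum(len(r.split()) for r, _ in pairs)
    exact = sum(r == h for r, h in pairs)
    return {
        "CER": 100.0 * char_errors / char_total,
        "WER": 100.0 * word_errors / word_total,
        "ACC": 100.0 * exact / len(pairs),
    }


if __name__ == "__main__":
    refs = ["abc def", "ghi"]
    hyps = ["abd def", "ghi"]
    print(evaluate(refs, hyps))
```

In the toy example one character out of ten is wrong, one word out of three is wrong, and one line out of two matches exactly.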
P.P.S.
, ? :)