A Recipe for Training Neural Networks

A translation of the article A Recipe for Training Neural Networks, published with the permission of the author (Andrej Karpathy), with some additional links added.





A Ukrainian version is also available on my personal blog: Recipe for Training Neural Networks.





A few weeks ago I posted a tweet on «the most common neural net mistakes», listing a few common gotchas related to training neural networks. The tweet got quite a bit more engagement than I anticipated (including a webinar :)). Clearly, a lot of people have personally encountered the large gap between «here is how a convolutional layer works» and «our convnet achieves state of the art results».





So I thought it could be fun to brush off my dusty blog and expand my tweet to the long form that this topic deserves. However, instead of going into an enumeration of more common errors, I wanted to dig a bit deeper and talk about how one can avoid making these errors altogether (or fix them very fast). The trick to doing so is to follow a certain process, which, as far as I can tell, is not very often documented. Let's start with two important observations that motivate it.





1) Neural network training is a leaky abstraction

It is allegedly easy to get started with training neural nets. Numerous libraries and frameworks take pride in displaying 30-line miracle snippets that solve your data problems, giving the (false) impression that this stuff is plug and play. It's common to see things like:





>>> your_data = #  plug your awesome dataset here
>>> model = SuperCrossValidator(SuperDuper.fit, your_data, ResNet50, SGDOptimizer)
#  conquer the world here



These libraries and examples activate the part of our brain that is familiar with standard software, where clean APIs and abstractions are often attainable. The requests library demonstrates this:





>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200



Cool! A courageous developer has taken on the burden of understanding query strings, URLs, GET/POST requests, HTTP connections and so on, and largely hidden the complexity behind a few lines of code. This is what we are familiar with and expect. Unfortunately, neural nets are nothing like that. They are not «off-the-shelf» technology the moment you deviate even slightly from training an ImageNet classifier. I tried to make this point in my post "Yes you should understand backprop" by picking on backpropagation and calling it a «leaky abstraction», but the situation is, unfortunately, much more dire. Backprop + SGD does not magically make your network work. Batch normalization does not magically make it converge faster. Recurrent networks don't magically let you «plug in» text. And just because you can formulate your problem as reinforcement learning doesn't mean you should. If you insist on using the technology without understanding how it works, you are likely to fail. Which brings me to…





2) Neural network training fails silently

When you break or misconfigure code, you will often get some kind of exception. You plugged in an integer where something expected a string. The function expected only 3 arguments. This import failed. That key does not exist. The number of elements in the two lists isn't equal. Moreover, it's often possible to write unit tests for a given piece of functionality.





This is just the beginning when it comes to training neural nets. Everything may be syntactically correct, yet the whole thing isn't arranged properly, and it's really hard to tell. The «possible error surface» is large, logical (as opposed to syntactic), and very tricky to unit test. For example, perhaps you forgot to flip your labels when you flipped the images left-right during data augmentation. Your net can still (shockingly) work pretty well, because it can internally learn to detect flipped images and then flip its predictions. Or maybe your autoregressive model accidentally takes the thing it's trying to predict as an input due to an off-by-one bug. Or you tried to clip your gradients but instead clipped the loss, causing the outlier examples to be ignored during training. Or you initialized your weights from a pretrained checkpoint but didn't use the original mean. Or you just messed up the settings for regularization strength, learning rate, its decay schedule, model size, etc. Therefore, your misconfigured neural net will throw exceptions only if you're lucky; most of the time it will train, but silently work a bit worse.





As a result (and this is really hard to overemphasize), a «fast and furious» approach to training neural networks does not work and only leads to suffering. Suffering is a perfectly natural part of getting a neural network to work well, but it can be mitigated by being thorough, defensive, paranoid, and obsessed with visualizing basically everything imaginable. In my experience, the qualities that correlate most strongly with success in deep learning are patience and attention to detail.





In light of the above two facts, I have developed a specific process for myself that I follow when applying a neural net to a new problem, and which I will try to describe. You will see that it takes the two principles above very seriously. In particular, it builds from simple to complex, and at every step we make concrete hypotheses about what will happen and then either validate them with an experiment or investigate until we find the issue. What we try very hard to prevent is the introduction of a lot of «unverified» complexity at once, which is bound to introduce bugs or misconfigurations that will take forever to find (if ever). If writing your neural net code were like training one, you'd want to use a very small learning rate and guess, and then evaluate the full test set after every iteration.





1. Become one with the data

The first step in training a neural net is not to touch any neural net code at all, but to begin by thoroughly inspecting your data. This step is critical. I like to spend copious amounts of time (measured in hours) scanning through thousands of examples, understanding their distribution and looking for patterns. Luckily, your brain is pretty good at this. One time I discovered that the data contained duplicate examples. Another time I found corrupted images and labels. I look for imbalances and biases in the data. I typically also pay attention to my own process of classifying the data, which hints at the kinds of architectures we will eventually explore. For example - are very local features enough, or do we need global context? How much variation is there, and what form does it take? What variation is spurious and could be preprocessed out? Does spatial position matter, or do we want to average-pool it out? How much does detail matter, and how far could we afford to downsample the images? How noisy are the labels?





In addition, since the neural net is effectively a compressed/compiled version of your dataset, you will be able to look at your network's (mis)predictions and understand where they might be coming from. And if your network gives you a prediction that doesn't seem consistent with what you've seen in the data, something is off.





Once you get a qualitative sense of the data, it is also a good idea to write some simple code to search/filter/sort by whatever you can think of (e.g. type of label, size of annotations, number of annotations, etc.), and to visualize their distributions and the outliers along any axis. The outliers especially almost always uncover bugs in data quality or preprocessing.
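As a rough illustration, here is a minimal Python sketch of the kind of throwaway exploration code meant here (the labels and annotation sizes below are placeholder data, not anything from the article):

from collections import Counter
import matplotlib.pyplot as plt

labels = ["cat", "dog", "dog", "cat", "truck"]        # placeholder labels
annotation_sizes = [12.0, 40.5, 39.0, 11.2, 980.0]    # placeholder sizes

# Label distribution: class imbalance shows up immediately.
print(Counter(labels))

# Sort by any scalar attribute and eyeball both tails; the outliers
# (like the suspicious 980.0 here) often expose data bugs.
print(sorted(annotation_sizes)[:3], sorted(annotation_sizes)[-3:])

# Histogram of the distribution along this axis.
plt.hist(annotation_sizes, bins=50)
plt.xlabel("annotation size")
plt.show()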





2. Set up the end-to-end training/evaluation skeleton + get dumb baselines

Now that we understand our data, can we reach for our super fancy multi-scale ASPP FPN ResNet and begin training awesome models? Certainly not. That is the road to suffering. Our next step is to set up a full training + evaluation skeleton and gain trust in its correctness via a series of experiments. At this stage it is best to pick some simple model that you couldn't possibly have screwed up somehow - e.g. a linear classifier, or a very tiny ConvNet. We will want to train it, visualize the losses and any other metrics (e.g. accuracy), visualize the model's predictions, and perform a series of ablation experiments with explicit hypotheses along the way.





Tips and tricks for this stage:





  • Fix a random seed. Always use a fixed random seed to guarantee that running the code twice gives the same result. This removes one factor of variation and will help keep you sane.





  • Simplify. Make sure to disable any unnecessary fanciness. As an example, at this stage definitely turn off any data augmentation. Data augmentation is a regularization strategy that we may incorporate later, but for now it is just another opportunity to introduce some dumb bug.





  • Add significant digits to your evaluation. When plotting the test loss, run the evaluation over the entire (large) test set. Do not just plot test losses over batches and then rely on smoothing them in Tensorboard. We are in pursuit of correctness and are very willing to give up time for staying sane.





  • Verify the loss at init. Verify that your loss starts at the correct value. For example, if you initialize your final layer correctly, you should measure -log(1 / n_classes) on a softmax at initialization (a sketch of this check follows the list). The same default values can be derived for L2 regression, Huber losses, etc.





  • Initialize well. Initialize the final layer weights correctly. For example, if you are regressing values that have a mean of 50, then initialize the final bias to 50. If you have an imbalanced dataset with a 1:10 ratio of positives to negatives, set the bias on your logits so that the network predicts a probability of 0.1 at initialization. Setting these correctly will speed up convergence and eliminate «hockey stick» loss curves, where in the first few iterations your network is basically just learning the bias.





  • Human baseline. Monitor metrics other than the loss that are human-interpretable and checkable (e.g. accuracy). Whenever possible, evaluate your own (human) accuracy and compare to it. Alternatively, annotate the test data twice and, for each example, treat one annotation as the prediction and the second as the ground truth.





  • Input-independent baseline. Train an input-independent baseline (e.g. the easiest is to just set all your inputs to zero). This should perform worse than when you actually plug in your data without zeroing it out. Does it? That is, does your model learn to extract any information from the input at all?





  • Overfit one batch. Overfit a single batch of only a few examples (e.g. as few as two). To do so, we increase the capacity of our model (e.g. add layers or filters) and verify that we can reach the lowest achievable loss (e.g. zero). I also like to visualize the label and the prediction on the same plot and make sure that they end up aligning perfectly once we reach the minimum loss. If they do not, there is a bug somewhere and we cannot continue to the next stage (a minimal loop is sketched after this list).





  • Verify decreasing training loss. At this stage you will hopefully be underfitting your dataset, because you are working with a toy model. Try to increase its capacity just a bit. Did your training loss go down as it should?





  • Visualize just before the net. The unambiguously correct place to visualize your data is immediately before your y_hat = model(x) (or sess.run in Tensorflow). That is, you want to visualize exactly what goes into your network, decoding that raw tensor of data and labels into a visualization. This is the only «source of truth». I can't count the number of times this has saved me and revealed problems in data preprocessing and augmentation.





  • Visualize prediction dynamics. I like to visualize model predictions on a fixed test batch during the course of training. The «dynamics» of how these predictions move will give you incredibly good intuition about how training is progressing. Many times it is possible to feel the network «struggle» to fit your data if it wiggles too much in some way, revealing instabilities. Very low or very high learning rates are also easy to notice in the amount of jitter.





  • Use backprop to chart dependencies. Your deep learning code will often contain complicated, vectorized, and broadcasted operations. A relatively common bug I've come across a few times is that people get this wrong (e.g. they use view instead of transpose / permute somewhere) and inadvertently mix information across the batch dimension. The depressing thing is that your network will typically still train, because it will learn to ignore the data from the other examples. One way to debug this (and other related problems) is to set the loss to something trivial, like the sum of all outputs of example i, run the backward pass all the way to the input, and make sure that you get a non-zero gradient only on the i-th input. The same strategy can be used, for example, to make sure that your autoregressive model at time t depends only on 1..t-1. More generally, gradients give you information about what depends on what in your network, which can be useful for debugging (see the sketch after this list).





  • Generalize a special case. This is a slightly more general coding tip, but I've often seen people create bugs when they bite off more than they can chew, writing relatively general functionality from scratch. I like to write a very specific function for what I'm doing right now, get it working, and only then generalize it, making sure that I get the same result. This often applies to vectorizing code: I almost always write out the fully loopy version first and only then transform it into vectorized code, one loop at a time.
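To make a couple of these checks concrete, here is a minimal sketch of the «verify the loss at init» check (PyTorch is assumed as the framework throughout these sketches; the article itself is framework-agnostic). With a near-uniformly initialized final layer, a softmax classifier should start at a cross-entropy close to -log(1 / n_classes):

import math
import torch
import torch.nn as nn

n_classes = 10
model = nn.Linear(128, n_classes)      # stand-in for your real model's final layer
nn.init.zeros_(model.weight)           # zero weights and bias => uniform predictions
nn.init.zeros_(model.bias)

x = torch.randn(32, 128)
y = torch.randint(0, n_classes, (32,))
loss = nn.functional.cross_entropy(model(x), y)

print(loss.item())                     # measured loss at init
print(-math.log(1.0 / n_classes))      # expected: ~2.3026 for 10 classes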
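And a minimal «overfit one batch» loop under the same PyTorch assumption; the tiny model and the two random examples are placeholders for your own pipeline:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

x = torch.randn(2, 16)                 # a single frozen batch of two examples
y = torch.tensor([0, 1])

for step in range(2000):
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(loss.item())                     # should be ~0; if not, there is a bug somewhere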
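Finally, a sketch of the backprop dependency check: take the sum of the outputs of example i as a trivial loss, backprop all the way to the input, and verify that only the i-th input receives gradient (again assuming PyTorch; the model is a placeholder):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

x = torch.randn(8, 16, requires_grad=True)
i = 3
model(x)[i].sum().backward()           # trivial loss: sum of outputs of example i

grad_per_example = x.grad.abs().sum(dim=1)
print(grad_per_example)                # should be non-zero only at index i
assert grad_per_example[i] > 0
assert (grad_per_example[torch.arange(8) != i] == 0).all()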





3. Overfit

At this stage we should have a good understanding of the dataset, and we have the full training + evaluation pipeline working. For any given model we can (reproducibly) compute a metric that we trust. We are also armed with the performance of an input-independent baseline and of a few dumb baselines (we had better beat these), and we have a rough sense of human performance (we hope to reach it). The stage is now set for iterating on a good model.





The approach I like to take to finding a good model has two stages: first get a model large enough that it can overfit (i.e. focus on the training loss), and then regularize it appropriately (give up some training loss to improve the validation loss). The reason I like these two stages is that if we are not able to reach a low error rate with any model at all, this may again indicate some issues, bugs, or misconfiguration.





A few tips and tricks for this stage:





  • Picking the model. To reach a good training loss, you'll want to choose an appropriate architecture for the data. When it comes to choosing it, my number one piece of advice is: don't be a hero. I've seen a lot of people eager to get crazy and creative, stacking up the Lego blocks of the neural net toolbox into various exotic architectures that make sense to them. Resist this temptation strongly in the early stages of your project. I always advise people to simply find the most related paper and copy-paste its simplest architecture that achieves good performance. For example, if you are classifying images, don't be a hero: just copy-paste a ResNet-50 for your first run. You'll be allowed to do something more custom later, and to beat this baseline.





  • Adam is safe. In the early stages of setting baselines I like to use Adam with a learning rate of 3e-4. In my experience, Adam is much more forgiving of hyperparameters, including a bad learning rate. For ConvNets, a well-tuned SGD will almost always slightly outperform Adam, but the optimal learning rate region is much narrower and more problem-specific. (Note: if you are using RNNs and related sequence models, it is more common to use Adam. Again, at the initial stage of your project, don't be a hero and follow whatever the most related papers do.)





  • Complexify one thing at a time. If you have multiple signals to plug into your classifier, I would advise plugging them in one by one, each time making sure you get the performance boost you expect. Don't throw the kitchen sink at your model from the start. There are other ways of building up complexity as well - e.g. you can try plugging in smaller images first and making them bigger later, etc.





  • Do not trust learning rate decay defaults. If you are repurposing code from some other domain, always be very careful with learning rate decay. Not only would you want to use different decay schedules for different problems, but, even worse, in a typical implementation the schedule is based on the current epoch number, which can vary widely depending simply on the size of your dataset. For example, ImageNet schedules decay the rate by 10 at epoch 30. If you're not training ImageNet, you almost certainly do not want this. If you're not careful, your code could secretly be driving your learning rate to zero too early, not allowing your model to converge. In my own work I always disable learning rate decay entirely (I use a constant learning rate) and tune it at the very end (the pitfall is sketched just below).
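To illustrate the pitfall (a sketch assuming PyTorch's StepLR scheduler; the article does not prescribe a specific library): a schedule copied from an ImageNet recipe decays the learning rate by 10x every 30 epochs, which on your own dataset may drive it toward zero far too early:

import torch

params = [torch.zeros(1, requires_grad=True)]
opt = torch.optim.SGD(params, lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=30, gamma=0.1)

for epoch in range(90):
    opt.step()                                        # (training step would go here)
    sched.step()
    if epoch % 30 == 29:
        print(epoch + 1, opt.param_groups[0]["lr"])   # 0.01, 0.001, 0.0001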





4. Regularize

Ideally, we are now at a point where we have a large model that fits at least the training set. Now it is time to regularize it and gain some validation accuracy by giving up some of the training accuracy. Some tips and tricks:





  • Get more data. First, the by far best and preferred way to regularize a model in any practical setting is to add more real training data. It is a very common mistake to spend a lot of engineering cycles trying to squeeze juice out of a small dataset when you could instead be collecting more data. As far as I'm aware, adding more data is pretty much the only guaranteed way to monotonically improve the performance of a well-configured neural network almost indefinitely. The other would be ensembles (if you can afford them), but they top out after ~5 models.





  • Augment your data. The next best thing to real data is half-fake data: try more aggressive data augmentation.





  • Creative augmentation. If half-fake data doesn't do it, fake data may also help. People are finding creative ways of expanding datasets: domain randomization, the use of simulation, clever hybrids such as inserting (possibly simulated) objects into scenes, or even GANs.





  • Pretrain. It rarely hurts to use a pretrained network if you can, even if you have enough data.





  • Stick with supervised learning. Do not get over-excited about unsupervised pretraining. Unlike what that blog post from 2008 tells you, as far as I know no version of it has reported strong results in modern computer vision (though NLP seems to be doing pretty well with BERT and friends these days, quite likely owing to the more deliberate nature of text and the higher signal-to-noise ratio).





  • Smaller input dimensionality. Remove features that may contain spurious signal. Any added spurious input is just another opportunity to overfit if your dataset is small. Similarly, if low-level details don't matter much, try feeding in smaller images.





  • Smaller model size. In many cases you can use domain knowledge about the problem to constrain the network and decrease its size. As an example, it used to be trendy to use fully connected layers at the top of backbones for ImageNet, but these have since been replaced with simple average pooling, eliminating a ton of parameters in the process.





  • Decrease the batch size. Due to the normalization inside batch norm, smaller batch sizes somewhat correspond to stronger regularization. This is because the batch empirical mean/std are more approximate versions of the full mean/std, so the scale and offset «wiggle» your batch around more.





  • Drop. Add dropout. Use dropout2d (spatial dropout) for ConvNets. Use this sparingly and carefully, because dropout does not seem to play nicely with batch normalization.





  • Weight decay. Increase the weight decay penalty.





  • Early stopping. Stop training based on the measured validation loss to catch your model just as it's about to overfit (a minimal loop is sketched after this list).





  • Try a larger model. I mention this last, and only after early stopping, because I've found a few times in the past that, while larger models will of course eventually overfit much more, their «early stopped» performance can often be much better than that of smaller models.
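A minimal early-stopping loop, as mentioned above (PyTorch assumed; the tiny model and random data are placeholders for your own training and validation code):

import copy
import torch
import torch.nn as nn

model = nn.Linear(16, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_tr, y_tr = torch.randn(256, 16), torch.randint(0, 2, (256,))
x_va, y_va = torch.randn(64, 16), torch.randint(0, 2, (64,))

best_val, best_state, patience, bad_epochs = float("inf"), None, 5, 0
for epoch in range(200):
    loss = nn.functional.cross_entropy(model(x_tr), y_tr)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        val = nn.functional.cross_entropy(model(x_va), y_va).item()
    if val < best_val:                                 # improved: remember this model
        best_val, bad_epochs = val, 0
        best_state = copy.deepcopy(model.state_dict())
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                     # validation stopped improving
            break

model.load_state_dict(best_state)                      # roll back to the best checkpoint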





Finally, to gain additional confidence that your network is a reasonable classifier, I like to visualize the network's first-layer weights and make sure you get nice edges that make sense. If your first-layer filters look like noise, then something could be off. Similarly, activations inside the net can sometimes display odd artifacts and hint at problems.
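One possible way to do this inspection (a sketch assuming PyTorch and matplotlib; on a real trained network you would use your own first conv layer instead of the random placeholder below):

import torch.nn as nn
import matplotlib.pyplot as plt

conv1 = nn.Conv2d(3, 16, kernel_size=7)        # placeholder first layer; use your trained one
w = conv1.weight.detach()                      # shape (16, 3, 7, 7)
w = (w - w.min()) / (w.max() - w.min())        # rescale to [0, 1] for display

fig, axes = plt.subplots(4, 4)
for ax, filt in zip(axes.flat, w):
    ax.imshow(filt.permute(1, 2, 0).numpy())   # CHW -> HWC
    ax.axis("off")
plt.show()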





5. Tune

"" , , . :





  • Random over grid search. For tuning multiple hyperparameters simultaneously it may sound tempting to use grid search to ensure coverage of all settings, but keep in mind that it is best to use random search instead. Intuitively, this is because neural nets are often much more sensitive to some parameters than to others. In the limit, if parameter a matters but changing b has no effect, then you'd rather sample a more thoroughly than at a few fixed points several times (see the sketch after this list).





  • Hyperparameter optimization. There is a large number of fancy Bayesian hyperparameter optimization toolboxes out there, and a few of my friends have reported success with them, but my personal experience is that the state-of-the-art approach to exploring a nice, wide space of models and hyperparameters is to use an intern :). Just kidding.
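A minimal random-search sketch; the log-uniform ranges below are illustrative assumptions, not values the article recommends:

import random

def sample_config():
    return {
        "lr": 10 ** random.uniform(-5, -2),             # log-uniform in [1e-5, 1e-2]
        "weight_decay": 10 ** random.uniform(-6, -3),
        "dropout": random.uniform(0.0, 0.5),
    }

for _ in range(5):
    print(sample_config())                              # feed each config into your trainer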





6. Squeeze out the juice

Once you've found the best types of architectures and hyperparameters, you can still use a few more tricks to squeeze the last drops of juice out of the system:





  • Ensembles. Model ensembles are a pretty much guaranteed way to gain 2% of accuracy on anything. If you can't afford the computation at test time, look into distilling your ensemble into a single network using dark knowledge (a minimal averaging sketch follows this list).





  • Leave it training. I've often seen people tempted to stop training as soon as the validation loss seems to be leveling off. In my experience, networks keep training for an unintuitively long time. One time I accidentally left a model training over the winter break, and when I got back in January it was SOTA (state of the art).
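A minimal sketch of prediction averaging for an ensemble (PyTorch assumed; the untrained linear models below stand in for your independently trained models):

import torch
import torch.nn as nn

models = [nn.Linear(16, 4) for _ in range(3)]     # stand-ins for trained models

x = torch.randn(8, 16)
with torch.no_grad():
    probs = torch.stack([m(x).softmax(dim=-1) for m in models]).mean(dim=0)
print(probs.argmax(dim=-1))                       # ensemble prediction per example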





Once you get here, you will have all the ingredients for success: you deeply understand the technology, the dataset, and the problem; you've built the entire training and evaluation infrastructure and achieved high confidence in its accuracy; and you've explored increasingly complex models, gaining performance improvements in ways you predicted at every step. You are now ready to read a lot of papers, try a large number of experiments, and get your SOTA results. Good luck!







