Deep Learning: How Does It Work? Part 1

In this article, you will learn:

- What the essence of deep learning is
- What activation functions are for
- What an FCNN is
- What tasks an FCNN can solve
- What the disadvantages of FCNN are and how to deal with them




A brief introduction



This is the first in a series of articles about deep learning: what tasks exist in DL, what networks and architectures there are, how they work, how particular tasks are solved, and why one approach is better than another.



What preliminary skills are needed to understand everything? It's hard to say, but if you know how to google and ask the right questions, then I'm sure this series of articles will help you understand a lot.



What's the point of deep learning?



The idea is to build an algorithm that takes X as input and predicts Y. If we write Euclid's algorithm for finding the GCD, we just write loops, conditions, and assignments, and that's it: we know how to build such an algorithm. But how do we build an algorithm that takes an image as input and says "dog" or "cat"? Or nothing at all? Or an algorithm that takes a text as input and tells us its genre? Try writing that by hand with loops and conditions. This is where neural networks, deep learning, and all those buzzwords come to the rescue.
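For contrast, here is the kind of algorithm we do know how to write by hand (a minimal sketch for illustration):

```python
# Euclid's algorithm for the GCD: a few lines of loops and assignments.
# For mappings like "image -> cat or dog" no such explicit recipe exists,
# which is exactly the gap deep learning fills.
def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(252, 105))  # 21
```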



More formally and a little about activation functions



Formally speaking, we want to build a composition of functions, a function of a function of a function ... of the input X and the weights W of our network, that produces some result. It is important to note that we cannot simply stack many linear functions, because a composition of linear functions is itself a linear function. In that case, any deep network would be equivalent to a network with just two layers (input and output). Why do we need nonlinearity at all? Because the values we want to learn to predict may depend nonlinearly on the input data. Nonlinearity is achieved by applying an activation function on each layer.
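Here is a minimal NumPy sketch of this point (my own illustration, with made-up random weights):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)           # input X
W1 = rng.normal(size=(4, 3))     # weights of layer 1
W2 = rng.normal(size=(2, 4))     # weights of layer 2

# Two linear layers collapse into one: W2 @ (W1 @ x) == (W2 @ W1) @ x,
# so depth buys us nothing without nonlinearity.
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# An activation function between the layers breaks the collapse:
relu = lambda z: np.maximum(z, 0)   # a common activation function
y = W2 @ relu(W1 @ x)               # genuinely nonlinear in x
print(y)
```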



Fully-connected neural networks (FCNN)



Just a fully connected neural network. It looks something like this:



[Image: diagram of a fully connected neural network]



The key point is that every neuron of one layer is connected to every neuron of the next layer and of the previous one (if those layers exist).



The first layer is the input layer. For example, if we want to feed a 256x256x3 image into such a network, we need exactly 256x256x3 neurons in the input layer (each neuron receives one component, R, G, or B, of one pixel). If we want to feed in a person's height, weight, and 23 more features, we need 25 neurons in the input layer. The number of neurons in the output layer is the number of values we want to predict; it can be 1 or all 100. In general, by looking at the output layer of a network, you can almost always tell what problem it solves.
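As a sketch of how these sizes translate into code (my example, using PyTorch's nn.Linear for brevity; the 25-feature setup is the one from the paragraph above, and the hidden-layer size of 64 is an arbitrary choice):

```python
import torch
import torch.nn as nn

# 25 input features (height, weight, and 23 more) -> 1 predicted value.
model = nn.Sequential(
    nn.Linear(25, 64),   # input layer -> hidden layer, fully connected
    nn.ReLU(),           # activation between layers
    nn.Linear(64, 1),    # hidden layer -> single output neuron
)

x = torch.randn(8, 25)   # a batch of 8 people, 25 features each
print(model(x).shape)    # torch.Size([8, 1])
```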



Each connection between neurons carries a weight, and these weights are trained by the backpropagation algorithm, which I wrote about here.



What tasks can FCNN solve?



- Regression. For example, predicting the value of a store from input criteria such as country, city, street, traffic, and so on.



- Classification. The classic example is MNIST digit classification (a minimal sketch follows this list).



- As for segmentation and object detection with an FCNN, I won't venture to say. Maybe someone will share in the comments :)
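For concreteness, here is a minimal FCNN sketch for the MNIST classification case (just an illustration in PyTorch, with the training loop omitted; the hidden size of 128 is an arbitrary choice):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),               # 28x28 image -> vector of 784 values
    nn.Linear(28 * 28, 128),    # fully connected hidden layer
    nn.ReLU(),                  # nonlinearity between layers
    nn.Linear(128, 10),         # one output neuron per digit class
)

x = torch.randn(16, 1, 28, 28)  # a dummy batch of 16 grayscale images
logits = model(x)
print(logits.shape)             # torch.Size([16, 10])
```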



Disadvantages of FCNN



  1. Neurons within one layer do not share information (every weight in the network is unique).
  2. A huge number of trainable parameters (weights) if we want to train the network on photographs (see the back-of-the-envelope count below).
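To see how bad point 2 gets, here is a back-of-the-envelope count (my numbers: the 256x256x3 input from earlier and an assumed hidden layer of 1000 neurons):

```python
input_neurons = 256 * 256 * 3   # one neuron per color component: 196,608
hidden_neurons = 1000           # an assumed, fairly modest hidden layer
weights = input_neurons * hidden_neurons
print(f"{weights:,}")           # 196,608,000 weights in the first layer alone
```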


What can we do about these disadvantages? Convolutional neural networks (CNNs) handle both problems. That is what my next article will be about.



Conclusion



I don't see much point in dwelling on fully connected neural networks for too long. If anyone is interested in an actual implementation of such networks, you can see and read about my implementation here.


