Neural networks and trading. Part 2: DIY kit

In the previous article, I described how I managed to get a trend prediction from a neural network on the real market. The article aroused interest, but the question "Where is the evidence?" went unanswered. Indeed, neural networks in trading are discussed a lot: there are publications, and whole threads on professional forums are devoted to the topic. But no matter how deeply you immerse yourself in it, no matter how much you talk to experts, the impression remains that all this is some kind of elusive illusion. There is nothing that actually works, nothing that even remotely connects a neural network to a price movement forecast. Hence the community's firm opinion that price movement cannot be predicted in principle, and that all this talk is about nothing.



I propose to dispel these doubts once and for all and move the discussion from "can it predict or not" to "does it predict well or badly". We will do it in a simple, quick and visual way. I will provide a ready-made tool, and everyone can get the result on their own computer. The free Google Colaboratory project will help us with this. It is an open platform for collaborative development: all calculations run on Google servers, all interaction happens through the browser, and no registration is required.



The code for our work is open and already loaded in Google Colab. Training results will differ from person to person, because the starting weights are initialized randomly, so each run ends up slightly different. Also keep in mind that quote history is very noisy data, so the quality of training is low, but it is sufficient to see how the neural network predicts. The forecast should be approximately at the level of a good indicator.



The only place where we will take a shortcut is collecting the history of a trading pair. The collection is done by an application for MetaTrader5; the process is not complicated, but it requires experience with the strategy tester in MetaTrader5. Detailed instructions would take a separate article, so we will use pre-prepared data for the Euro / Dollar pair (for those who use MT5, the link to the Expert is at the end of the article). At the last stage, when we move on to testing on the real market, you will be able to make sure that the pre-prepared data does not "peek ahead" and does not give the neural network any hints.



Let's start.



Google Colaboratory



Our notebook in Google Colab can be found at this link. Don't forget to log into your Google (or Gmail) account first.



Copy the notebook to your own account.






Now run all the blocks in order, from top to bottom.



1. Installing libraries



This step installs TensorFlow and other libraries. The process finishes by itself; you do not need to do anything.






2. Loading and preparing data for training



At this stage the dataset is loaded and split into separate sets for training and testing. The dataset was collected for the EURUSD pair for the period from the beginning of 2015 to the present day; the data collection step is one M6 candlestick. The last 2 weeks form the test segment. The dataset consists of hundreds of thousands of lines, each of which looks something like this:



0.32,0.26,0.00,0.43 ... 0.66,0.25,0.24,0.05,0,1,1600144440,1.189240


Fields are separated by commas. The predictors come first; fields 3 and 4 from the end are the correct answer, showing where the trend went (0,1 means down; 1,0 means up). The second field from the end is the candlestick id, and the last is the price at the open of the candle. The last two fields are not used for training.
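The field layout described above can be sketched in a few lines of Python. This is only an illustration and not the notebook's own loader: the names `Row`, `parse_row` and `split_train_test` are hypothetical, the sample line is shortened (the elided middle part is dropped), and the two-week split assumes that the candlestick id is a Unix timestamp in seconds, which the article does not state.

```python
from typing import List, NamedTuple, Tuple

TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


class Row(NamedTuple):
    predictors: List[float]  # everything before the last four fields
    label: Tuple[int, int]   # (0, 1) = down, (1, 0) = up
    candle_id: int           # second field from the end
    open_price: float        # last field, price at the open of the candle


def parse_row(line: str) -> Row:
    """Split one comma-separated dataset line into its parts."""
    fields = line.strip().split(",")
    predictors = [float(x) for x in fields[:-4]]
    label = (int(fields[-4]), int(fields[-3]))
    return Row(predictors, label, int(fields[-2]), float(fields[-1]))


def split_train_test(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Put the last two weeks of candles into the test set.

    Assumption: candle_id is a Unix timestamp in seconds.
    """
    cutoff = max(r.candle_id for r in rows) - TWO_WEEKS_SECONDS
    train = [r for r in rows if r.candle_id <= cutoff]
    test = [r for r in rows if r.candle_id > cutoff]
    return train, test


# Shortened sample line: only eight predictors are kept here.
sample = "0.32,0.26,0.00,0.43,0.66,0.25,0.24,0.05,0,1,1600144440,1.189240"
row = parse_row(sample)
print(row.label, row.candle_id, row.open_price)
```

Only `predictors` and `label` would be fed to the network; `candle_id` and `open_price` are kept so that the answers can later be matched to the chart.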



3. Training and testing the model



Leave the default neural network settings on the first launch. Training runs in five passes until an acceptable result is obtained. If training succeeds, a table something like this will appear:



+------------------+---------+----------+-------------+-------------+
| Network response |   Won   |   Lose   |   Won (%)   | Signals (%) |
+------------------+---------+----------+-------------+-------------+
|        0         |   7174  |   7173   |      50     |    100.0    |
|        2         |   6956  |   6731   |      50     |    95.4     |
|        4         |   6430  |   6224   |      50     |    88.2     |
|        6         |   5867  |   5630   |      51     |    80.1     |
|        8         |   5250  |   5065   |      50     |    71.9     |
|        10        |   4636  |   4450   |      51     |    63.3     |
|        12        |   3964  |   3772   |      51     |    53.9     |
|        14        |   3330  |   3152   |      51     |    45.2     |
|        16        |   2758  |   2539   |      52     |    36.9     |
|        18        |   2198  |   2012   |      52     |    29.3     |
|        20        |   1700  |   1544   |      52     |    22.6     |
|        22        |   1298  |   1167   |      52     |    17.2     |
|        24        |   958   |   825    |      53     |    12.4     |
|        26        |   699   |   517    |      57     |     8.5     |
|        28        |   446   |   278    |      61     |     5.0     |
|        30        |   246   |   127    |      65     |     2.6     |
+------------------+---------+----------+-------------+-------------+


The neural network's response is a binary classification, where [0 1] is "down" and [1 0] is "up". But the network never responds with exact integer values; depending on its degree of "confidence", its response can look like [0.4 0.6]. With this response the network believes that the price will go down but is not very confident, while the response [0.1 0.9] also means down, but with much more confidence. This is what an array of real responses looks like:



[[0.5084921  0.49150783]
 [0.3930727  0.6069273 ]
 [0.4930727  0.50692725]
 ...
 [0.5189831  0.48101687]
 [0.27955987 0.7204401 ]
 [0.476914   0.5230861 ]]
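Reading such a response can be sketched as a small helper. This is a minimal illustration, not code from the notebook: the name `decode_response` is hypothetical, and treating an exact tie as "none" is an assumption made here.

```python
from typing import Sequence


def decode_response(response: Sequence[float]) -> str:
    """Map a two-element network output to a direction.

    [1 0] is "up" and [0 1] is "down", so the larger element decides.
    An exact tie is reported as "none" (an assumption of this sketch).
    """
    up_score, down_score = response
    if up_score > down_score:
        return "up"
    if down_score > up_score:
        return "down"
    return "none"


for r in ([0.4, 0.6], [0.1, 0.9], [0.5084921, 0.49150783]):
    print(r, "->", decode_response(r))
```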


The "Network response" field of the table is the absolute difference between the two values of this binary response, multiplied by 100. This difference characterizes the network's "confidence" in its forecast. After multiplying by 100, the values fall in the range from 0 to 100. Now, instead of taking all the answers, you can select only those in which the network has significant "confidence". To understand how much this value affects the forecast, the test segment is checked for forecast correctness at different levels of "confidence". Each row of the table is a check at a new, higher value of the Network response. The higher the "Network response" filter, the fewer responses remain, but the better they are. This can be seen in the "Won" and "Lose" fields. The process stops when the remaining responses (Signals) make up less than 1% of all test data.
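The filtering procedure can be sketched roughly as follows. This is an assumption-laden illustration rather than the notebook's code: the function names, the step of 2 between thresholds (taken from the table rows) and the "greater than or equal" comparison are choices made for this sketch.

```python
from typing import List, Sequence, Tuple


def network_response(response: Sequence[float]) -> float:
    """Absolute difference between the two outputs, scaled to 0..100."""
    return abs(response[0] - response[1]) * 100


def confidence_table(
    responses: List[Sequence[float]],
    labels: List[Tuple[int, int]],
    step: int = 2,
    min_share: float = 1.0,
) -> List[Tuple[int, int, int, float, float]]:
    """Return rows of (threshold, won, lose, won %, signals %)."""
    rows = []
    total = len(responses)
    threshold = 0
    while threshold <= 100:
        won = lose = 0
        for resp, label in zip(responses, labels):
            if network_response(resp) < threshold:
                continue  # not confident enough at this level
            predicted_up = resp[0] > resp[1]
            actual_up = label[0] == 1
            if predicted_up == actual_up:
                won += 1
            else:
                lose += 1
        signals = won + lose
        share = 100.0 * signals / total if total else 0.0
        if signals == 0 or share < min_share:
            break  # too few signals left to judge
        rows.append((threshold, won, lose, 100.0 * won / signals, share))
        threshold += step
    return rows


# Tiny made-up example: four responses and their correct answers.
demo_responses = [[0.9, 0.1], [0.45, 0.55], [0.2, 0.8], [0.52, 0.48]]
demo_labels = [(1, 0), (1, 0), (0, 1), (0, 1)]
for row in confidence_table(demo_responses, demo_labels, step=50):
    print(row)
```

Raising the threshold discards the low-confidence answers first, which is why the share of signals in the last column shrinks from row to row.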



If the network has not trained successfully in one run, simply restart this block (you do not need to reload the data).



4. Results on the trading chart



Run this block. It draws the neural network's signals on the chart of the trading pair from the test set: green means up, red means down.






5. Testing in the real market



During this check, the notebook loads data for the neural network that is created in real time as new candles appear. That is, the latest portion of data was created at the opening of the current candlestick, in our case the zero M6 candlestick. Naturally, this data does not contain the correct answer; the network is asked to make a real forecast. You can make sure that the data does not change as it moves into history by uncommenting the print(data) line and comparing the values of a particular line when it first appears and again after some time.



def get_from_ennro(symbol, tfm, dim, lim):
    ...
    # print(data)
    ...
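The comparison itself can be sketched as follows. This is only an illustration: the structure of `data` inside `get_from_ennro` is not shown in the article, so the sketch assumes that you have copied the printed rows into dictionaries keyed by candle id. The name `changed_candles` and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence


def changed_candles(
    earlier: Dict[int, Sequence[float]],
    later: Dict[int, Sequence[float]],
) -> List[int]:
    """Return ids of candles whose rows differ between two snapshots.

    Both arguments map a candle id to the row of values printed for it.
    Only candles present in both snapshots are compared.
    """
    common = sorted(set(earlier) & set(later))
    return [cid for cid in common if list(earlier[cid]) != list(later[cid])]


# Made-up snapshots taken one candle apart.
snapshot_1 = {1600144440: [0.32, 0.26], 1600144800: [0.10, 0.40]}
snapshot_2 = {1600144800: [0.10, 0.40], 1600145160: [0.55, 0.21]}
print(changed_candles(snapshot_1, snapshot_2))
```

An empty list means that every candle seen in both snapshots kept its values, which is what you would expect if the data does not look ahead.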


There may be no signals at all on the real market. This happens when volatility is lower than in the test segment; in that case the neural network does not see entry points.



Conclusions



Yes, the quality of the forecast is not suitable for opening positions. But that was not our goal. The main thing is that the neural network learns and recognizes something on the chart and guesses the trend; its forecast is clearly not random. Note that we used the simplest neural network configuration, Sequential Dense with 2 layers, and only 10 epochs of training. There is room for further development.
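To give a feeling for how small such a configuration is, here is a from-scratch sketch of the forward pass of a two-layer dense network in plain Python. It is not the notebook's TensorFlow model and it does no training: the hidden size of 16, the ReLU and softmax activations and the weight range are all assumptions chosen for illustration. It also shows why results differ between runs: the output depends on the randomly chosen starting weights.

```python
import math
import random
from typing import List, Tuple

HIDDEN_UNITS = 16  # hypothetical size, not taken from the notebook


def dense(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[float]:
    """Fully connected layer: weights[j] holds the weights of output unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def softmax(values: List[float]) -> List[float]:
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_in: int, n_out: int,
                 rng: random.Random) -> Tuple[List[List[float]], List[float]]:
    """Random starting weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(predictors: List[float], seed: int) -> List[float]:
    """Untrained network: predictors -> hidden layer -> two-value response."""
    rng = random.Random(seed)
    w1, b1 = random_layer(len(predictors), HIDDEN_UNITS, rng)
    w2, b2 = random_layer(HIDDEN_UNITS, 2, rng)
    hidden = relu(dense(predictors, w1, b1))
    return softmax(dense(hidden, w2, b2))


inputs = [0.32, 0.26, 0.00, 0.43]
print(forward(inputs, seed=1))
print(forward(inputs, seed=2))  # other starting weights, usually another answer
```

The softmax output always sums to 1 and is never exactly [0 1] or [1 0], which matches the shape of the responses shown earlier.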



There are already solutions that qualitatively improve the forecast, but more about them in the next article.



P.S. Those who want to collect and prepare data for any pair in MetaTrader5 on their own can read here and use the GoogleColab notebook given here.


