Visualization of Yandex Direct statistics with your own hands. From API to Data Studio

As an advertising professional, I need to constantly look after clients. Data Studio helps me with this.

However, advertising professionals rarely allocate analytics budgets, so you have to do it yourself.



What you need to do to visualize Yandex Direct in Data Studio:



  1. Get a token from your account (for this in the Yandex.Direct help)
  2. In Python Write a request to the Yandex Direct server
  3. Sum statistics in Pandas Data Frame
  4. Send data to Google Big Query
  5. Build a visualization in Data Studio based on data in Google Big Query






Write a request to the Yandex Direct server and add the data to the Pandas Data Frame





I used to tinker with requests and then edit the data so that it can be sent. You can try if you need fine-tuning (I wrote about this earlier here habr.com/ru/post/445734 )



For the rest, there is a much simpler way - my python package yadirstat. (a review on it habr.com/ru/post/512902 ). When using it, you enter a token, login, dates and you get a ready-made Data Frame, in which nothing needs to be changed.



The code looks like this:



from yadirstat import yadirstat
x=yadirstat.yadirstat.campaign('FFFFFfffffFFFFggggGGGgg', 'client123123','2020-05-10','2020-07-15')
print(x)




Send data to Google Big Query



This is the most obvious way to collect data for subsequent visualization in DataStudio, since they work perfectly in pairs.



For sending I use the package “pandas_gbq”

The code looks like this:

import pandas_gbq
pandas_gbq.to_gbq(x, 'YD_Days.test', project_id='red-abstraction-99999999',if_exists='replace', progress_bar=None)




Why is the data being overwritten? Because the statistics in the directive can be adjusted over time, and if we simply add new terms, we will have discrepancies in the statistics.



Now let's check if the information went to Big Query. If everything went well, there will be such a set of fields of their types







Build a visualization in Data Studio based on data in Google Big Query





To do this, you can immediately create a query in Big Query to get all the data:

press “Send a query to the table”, add “*” after SELECT and remove the limit. The query looks like this: “SELECT * FROM` red-abstraction-239999.YD_Days.test` ”



In Data Studio Connecting to Google Big Query









In changing the source, we will see the following fields





It is required to change the following fields for correct aggregation:



  • AvgCpc
  • ConversionRate
  • CostPerConversion
  • Ctr




Why is this done? Let's consider the CPC as an example

If we have two lines with the following data:

  • 100 clicks - Cost per click 100 rubles - Cost 10000
  • 2 clicks - Cost of a click - 10 rubles - Cost of 20 rubles


The usual average would say the CPC is on two lines 55

Therefore, to get the correct CPC, you should divide all costs by all clicks. In this case, the CPC will be 98.2.Just



hide these fields and add their analogs:



  • CPC = SUM (Cost) / SUM (Clicks)
  • CTR = SUM (Clicks) / SUM (Impressions) * 100
  • % conversions = SUM (Conversions) / SUM (Clicks) * 100
  • Conversion cost = SUM (Cost) / SUM (Conversions)




I don't use the “AvgPageviews” column at all.



Bounce rate is a very complex column due to the fact that Yandex uses different statistics based on some additional data.

In short, I use the formula, it is this that gives me the minimum deviations from what Yandex shows:

% bounce rate = SUM (Bounce_clicks) / (SUM (Clicks) / 100).

where Bounce_clicks is the number of bounced clicks in each of the lines

But this question is beyond the scope of this article :)




At this stage, we get the following set of fields:







We visualize



I make two pages for each client: general information and information on keys.



Let's start with the first page - general information

Here I post:



  • Schedule with expenses by day
  • Table with statistics by date
  • Table with statistics by campaigns
  • Dashboard with statistics for yesterday (clicks, cost, cost per click)




First, place a date range in the corner so that users can select dates for themselves:



Now, add a schedule with costs:



Set the following settings:



Get such a chart





For a table with dates, set the following settings:



For a table with campaigns, change the “Date” parameter to the name of campaigns



For dashboards I use a summary.As a result





, I get the following page with statistics:





We get statistics for keywords:





Everything is the same, only now the request will look like this:



import pandas_gbq

from yadirstat import yadirstat

x = yadirstat.yadirstat.criteria('AgAAAAI',client-12247235,'2020-05-10','2020-07-15')
print(x)

pandas_gbq.to_gbq(x, 'YD_criteria.test', project_id='red-abstraction-21239254613',if_exists='replace', progress_bar=None)



All Articles