How to upload data to Google BigQuery

This translation was prepared ahead of the start of the "Non-relational databases" course.








In this article, we look at the options for loading data into the Google BigQuery cloud data warehouse: simple uploads from CSV and JSON files, as well as loading through the API or an extension.



With Google BigQuery (GBQ), you can collect data from various sources and analyze it using SQL queries. Among the advantages of GBQ are high computation speed even for large amounts of data and low cost.



Why do you need to upload data to a single repository? If you want to use end-to-end analytics, generate reports from raw data and measure the effectiveness of your marketing, then you need Google BigQuery.



If you need to analyze terabytes of data in seconds, Google BigQuery is the easiest and most affordable choice. You can learn more about this service by watching a short video on the Google Developers YouTube channel.



Creating a dataset and table



Before uploading any data, you first need to create a dataset and table in Google BigQuery. To do this, on the BigQuery home page, select the resource in which you want to create a dataset.





The images used in the article are provided by the author.



Specify the dataset ID in the "Create dataset" window, select the data location, and set the default table expiration.

Note: If you choose "Never" as the table expiration, tables in the dataset are not deleted automatically. For temporary tables, you can specify the number of days to keep them.







Then create a table in the dataset.
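The same two steps can also be scripted. Below is a minimal sketch, assuming the google-cloud-bigquery Python client library is installed and credentials are already configured; the function names, the "US" default location, and the two-column schema are hypothetical and only serve as an illustration.

```python
from typing import Optional

MS_PER_DAY = 24 * 60 * 60 * 1000


def expiration_ms(days: Optional[int]) -> Optional[int]:
    """Convert a retention period in days to milliseconds; None means "Never"."""
    return None if days is None else days * MS_PER_DAY


def create_dataset_and_table(project: str, dataset_id: str, table_id: str,
                             location: str = "US",
                             expiration_days: Optional[int] = None) -> str:
    """Create the dataset and an empty table; return the full table ID."""
    # Imported here so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)

    dataset = bigquery.Dataset(f"{project}.{dataset_id}")
    dataset.location = location
    ms = expiration_ms(expiration_days)
    if ms is not None:
        # Default expiration applied to tables created in this dataset.
        dataset.default_table_expiration_ms = ms
    client.create_dataset(dataset, exists_ok=True)

    # Hypothetical schema, for illustration only.
    schema = [
        bigquery.SchemaField("user_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("revenue", "FLOAT", mode="NULLABLE"),
    ]
    full_table_id = f"{project}.{dataset_id}.{table_id}"
    client.create_table(bigquery.Table(full_table_id, schema=schema), exists_ok=True)
    return full_table_id
```

Passing expiration_days=None leaves the default expiration unset, which corresponds to choosing "Never" in the console.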







Done! Now you can start loading data.



Loading data using Google Sheets (OWOX BI BigQuery Reports extension)



If you need to upload data from Google Sheets to Google BigQuery, the easiest way to do this is to install the free OWOX BI BigQuery Reports extension.



You can install this extension directly from Google Sheets or from the Chrome Web Store.







After installation, a dialog box appears with prompts and a request for permissions.







Now go back to Google Sheets. To upload data to BigQuery, open the Add-ons menu and select OWOX BI BigQuery Reports -> Upload data to BigQuery.







Specify the project, dataset, and table name to load the data into. And that's all :)

An indisputable advantage of the OWOX BI BigQuery Reports extension is its ease of use. You can also use the extension to set up scheduled reports.



To generate reports based on accurate raw data from all sources and automatically upload them to the Google BigQuery repository, we recommend using the OWOX BI Pipeline service.



With Pipeline, you can set up automatic data collection from advertising services, call tracking systems and CRM. This allows you to quickly and easily retrieve complete datasets from the sources of your choice.







Just select your data sources and allow access; leave the rest to OWOX BI.



With OWOX BI, you can build reports for almost any need, from ROI, ROPO effect, and cohort analysis to LTV and RFM analysis.



Loading data from CSV files



To upload data from a CSV file, select a data source in the “Create table” window and use the “Upload” option.







Then select the file and its format.







Next, you need to define the destination for the data by specifying the project name and dataset.



Note: In Google BigQuery, you can select two types of tables: native and external.









Google BigQuery can detect the table structure automatically, but if you want to add fields manually, you can either edit the schema as text or use the "+ Add field" button.



Note: If you want to change how Google BigQuery parses data from a CSV file, use the advanced options.
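The CSV upload, including schema auto-detection and a few of the advanced parsing options, can also be run from code. The following is a rough sketch, assuming the google-cloud-bigquery Python client library and configured credentials; the function names and default values are our own and are not part of the BigQuery interface.

```python
from typing import Any, Dict


def csv_load_options(delimiter: str = ",", header_rows: int = 1,
                     quoted_newlines: bool = False,
                     jagged_rows: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for bigquery.LoadJobConfig for a CSV load."""
    return {
        "source_format": "CSV",
        "autodetect": True,                        # let BigQuery infer the schema
        "field_delimiter": delimiter,
        "skip_leading_rows": header_rows,          # header lines to skip
        "allow_quoted_newlines": quoted_newlines,  # newlines inside quoted fields
        "allow_jagged_rows": jagged_rows,          # missing trailing columns become NULL
    }


def load_csv(path: str, table_id: str, **options: Any) -> int:
    """Load a local CSV file into table_id; return the table's row count."""
    # Deferred import: only this function needs google-cloud-bigquery.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(**csv_load_options(**options))
    with open(path, "rb") as source:
        job = client.load_table_from_file(source, table_id, job_config=job_config)
    job.result()  # wait for the load job to finish
    return client.get_table(table_id).num_rows
```

For example, a semicolon-separated file with one header line could be loaded with load_csv("data.csv", "my-project.my_dataset.my_table", delimiter=";"), where the file and table names are placeholders.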







For more information on the CSV format, see the detailed documentation from the Internet Society.



Loading data from JSON files



To load data from a JSON file, repeat the steps above: create or select the dataset and table you are working with, but choose JSON as the file format.

You can upload a JSON file from your computer, Google Cloud Storage, or Google Drive.
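BigQuery loads JSON in newline-delimited form, with one object per line. If your file contains a single JSON array instead, it needs to be converted first. Here is a small sketch using only the Python standard library; the function names are hypothetical.

```python
import json
from typing import Any, Iterable


def to_ndjson(records: Iterable[Any]) -> str:
    """Serialize records as newline-delimited JSON: one object per line."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


def convert_json_array_file(src_path: str, dst_path: str) -> int:
    """Rewrite a file holding one JSON array as NDJSON; return the record count."""
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(to_ndjson(records))
    return len(records)
```

The resulting file can then be uploaded in the same way as any other JSON file.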







Note: For more information on the JSON format, see the Google Cloud documentation.



Loading data from Google Cloud Storage



Google Cloud Storage allows you to securely store and transfer data online.



Useful information about using this service:



  • Getting Started with Google Cloud Storage
  • Cloud Storage Documentation
  • Quick Start Guides
  • Choosing Storage and Database on Google Cloud Platform



You can upload files from Google Cloud Storage to Google BigQuery in the following formats:



  • CSV
  • JSON (newline delimited)
  • Avro
  • Parquet
  • ORC
  • Cloud Datastore
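A load from Cloud Storage can be started from code as well. The sketch below assumes the google-cloud-bigquery Python client library and configured credentials; choosing the source format from the file extension is our own simplification (Cloud Datastore exports are left out), and the function names are hypothetical.

```python
import posixpath

# File extension -> source format name (these match bigquery.SourceFormat values).
SOURCE_FORMATS = {
    ".csv": "CSV",
    ".json": "NEWLINE_DELIMITED_JSON",
    ".avro": "AVRO",
    ".parquet": "PARQUET",
    ".orc": "ORC",
}


def source_format_for(uri: str) -> str:
    """Pick the BigQuery source format for a gs:// URI from its extension."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {uri}")
    extension = posixpath.splitext(uri)[1].lower()
    if extension not in SOURCE_FORMATS:
        raise ValueError(f"unsupported file extension: {extension!r}")
    return SOURCE_FORMATS[extension]


def load_from_gcs(uri: str, table_id: str) -> None:
    """Load one Cloud Storage object into table_id and wait for completion."""
    # Deferred import: only this function needs google-cloud-bigquery.
    from google.cloud import bigquery

    client = bigquery.Client()
    source_format = source_format_for(uri)
    job_config = bigquery.LoadJobConfig(source_format=source_format)
    if source_format in ("CSV", "NEWLINE_DELIMITED_JSON"):
        # CSV and JSON carry no schema, so ask BigQuery to detect one;
        # Avro, Parquet, and ORC files describe their own schema.
        job_config.autodetect = True
    client.load_table_from_uri(uri, table_id, job_config=job_config).result()
```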








You can read more about using Cloud Storage with big data in the official documentation.



You can also learn about Cloud Storage upload limits and permissions in the Google Cloud Help Center.



Loading data from other Google services such as Google Ads and Google Ad Manager



To load data from various Google services, you first need to set up the BigQuery Data Transfer Service. Before you can use it, you must select or create a data project and, in most cases, enable billing for it. For example, billing is required for the following services:



  • Campaign Manager
  • Google Ad Manager
  • Google Ads
  • Google Play (beta)
  • YouTube - Channel Reports
  • YouTube - Content Owner Reports




Note: For more information on setting up and changing billing, see the Google Cloud Help Center.



To launch BigQuery Data Transfer Service, on the BigQuery home page, select Transfers from the menu on the left.







Note: You need administrator rights to create a Transfer.



In the next window, all you have to do is select the desired data source.







Note: BigQuery Data Transfer Service can be accessed not only from the platform console, but also from:

  • Classic BigQuery web UI
  • bq command-line tool
  • BigQuery Data Transfer Service API




Once set up, the service automatically loads data into BigQuery on a regular schedule. However, you cannot use it to export data out of BigQuery.
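To check from code which transfers are already configured, you can call the BigQuery Data Transfer Service API. This is only a sketch, assuming a recent version of the google-cloud-bigquery-datatransfer Python library and configured credentials; the helper names are hypothetical.

```python
from typing import List, Optional


def transfer_parent(project_id: str, location: Optional[str] = None) -> str:
    """Resource name under which transfer configurations are listed."""
    parent = f"projects/{project_id}"
    return f"{parent}/locations/{location}" if location else parent


def list_transfers(project_id: str, location: Optional[str] = None) -> List[str]:
    """Return "display name (data source)" for each configured transfer."""
    # Deferred import: only this function needs the Data Transfer client library.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()
    configs = client.list_transfer_configs(parent=transfer_parent(project_id, location))
    return [f"{config.display_name} ({config.data_source_id})" for config in configs]
```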



Loading data using the API



With Cloud Client Libraries, you can use your favorite programming language to work with the Google BigQuery API.



Note: More information on loading data using the API can be found in the Google Cloud documentation.
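As one illustration of working through the client libraries, the sketch below sends rows to an existing table in batches with the Python client's insert_rows_json method. It assumes the google-cloud-bigquery library and configured credentials; the batch size of 500 and the function names are our own choices, not values taken from the documentation.

```python
from typing import Any, Dict, Iterator, List, Sequence

Row = Dict[str, Any]


def batches(rows: Sequence[Row], size: int = 500) -> Iterator[List[Row]]:
    """Yield consecutive slices of rows with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])


def stream_rows(table_id: str, rows: Sequence[Row], batch_size: int = 500) -> int:
    """Insert rows into an existing table; return the number of rows sent."""
    # Deferred import: only this function needs google-cloud-bigquery.
    from google.cloud import bigquery

    client = bigquery.Client()
    sent = 0
    for batch in batches(rows, batch_size):
        # insert_rows_json returns a list of per-row errors (empty on success).
        errors = client.insert_rows_json(table_id, batch)
        if errors:
            raise RuntimeError(f"BigQuery rejected rows: {errors}")
        sent += len(batch)
    return sent
```

The keys of each row dictionary are expected to match the column names of the destination table.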



First, create or select the project you will work with. Then, on the main page, go to the API section.







In the API overview window, you can enable APIs and services. Select the API you need from the library.







In the library, you can find an API using the search field or filter the APIs by category.







OWOX BI provides a set of Python scripts that automate the import of data into Google BigQuery from the following sources:



  • amoCRM
  • FTP
  • FTPS
  • HTTP(S)
  • Intercom
  • ExpertSender
  • MySQL
  • SFTP




These Python scripts can be downloaded from GitHub.



Note: Learn how to use Python with the Google API in this video tutorial from Google Developers on YouTube.



Conclusions



In this article, we've covered the most popular ways to load data into Google BigQuery. From a simple file upload to loading data via the API, any user can find a suitable option.





