We currently have a DBT instance that sits over our Google BigQuery data warehouse. We've recently been asked to incorporate some data from Google Sheets into our modelling.
With that, is it possible for DBT to connect directly with Google Sheets? i.e. configure Google Sheets as a direct external datasource in the .yml file, or have DBT possibly run some sort of BigQuery federated SQL statement?
There's a DBT package called dbt-external-tables (https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/), but that only seems to work with BigQuery + files in Google Cloud Storage buckets.
But the common and most straightforward option I'm seeing in forums and documentation is to create an external table on BigQuery on top of the Google Sheet. And then have DBT connect to the external BigQuery table.
Just wanted to check if the above common option for integrating DBT x Google Sheets x BigQuery is in fact the only option, or if there's actually a way to have DBT connect directly to Google Sheets before hitting BigQuery?
Thanks
From what I can see on the dbt-external-tables side, the BigQuery adapter's create_external_table macro renders down to a DDL statement.
Unfortunately, I don't see a similar DDL statement available for the Google Sheets "external" definition. If I had to guess, the web UI probably executes something through the bq CLI client when you create the table from the portal.
If a section is ever added to this guide that includes a DDL definition for Google Drive based external sources, it would probably be a relatively easy addition to the dbt macro for external tables mentioned above. Until then, you will have to define this yourself through the UI, the bq client, or the REST API.
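For the REST API route, a minimal sketch of the request body for a Sheets-backed external table might look like the following. The field names follow the BigQuery table resource (externalDataConfiguration, googleSheetsOptions); the project, dataset, table, and sheet URL are placeholders, and sending the request (with Drive-scoped credentials) is left out.

```python
import json


def sheets_external_table_body(project_id, dataset_id, table_id, sheet_url,
                               skip_leading_rows=1):
    """Build a table resource describing an external table over a Google Sheet.

    The returned dict is meant to be sent as the JSON body of a BigQuery
    tables.insert call. All identifiers passed in are placeholders.
    """
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "externalDataConfiguration": {
            "sourceFormat": "GOOGLE_SHEETS",
            "sourceUris": [sheet_url],
            # Let BigQuery infer column names and types from the sheet.
            "autodetect": True,
            # Assumes the first row of the sheet is a header row.
            "googleSheetsOptions": {"skipLeadingRows": skip_leading_rows},
        },
    }


if __name__ == "__main__":
    body = sheets_external_table_body(
        "my-project",  # hypothetical project ID
        "my_dataset",  # hypothetical dataset
        "my_sheet_table",  # hypothetical table name
        "https://docs.google.com/spreadsheets/d/SHEET_ID",  # placeholder URL
    )
    print(json.dumps(body, indent=2))
```

Once such a table exists, DBT should be able to reference it like any other BigQuery source table.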
Related
I have a Google Sheet that is scheduled to run a report daily, and I have to create a dashboard in AWS QuickSight, but I can't find any way to connect Google Sheets as a data source in QuickSight.
My Google Sheet's report is populated from Google Analytics.
So, is there any solution for this?
As per the AWS documentation, it seems we cannot directly integrate Google Sheets data as a data source. The data needs to be present in a database, in S3, or in one of the relational data stores AWS supports.
We might need to use one of the following mechanisms:
Get the data into one of the relational databases in AWS through a script/cron job, and then pass it on to QuickSight for analysis.
Get the file into S3 through a script/cron job, and then pass it on to QuickSight for analysis.
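As a rough sketch of the script half of the S3 option: the snippet below only covers turning sheet rows into CSV text and picking a dated object key. How the rows are fetched from Google Sheets and how the file is uploaded (for example with an S3 client) are assumptions left outside the sketch, and the function names and key layout are hypothetical.

```python
import csv
import io
from datetime import date


def rows_to_csv(rows):
    """Serialize a list of rows (lists of cell values) to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def daily_key(prefix, day):
    """Build an object key such as 'reports/2024-01-01.csv' (layout is an assumption)."""
    return f"{prefix}/{day.isoformat()}.csv"


if __name__ == "__main__":
    # Stand-in for rows read from the Google Sheet.
    rows = [["date", "sessions"], ["2024-01-01", 42]]
    print(daily_key("reports", date(2024, 1, 1)))
    print(rows_to_csv(rows), end="")
    # The upload to S3 would happen here in the cron job.
```

Writing one object per day keeps earlier reports around; overwriting a single fixed key would be simpler if QuickSight only ever needs the latest file.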
Reference : https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html
You can use a third-party product to fill the gap:
https://skyvia.com/data-integration/analyze-google-sheets-with-quicksight
Or use any of the other connectors that link Analytics with QuickSight. There are plenty, such as this one:
https://www.dataddo.com/integration/google-analytics/amazon-aurora
Is two-way communication between BigQuery and Google Sheets possible?
In other words, if I add a row or modify an entry in Google Sheets, it is reflected in the corresponding table in BigQuery, and vice versa (no schema changes).
You can have one-way connections in either direction, but not a two-way connection.
From Google Sheets to Google BigQuery:
You can define a Sheets file as an external data source in BigQuery. This way, any updates to the sheet will be reflected back in any queries from BigQuery.
Setting this up from the command-line:
Authenticate with Google Drive scopes:
gcloud auth login --enable-gdrive-access
Get the Drive URI of your sheet.
Create the external table definition file:
bq mkdef \
--noautodetect \
--source_format=GOOGLE_SHEETS \
"drive_uri" \
path_to_schema_file > /tmp/mytable_def.json
Modify the file with a text editor to add any additional options.
Create the external table to query.
bq mk --external_table_definition=/tmp/mytable_def.json mydataset.mytable
Modifying an external table from BigQuery is not supported.
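The "modify the file" step above can also be scripted. The sketch below assumes the definition file is JSON in the shape of BigQuery's external data configuration and adds the Sheets-specific options (googleSheetsOptions with skipLeadingRows and range). The helper name and the sample values are hypothetical.

```python
import copy
import json


def with_sheet_options(definition, skip_leading_rows=0, sheet_range=None):
    """Return a copy of an external table definition with Sheets options set."""
    updated = copy.deepcopy(definition)
    options = updated.setdefault("googleSheetsOptions", {})
    options["skipLeadingRows"] = skip_leading_rows
    if sheet_range is not None:
        # For example "Sheet1!A1:C100" to limit the table to part of one tab.
        options["range"] = sheet_range
    return updated


if __name__ == "__main__":
    # Stand-in for the JSON that `bq mkdef` wrote to /tmp/mytable_def.json.
    definition = json.loads("""
    {
      "autodetect": false,
      "sourceFormat": "GOOGLE_SHEETS",
      "sourceUris": ["https://docs.google.com/spreadsheets/d/SHEET_ID"]
    }
    """)
    updated = with_sheet_options(definition, skip_leading_rows=1,
                                 sheet_range="Sheet1!A1:C100")
    print(json.dumps(updated, indent=2))
```

The printed JSON would then be saved back to the definition file before running bq mk.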
From BigQuery to Sheets:
Use Connected Sheets to visualize BigQuery data with Google Sheets.
Updating BigQuery data via connected sheets is not supported.
You can add rows to a Google Sheet and see the results reflected in BQ when you query. Additionally, you can add columns to the sheet, make the appropriate schema changes in BQ, and see the resulting values.
You cannot, however, run DML from BigQuery that would add rows to a Google Sheet.
I have a database table that references a table created from a Google Sheet. Sometimes queries against it run, whereas other times I see this error:
Error while reading data, error message: Failed to read the spreadsheet. Errors: Deadline=118.95772243s
How can I make querying this table more reliable?
As noted in the documentation for external data sources, there can be some latency with Google Sheets because the data is not persisted in BigQuery, which is why the query runs inconsistently. The same documentation recommends the following steps:
1) Export the Google Sheet as a CSV file and store it in Google Cloud Storage.
2) Load the data into BigQuery.
Note: Loading data into BigQuery directly from Google Drive is not currently supported, so step 1 is required before loading Google Sheets data into BigQuery.
In addition, you can schedule the above steps using Google Cloud Composer. Here is an example of how to use Cloud Composer to transfer data from Google Cloud Storage into a BigQuery table.
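To illustrate step 2, here is a minimal sketch of a load job body for the BigQuery REST API (jobs.insert) that loads a CSV from Cloud Storage into a native table. The field names follow the load job configuration; the bucket, project, dataset, and table names are placeholders, and the choice to overwrite the table on each run (WRITE_TRUNCATE) is an assumption about how the refresh should behave.

```python
import json


def csv_load_job_body(project_id, dataset_id, table_id, gcs_uri):
    """Build a jobs.insert body that loads a CSV file from GCS into a table."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI starting with gs://")
    return {
        "configuration": {
            "load": {
                "sourceUris": [gcs_uri],
                "sourceFormat": "CSV",
                # Assumes the exported CSV has a single header row.
                "skipLeadingRows": 1,
                "autodetect": True,
                # Replace the table contents on every run.
                "writeDisposition": "WRITE_TRUNCATE",
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
            }
        }
    }


if __name__ == "__main__":
    body = csv_load_job_body(
        "my-project", "my_dataset", "sheet_snapshot",  # hypothetical names
        "gs://my-bucket/exports/sheet.csv",  # placeholder URI
    )
    print(json.dumps(body, indent=2))
```

Queries against the loaded table no longer read the spreadsheet at query time, at the cost of the data being only as fresh as the last load.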
I want to connect a Google Sheet to a new BigQuery table so that the data is populated and updated automatically from Sheets to BigQuery. I'm using this tutorial from Google itself to do the setup.
My problem: the table connected to the spreadsheet was created empty, so I had to query it and save the result as another table to see and use the data.
I can't post images yet, so please check this imgur post.
I'm no expert in these things, but this does not seem to be the best way to do it. I found some spreadsheet add-ons, but I'm trying to avoid them.
Any ideas what's the best way to do this kind of setup/connection?
I had to configure each column manually
BigQuery provides a variety of tools that make it pretty simple to connect an external table to BigQuery.
One option is to simply use the web UI with the Auto Detect option, which saves you from entering each column manually.
This works perfectly for me, including when inserting and adding data to the external table.
You can refer to the official BigQuery documentation on external tables for more help.
I want to visualize data from BigQuery in an iOS app, much like Tableau does. Any suggestions are welcome. I have already visualized the BigQuery data in Tableau, and I want to know whether there are other visualization tools, because Tableau is paid, whereas I want to visualize the data for free and implement the functionality in an iOS app. Please help me with this.
I have done some R&D, and from that I know the data can be visualized using Google Charts. Any help on that?
My favorite new open source dashboard for BigQuery is re:dash, check it out:
Code: https://github.com/EverythingMe/redash
Demo: http://demo.redash.io/
If you have GCE (Google Compute Engine) you can run your own private instance:
Instructions: https://github.com/EverythingMe/redash/wiki/Setting-up-re:dash-instance
Currently: gcutil addimage redash-040b563 gs://redash-images/redash-040b563.tar.gz
If you're willing to write a little bit of code, there is a sample App Engine app here that runs BigQuery queries and saves the results in a dashboard.
Another good option is to use Apps Script to write queries and chart the results in Google Sheets. Step-by-step instructions are in the book Google BigQuery Analytics, but you may be able to just read the relevant snippet here or the blog post here. Sample Apps Script code is here.
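If you go the Google Charts route mentioned in the question, the query results need reshaping first. This is a sketch only, not the sample code linked above: it converts a BigQuery REST query response (schema.fields plus rows of f/v cells) into the cols/rows JSON layout that a Google Charts DataTable can be built from. The type mapping is a simplifying assumption, and anything that is not numeric or boolean is passed through as a string.

```python
INTEGER_TYPES = {"INTEGER", "INT64"}
NUMBER_TYPES = INTEGER_TYPES | {"FLOAT", "FLOAT64", "NUMERIC"}
BOOLEAN_TYPES = {"BOOLEAN", "BOOL"}


def chart_type(bq_type):
    """Map a BigQuery column type to a Google Charts column type."""
    if bq_type in NUMBER_TYPES:
        return "number"
    if bq_type in BOOLEAN_TYPES:
        return "boolean"
    return "string"


def convert_value(raw, bq_type):
    """Convert a cell value (the REST API returns strings) to a Python value."""
    if raw is None:
        return None
    if bq_type in INTEGER_TYPES:
        return int(raw)
    if bq_type in NUMBER_TYPES:
        return float(raw)
    if bq_type in BOOLEAN_TYPES:
        return str(raw).lower() == "true"
    return raw


def to_datatable(response):
    """Reshape a BigQuery query response into Google Charts DataTable JSON."""
    fields = response["schema"]["fields"]
    cols = [
        {"id": f["name"], "label": f["name"], "type": chart_type(f["type"])}
        for f in fields
    ]
    rows = [
        {"c": [{"v": convert_value(cell["v"], f["type"])}
               for cell, f in zip(row["f"], fields)]}
        for row in response.get("rows", [])
    ]
    return {"cols": cols, "rows": rows}
```

The resulting dict can be serialized to JSON and handed to the charting code in the app or web view.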
You can build reports in Google Sheets using a free BigQuery Reports Add-on.
Benefits:
No CSV-files or coding required,
Queries are saved for future use,
Analysts and developers can create shared SQL-queries with pre-defined variables,
Variables allow modifying the result without editing SQL-syntax,
Everything works in Google Cloud.