I want to create an external table in BigQuery whose data source is a Google Sheet. Is it possible to do this using dbt? In the yml file, where should I put the URI?
The main problem is that I don't have access to create it directly in BigQuery.
One way to handle a Google Sheet as a source is by creating a new table out of it in BigQuery via Connected Sheets.
Then, you create a new source in dbt that relies on that table, and start building your downstream models from there.
As far as I know, you cannot create a source directly from dbt unless it is a seed file, which I would not recommend unless the file is rather static (e.g. country names and ISO codes, which are not prone to change over time).
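For whoever does have permission to create the table, here is a minimal sketch of what the external table definition could look like and where the sheet URI goes. It only builds a BigQuery DDL string. The function name, the table name and the sheet URL are placeholders made up for illustration, and this is not part of the answer above.

```python
def sheets_external_table_ddl(table, sheet_url, skip_leading_rows=1):
    """Build a BigQuery DDL statement for an external table backed by a Google Sheet.

    `table` and `sheet_url` are placeholders supplied by the caller;
    nothing is sent to BigQuery here.
    """
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        "OPTIONS (\n"
        "  format = 'GOOGLE_SHEETS',\n"
        f"  uris = ['{sheet_url}'],\n"           # the sheet URI goes here
        f"  skip_leading_rows = {int(skip_leading_rows)}\n"
        ")"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(sheets_external_table_ddl(
        "my_project.my_dataset.my_sheet",
        "https://docs.google.com/spreadsheets/d/SHEET_ID",
    ))
```

Once such a table exists, it can be declared as a regular dbt source, as described above.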
We have a similar situation where the data source is a Google Sheet.
The end user updates the Google Sheet on a periodic basis and we replicate it to our Snowflake datastore using Fivetran.
dbt can then pick up the data seamlessly.
I have several tables in BigQuery that are sourced from Google Sheets. When a Google Sheet is updated, the corresponding table in BigQuery is automatically updated as well. I am trying to understand what the log of this event looks like in Operations Logging. My end goal is to create a sink for these logs that feeds Pub/Sub, so that I can run scheduled queries based on these events.
Thank you
When you use an external table (Google Sheet or other), the data is never stored in BigQuery native storage. It is always external.
Therefore, when you update your Google Sheet, nothing happens in BigQuery. Only when you query the data does BigQuery read the sheet (again) and get the latest data.
As a result, there is no insert log that you can track when you update the data in the Google Sheet. The only log you have is the one written when you run a query in BigQuery to read the data (external or not), as mentioned by Sakshi.
When the external data source (Google Sheet or other) is updated and the BigQuery table associated with it is queried, BigQuery initiates an insert job, which is visible in Cloud Logging.
You can find this log in the Cloud Logging console by filtering on the resource type BigQuery Project; the entries have protoPayload.methodName set to google.cloud.bigquery.v2.JobService.InsertJob.
For more information on BigQuery Logs you can refer to this documentation.
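As a rough sketch, the filter described above could be assembled like this before pasting it into the Logs Explorer or a sink definition. The function name is made up, and the resource type string `bigquery_project` is my assumption for what the console labels "BigQuery Project"; check it against your own log entries.

```python
def bigquery_insert_job_filter(project_id=None):
    """Return a Cloud Logging filter string matching BigQuery InsertJob entries.

    Assumption: the console's "BigQuery Project" resource corresponds to
    resource.type="bigquery_project".
    """
    clauses = [
        'resource.type="bigquery_project"',
        'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob"',
    ]
    if project_id:
        # Optionally narrow the filter to a single project.
        clauses.append(f'resource.labels.project_id="{project_id}"')
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(bigquery_insert_job_filter())
```

Keep in mind that this matches jobs started by queries, not edits made in the sheet itself.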
I am using https://github.com/GoogleCloudPlatform/DataflowTemplates for CDC from MySQL, publishing to a Google Pub/Sub topic.
The properties file has a whitelistedTables= setting, where you have to give a comma-separated list of all the tables you want to monitor for changes.
Is there any straightforward way to whitelist an entire database and in turn all tables in it?
Unfortunately, the whitelistedTables parameter does not allow whitelisting all tables in a given database. However, Dataflow templates are customizable. You can download the code from GitHub, modify it, and upload your modified version to GCS. Then you can run your new templated job with this feature added. See this prior question: How to Customize GCP Dataflow template?. The code for the Dataflow templates lives here: https://github.com/GoogleCloudPlatform/DataflowTemplates.
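If customizing the template is more than you need, a lighter workaround (not part of the template itself) is to generate the property line from a list of table names, for example names fetched from MySQL's information_schema.tables. A minimal sketch follows. The helper name is made up, and the `prefix.table` qualification is an assumption, so check the exact format the connector expects before using the output.

```python
def whitelisted_tables_line(table_names, prefix):
    """Build a whitelistedTables= property line from table names.

    `prefix` is whatever qualifier the connector expects in front of each
    table name (assumed here to be joined with a dot).
    Duplicates are dropped and names are sorted for a stable result.
    """
    qualified = [f"{prefix}.{name}" for name in sorted(set(table_names))]
    return "whitelistedTables=" + ",".join(qualified)


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(whitelisted_tables_line(["orders", "customers"], "mydb"))
```

The line has to be regenerated whenever tables are added to the database.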
I have created several systems with Google Forms (and linked sheets) to log services provided and timekeeping. I would like to share these systems with other people to use as a template for their own data. Is there a way to easily do this while keeping my formulas intact?
Successful: I have found a way to share the form only as a template, by copying the URL into an emailed hyperlink and changing the ending from edit to copy.
Cumbersome but OK (Migrant Service Log): This method does not seem to work entirely for spreadsheets. It still asks me to give them access to the original document. I can set access on the original to view only and limit the time to one day.
Unsuccessful (Clock In/Out): The new "copy" of the spreadsheet is not automatically linked with the new "copy" of the forms, so it does not update when a new response is added. I must link it in the form. This becomes more of an issue with my sheets that have formulas based on these responses. Each new user now has to manually link and rename the sheets to make them function correctly.
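For reference, the edit-to-copy URL change mentioned above can be sketched as a small helper. The function name and the example IDs are placeholders, and the sketch only rewrites the link; it does nothing about the access or form-linking problems described above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_copy_link(edit_url):
    """Turn a Google Docs/Forms/Sheets .../edit URL into a .../copy URL.

    Any query string or fragment (e.g. #gid=0) is dropped.
    """
    parts = urlsplit(edit_url)
    segments = parts.path.rstrip("/").split("/")
    if segments[-1] != "edit":
        raise ValueError("expected a URL whose path ends in /edit")
    segments[-1] = "copy"
    return urlunsplit((parts.scheme, parts.netloc, "/".join(segments), "", ""))


if __name__ == "__main__":
    # FORM_ID is a placeholder, not a real document ID.
    print(to_copy_link("https://docs.google.com/forms/d/FORM_ID/edit"))
```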
Clock In/Out System (attendance purposes)
  Clock In Form
  Clock Out Form
  MSA Sheet
  Attendance Office Sheet
Migrant Service Log (team communication purposes)
  Migrant Service Log Form
  Migrant Service Log Sheet
I would like for them to all be user-friendly and easily shared while keeping everything confidential to the user.
If you want to keep your formulas secret, you can set up a second spreadsheet, use the IMPORTRANGE formula to pull the data over, and then simply link the second spreadsheet somewhere at the end of the form.
I would like to create a pupil progress sheet for each of my pupils that is held in my Google Drive and is owned by me (the teacher). This spreadsheet would use IMPORTRANGE to pull test data from the 'Class sheet'. The pupil would be able to view their file but not make changes (or make a copy of the file so that they can make changes!).
This I can do manually but doing this for a class of 30 (let alone the 10 classes I have!) would be tedious in the extreme. I would imagine that a script might be able to automate much of this and I was wondering if there are any showstoppers in the list of requirements below before investigating further.
Is it possible to create a script that does the following from a MASTER SHEET (below):
Row  A  B   C        D
1       ID  NAME     EMAIL
2       1   sample   sample#gmail.com
3       2   sample2  sample2#gmail.com
1) Run through the list above and create a duplicate of a separate template Google Sheets file for each person in Column C.
2) Rename the new file to their name (Column C)
3) Populate a single cell (A1) within the new spreadsheets with their ID (Column B)
4) Share with the email (Column D) allowing VIEWING only and disabling copying the file etc.
Yes, you can create a "Master" document, only readable and not editable by anyone else but you. That document can be accessible to all your students after you share it with a "public link" to view (not to edit).
With Google Sheets you can, for instance, create a file with different questions and run a script to generate different types of tests, but this requires some knowledge of JavaScript. You can learn how to write Google Drive scripts from tutorials.
To me, automatically creating a named file for every student and sending emails automatically is not a great idea, but yes, you can do it, and there are tutorials on how to send emails from a spreadsheet.
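To give an idea of what that automation involves, here is a minimal sketch of the four steps listed in the question. It only builds the request payloads that would be passed to the Google Drive and Sheets APIs; nothing is sent. The function name, the template ID and the example address are placeholders, and the field names follow my understanding of the Drive v3 and Sheets v4 APIs, so verify them against the current documentation.

```python
def build_share_plan(rows, template_id):
    """For each (student_id, name, email) row, describe the API requests needed.

    Returns one dict per student with three request payloads:
    copy the template, write the ID into A1, share read-only.
    """
    plans = []
    for student_id, name, email in rows:
        plans.append({
            # Steps 1 and 2: copy the template and name the copy after the student.
            # copyRequiresWriterPermission asks Drive to disable copy, print and
            # download for viewers.
            "copy": {
                "fileId": template_id,
                "body": {"name": name, "copyRequiresWriterPermission": True},
            },
            # Step 3: put the student's ID into cell A1 of the new file.
            "set_a1": {
                "range": "A1",
                "valueInputOption": "RAW",
                "body": {"values": [[student_id]]},
            },
            # Step 4: share with the student as a viewer ("reader" role).
            "share": {
                "body": {"type": "user", "role": "reader", "emailAddress": email},
            },
        })
    return plans


if __name__ == "__main__":
    # Placeholder data, for illustration only.
    for plan in build_share_plan([(1, "sample", "sample@example.com")], "TEMPLATE_ID"):
        print(plan)
```

The same steps could also be written as an Apps Script bound to the master sheet. A viewer can still retype or screenshot what they see, so disabling copying is a deterrent rather than a guarantee.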
The best option is to just give a link to your students.
Preferably, after you create a sharing link from Google Drive, make the link as short as possible by using a free service such as bit.ly, where any long and hard-to-remember link can be shortened to something easier to use, like bit.ly/TomTests_Oct2016.
Since you are a teacher, you can also consider using Khan Academy if it fits your needs. It is a good platform for evaluating students' progress, with a lot of free content (if not everything).
In my program I have multiple databases. One is fixed and cannot be changed, but there are also some others, the so called user databases.
At first I thought I would have to open one connection per database and connect to each data dictionary separately. How is it possible to connect to more than one database with one connection by handing over the data dictionary filename? By the way, I am using a local server.
thank you very much,
André
P.S.: Okay, I may have found the answer to my problem.
The keyword is CreateDDLink. The procedure connects to another data dictionary, but a master dictionary has to be set first.
Links may be what you are looking for as you indicated in the question. You can use the API or SQL to create a permanent link alias, or you can dynamically create links on the fly.
I would recommend reviewing this specific help file page: Using Tables from Multiple Data Dictionaries
For a permanent alias (using SQL), look at sp_createlink. You can either create the link to authenticate the current user or set up the link to authenticate as a specific user. Then use the link name in your SQL statements.
select * from linkname.tablename
Or dynamically you can use the following which will authenticate the current user:
select * from "..\dir\otherdd.add".table1
However, links are only available to SQL. If you want to use the table directly (i.e. via a TAdsTable component) you will need to create views. See KB 080519-2034. The KB mentions you can't post updates if the SQL statement for the view results in a static cursor, but you can get around that by creating triggers on the view.