Getting data from Google Sheets to BigQuery - google-sheets

I need to pull data from a Google Sheet into BigQuery. So far I have created a Table in BQ using the CSV option and all of the data imported fine. But now I need to automatically update the BQ table with the data in the Google Sheet's tab. Can anyone point me in the right direction? Thanks in advance.

With BigQuery you can directly query your data from the Google Sheet by creating an external table from the console.
Provided your data is properly formatted, you just have to choose "Create table", select "Drive" as the source, provide your Google Sheet URI, fill in a few additional settings, and that's it!
Any changes in the spreadsheet will be immediately accessible in BigQuery as well.
Documentation for reference.
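If you would rather script this than click through the console, here is a minimal sketch of the same external-table setup from Apps Script. It assumes the BigQuery advanced service is enabled in the script project. The function names, the project, dataset, and table names, and the spreadsheet URL are all placeholders, not values from the question.

```javascript
/**
 * Builds a table resource describing an external table backed by a Google Sheet.
 * All arguments are placeholders supplied by the caller.
 */
function buildSheetsExternalTable(projectId, datasetId, tableId, sheetUrl) {
  return {
    tableReference: { projectId: projectId, datasetId: datasetId, tableId: tableId },
    externalDataConfiguration: {
      sourceFormat: 'GOOGLE_SHEETS',
      sourceUris: [sheetUrl],
      autodetect: true,                           // let BigQuery infer the schema
      googleSheetsOptions: { skipLeadingRows: 1 } // treat the first row as a header
    }
  };
}

// Apps Script only: needs the BigQuery advanced service enabled.
function createSheetsExternalTable() {
  var table = buildSheetsExternalTable(
      'my-project', 'my_dataset', 'sheet_table',
      'https://docs.google.com/spreadsheets/d/SPREADSHEET_ID');
  BigQuery.Tables.insert(table, 'my-project', 'my_dataset');
}
```

Because the table is external, queries read the sheet at query time, so there is nothing to schedule afterwards.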

I looked at BQ when it was released and quickly dismissed it, as it is free only for a short period and then you have to pay. I have my own domain/website, and it comes with MySQL, FTP, and much more.
Anyway, I have many projects that go in the opposite direction, pulling data down from the SQL DB into Sheets, but the direction does not really matter.
Google Sheets has triggers. I have triggers set to launch at certain times, usually every day; you set up each trigger to fire a function you specify.
https://developers.google.com/apps-script/reference/script/clock-trigger-builder
You can also set triggers via scripts:
/**
 * Creates time-driven triggers
 *
 * https://developers.google.com/apps-script/reference/script/clock-trigger-builder
 */
function createTimeDrivenTriggers() {
  // Trigger every day at 04:00 AM CT.
  ScriptApp.newTrigger('csvDaily')
    .timeBased()
    .everyDays(1)
    .atHour(4)
    .create();
}
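The trigger above calls a function named csvDaily, which is not shown. As a rough sketch only, assuming the handler's job is to turn a tab into CSV text, it could look like the following. The helper rowsToCsv and the tab name 'Sheet1' are my own placeholders, not part of the original script.

```javascript
// Converts a 2D array of cell values to CSV text, quoting any field that
// contains a comma, a double quote, or a line break.
function rowsToCsv(rows) {
  return rows.map(function (row) {
    return row.map(function (cell) {
      var text = String(cell);
      return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
    }).join(',');
  }).join('\n');
}

// Hypothetical handler for the daily trigger (Apps Script only).
function csvDaily() {
  var values = SpreadsheetApp.getActiveSpreadsheet()
      .getSheetByName('Sheet1') // placeholder tab name
      .getDataRange()
      .getValues();
  var csv = rowsToCsv(values);
  Logger.log(csv); // replace with whatever upload or load step you need
}
```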

Related

Google drive sheets importrange

I am looking for a way to import data from one Google Drive sheet into another using the IMPORTRANGE formula. However, I want the data to be synced once per day at a certain time instead of updating automatically, as the formula seems to do. Any help would really be appreciated.
Formula used:
={IMPORTRANGE(B2,"sheet1!$A$1");IMPORTRANGE(B3,"sheet1!$A$1");IMPORTRANGE(B4,"sheet1!$A$1");IMPORTRANGE(B5,"sheet1!$A$1");IMPORTRANGE(B6,"sheet1!$A$1")}
You can create a Script (Google Apps Script) that copies the data automatically with Time-driven Triggers (https://developers.google.com/apps-script/guides/triggers).
function copyData() {
  // Gets data
  var data = SpreadsheetApp.openById("ID1").getSheetByName("SheetName").getRange("A1:B2").getValues();
  // Copies data
  SpreadsheetApp.openById("ID2").getSheetByName("SheetName").getRange("A1:B2").setValues(data);
}
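The snippet above copies a fixed A1:B2 range. If the source grows over time, a variation that sizes the destination range from the data may be more convenient. This is only a sketch: copyAllValues and copyDataDaily are names I made up, and "ID1", "ID2", and "SheetName" are the same placeholders as above.

```javascript
// Copies every populated cell of sourceSheet into destSheet, starting at A1.
// Returns the number of rows copied.
function copyAllValues(sourceSheet, destSheet) {
  var data = sourceSheet.getDataRange().getValues();
  destSheet.clearContents(); // drop stale rows left by a previous, larger copy
  destSheet.getRange(1, 1, data.length, data[0].length).setValues(data);
  return data.length;
}

// Function to attach to a daily time-driven trigger (Apps Script only).
function copyDataDaily() {
  var source = SpreadsheetApp.openById("ID1").getSheetByName("SheetName");
  var dest = SpreadsheetApp.openById("ID2").getSheetByName("SheetName");
  copyAllValues(source, dest);
}
```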

Google Sheets IMPORTRANGE and QUERY not refreshing

I have been using IMPORTRANGE and QUERY extensively to connect all of my spreadsheets for a while now. But I recently noticed that IMPORTRANGE and QUERY will not return proper data unless the source Sheet is open. Also, the IMPORTRANGE data used to update automatically in the background (every 30 minutes or so, whatever the default refresh rate is), but now it only updates if I manually open the Sheet, and it displays "Loading...." before returning the data.
Is anybody else having issues with these two functions?
This answer explains the issue you encountered.
In summary:
It doesn't update when the sheet isn't opened.
Recalculation only happens when sheet is opened.
Functions that pull data from outside the spreadsheet recalculate at the following times:
ImportRange: 30 minutes
ImportHtml, ImportFeed, ImportData, ImportXml: 1 hour
GoogleFinance: may be delayed up to 20 minutes
Alternative solution:
You can use time-driven triggers to update those values every N minutes or hours instead, but you will have to write a script for that.
Every time the trigger fires, the script has to call setFormula on every cell where you used IMPORTRANGE or QUERY.
References:
setFormula
Choose how often formulas calculate
Time Driven Triggers
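A minimal sketch of that setFormula approach follows. It clears each listed cell and writes its formula back. Whether that actually forces a fresh fetch in your case is an assumption you should verify. The function names, the tab name 'Sheet1', and the cell list are placeholders.

```javascript
// Re-enters the formula of each listed cell (A1 notation) on the given sheet.
// Returns how many cells actually held a formula.
function refreshFormulas(sheet, cells) {
  var refreshed = 0;
  cells.forEach(function (a1) {
    var range = sheet.getRange(a1);
    var formula = range.getFormula();
    if (!formula) return; // skip cells that hold no formula
    range.clearContent();
    // Apply the clear before re-setting the formula (only available in Apps Script).
    if (typeof SpreadsheetApp !== 'undefined') SpreadsheetApp.flush();
    range.setFormula(formula);
    refreshed++;
  });
  return refreshed;
}

// Function to attach to a time-driven trigger (Apps Script only).
function refreshImports() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Sheet1');
  refreshFormulas(sheet, ['A1', 'D1']); // cells containing IMPORTRANGE / QUERY
}
```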

Google Sheets API: Append cells and add developer metadata atomically?

I want to append a row of cells to a google sheet and also attach some developer metadata to that row.
In the Google Sheets v4 API, I know you can use batchUpdate to append a row with the appendCells request, and you can add developer metadata using the createDeveloperMetadata request.
My issue is that I wanna set some developer metadata to specifically the newly appended cells atomically. There's not really a way to specifically ensure the range of the newly added row in createDeveloperMetadata, and if I use two different requests, someone else may insert a row between those requests which could shift all the rows, causing the appended cell's range to be pointing to an incorrect row.
Is there a way to attach developer metadata to a newly added cell atomically?
Answer:
There is currently no way of ensuring that the sheet structure hasn't changed between requests.
More Information:
The best option, I think, is to make two sequential requests in the same batch, though this isn't foolproof in very unlucky circumstances. Inserting the row directly with an UpdateCellsRequest isn't foolproof either: knowing which row you inserted the data into doesn't exclude the possibility that someone else inserts or deletes a row before it between the two requests.
Feature Request:
You can however let Google know that this is a feature that is important for the Sheets API and that you would like to request they implement it. Google's Issue Tracker is a place for developers to report issues and make feature requests for their development services.
The page to file a Feature Request for the Google Sheets API is here.
References:
Requests - UpdateCellsRequest | Sheets API | Google Developers
Requests - AppendCellsRequest | Sheets API | Google Developers
The solution I figured out was essentially:
Fetch the dimensions of the sheet
Perform a batch update with insertDimension, updateCells, and createDeveloperMetadata all performed for the same sheet dimension index at the end of the sheet
This basically ensures that the dimension index will point to the same row for all 3 operations in the batch update, and if the index points out of bounds, all 3 operations will fail
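The steps above could be sketched roughly as follows. This only builds the requests array for spreadsheets.batchUpdate. The function name and parameters are my own, the cell values are written as strings for simplicity, and I am assuming that inserting at startIndex equal to the current row count is accepted when inheritFromBefore is true.

```javascript
// Builds the three requests for one batchUpdate call. rowCount is the sheet's
// current number of rows (gridProperties.rowCount from spreadsheets.get).
function buildAppendRowRequests(sheetId, rowCount, cellValues, metadataKey, metadataValue) {
  var rowRange = {
    sheetId: sheetId,
    dimension: 'ROWS',
    startIndex: rowCount,  // zero-based index just past the last existing row
    endIndex: rowCount + 1
  };
  return [
    // 1. Add one empty row at the end of the sheet.
    { insertDimension: { range: rowRange, inheritFromBefore: true } },
    // 2. Fill the new row, starting at column A.
    { updateCells: {
        start: { sheetId: sheetId, rowIndex: rowCount, columnIndex: 0 },
        rows: [{ values: cellValues.map(function (v) {
          return { userEnteredValue: { stringValue: String(v) } };
        }) }],
        fields: 'userEnteredValue'
    } },
    // 3. Attach developer metadata to that same row.
    { createDeveloperMetadata: { developerMetadata: {
        metadataKey: metadataKey,
        metadataValue: metadataValue,
        location: { dimensionRange: rowRange },
        visibility: 'DOCUMENT'
    } } }
  ];
}
```

In Apps Script with the Sheets advanced service enabled, the array could be sent with Sheets.Spreadsheets.batchUpdate({requests: requests}, spreadsheetId); other clients would pass the same body to their own batchUpdate call.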

Any way to cache Importrange and make it update only once a week?

I currently use =ImportRange to bring in some data from a public sheet I do not own.
However, it is a lot of data and importrange tries to refresh the data on every page access, and doesn't cache it locally.
Is there a script I can run instead to get the data via importrange, then hard code it into my own sheet and only update it once a week?
You can use the Google Sheets add-on Sheetgo to automatically update your reference from another sheet. It allows 30 free updates every month, and you can pay for more.
But I found this to be effective.
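var SHEET_NAME = 'Sheet1'; // placeholder: name of the tab to refresh

function refreshSheet() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(SHEET_NAME);
  // Your operations on the sheet
}

// Run once to schedule refreshSheet weekly. A script cannot sleep for a week:
// Apps Script caps both Utilities.sleep and the total execution time.
function createWeeklyTrigger() {
  ScriptApp.newTrigger('refreshSheet')
    .timeBased()
    .everyWeeks(1)
    .create();
}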

Is there a Google Sheets formula expression that only evaluates when the sheet is edited?

The title sums it up pretty well. I need a cell of my document to reflect the last time any cell in the document updated. "=now()" doesn't work, because now() is evaluated any time the sheet is evaluated - even if there is no change. This means that even the simple act of hitting the browser's reload button causes now() to update its cell - TWICE!
I'd rather not use the onEdit trigger in a script, because that would require adding the script to 1000+ sheets. I already have a way to edit documents through the Python Sheets API, so it would be fairly easy to automatically add the expression wherever I need it.
Volatile spreadsheet functions are not suitable for timestamps, for the reason you stated.
Use a trigger. You don't need to add a script to each sheet. A single stand-alone Google Apps Script can automatically install an "on edit" trigger for multiple spreadsheets, using the forSpreadsheet(key) method. Example:
var ids = ['ss_id1', 'ss_id2', ... ]; // array of spreadsheet IDs
for (var i = 0; i < ids.length; i++) {
  ScriptApp.newTrigger('timestamp').forSpreadsheet(ids[i]).onEdit().create();
}
function timestamp(e) {
  e.source.getActiveSheet().getRange(1, 1).setValue(new Date());
}
Now, whenever one of the listed spreadsheets is edited, cell A1 of the sheet that was edited will hold the time of the edit.
This probably won't work for 1000 triggers, because of the current limit of 20 triggers per user per script. It looks like you'll need 50 copies of the stand-alone script, which is a stretch but still manageable.
Another alternative is to run a scheduled Google Apps Script that retrieves "last modified" dates of spreadsheets in the directory (using DriveApp) and edits those in the spreadsheets.
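A rough sketch of that scheduled approach, under a few assumptions: the spreadsheet IDs are known up front rather than listed from a directory, the timestamp goes into A1 of the first tab, and the function names are placeholders. Note that writing the value is itself a modification, so the next run will see a newer "last modified" date unless you account for that.

```javascript
// For each spreadsheet ID, reads the file's last-updated date from Drive and
// writes it into cell A1 of the first tab. The services are passed in as
// arguments (DriveApp and SpreadsheetApp when run in Apps Script).
function stampLastModified(ids, drive, spreadsheets) {
  return ids.map(function (id) {
    var modified = drive.getFileById(id).getLastUpdated();
    spreadsheets.openById(id).getSheets()[0].getRange(1, 1).setValue(modified);
    return modified;
  });
}

// Function to attach to a time-driven trigger (Apps Script only).
function stampAll() {
  var ids = ['ss_id1', 'ss_id2']; // placeholder spreadsheet IDs
  stampLastModified(ids, DriveApp, SpreadsheetApp);
}
```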
Yet another is to give up. Maybe you don't need a cell to hold information that is already available in spreadsheet interface, "last edited".
