Spreadsheet Data Not Refreshing Prior to Running Google Script - google-sheets

I have a spreadsheet which imports stock prices from Google Finance and other sources, then calculates portfolio value.
There is also a script which saves daily valuation data.
This has been running well for nearly 2 years, but since early May, it seems to be saving the same data every day, like it's not refreshing the stock prices.
Of course, if I open it manually and run the script, it all works OK.
If I don't open the sheet, the script now saves unrefreshed stock prices. What's the best way to force a refresh?

You can use a time-driven trigger in Apps Script to run your script function daily or hourly. Triggers are one of the most powerful features of Apps Script, and far easier to set up than an RPA-based alternative.
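For example, here is a minimal sketch of a daily time-driven trigger plus one common workaround for stale GOOGLEFINANCE values; the function name, spreadsheet ID, sheet name and range below are placeholders, so adjust them to your own sheet:

// Sketch only: function, spreadsheet ID, sheet and range names are placeholders.
// Run this once (or use the Triggers page in the Apps Script editor) to schedule the save daily.
function createDailyTrigger() {
  ScriptApp.newTrigger('saveDailyValuation')
      .timeBased()
      .everyDays(1)
      .atHour(6)
      .create();
}

function saveDailyValuation() {
  var ss = SpreadsheetApp.openById('YOUR_SPREADSHEET_ID');
  var sheet = ss.getSheetByName('Portfolio');        // assumed sheet name

  // Common workaround for stale GOOGLEFINANCE data when the sheet is not open:
  // rewrite the formulas so the range recalculates before the values are read.
  var priceRange = sheet.getRange('B2:B20');         // assumed range containing only price formulas
  priceRange.setFormulas(priceRange.getFormulas());
  SpreadsheetApp.flush();                            // apply pending changes before reading

  var prices = priceRange.getValues();
  // ...append `prices` (and the portfolio valuation) to your daily history as before...
}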

Related

Scheduling a Cognos report based on Dates from a data warehouse table

We have a report already written for Student Services, but we need to schedule it for specific times in the term; these times are from the date table in our data warehouse. For example, we need it on the first day of the term (one of the MANY dates defined in our date table), and two weeks prior to the first day of the term. If the current date is either one of these dates, we need the report to run; otherwise no. Should I use trigger-based Cognos reporting? Is there a way to do it in regular Cognos scheduling? Should I schedule it out of an external (Oracle) stored procedure?
We were able to set up Event Studio to first run a daily check to see if it is 14 days before the start of term (we had to add that to the date table) or 2 weeks after the start of term (also in our date table). We set up the run condition, set up tasks for the reports required, then set up the email. We could not set up Run Agent in Event Studio (IBM was singularly unhelpful here), so we scheduled it in Cognos. It runs like a charm.

Lead time for changes

I am working on a project where I need to generate the lead time for changes per application, per day.
Is there any Prometheus metric that provides lead time for changes? And how would we integrate it into a Grafana dashboard?
There is not going to be a metric or dashboard out of the box for this; the way I would approach this problem is:
You will need to instrument your deployment code with the Prometheus client library of your choice. The deployment code will need to grab the commit time; assuming you are using git, you can use git log filtered to the folder that your application is in.
Now that you have the commit date, you can do a date diff between that and the current time (after the app has been deployed to PRD) to get a lead time of X seconds.
To get it into Prometheus, use the node_exporter (or windows_exporter) and their textfile collectors to read textfiles that your deployment code writes and surface them for Prometheus to scrape. Most of the client libraries have helpers for writing these files, and even if yours does not, the textfile format is simple enough to write directly.
You will want to surface this as a gauge metric, and have a label to indicate which application was deployed. The end result will be a single metric that you can query from grafana or set up alerts that will work for any application/folder that you deploy. To mimic the dashboard that you linked to, I am pretty sure you will want to use the over_time functions.
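As a concrete illustration of the textfile approach, here is a rough Python sketch using the prometheus_client library; the metric name, label, repository layout and textfile path are assumptions on my part, not an established convention:

# Sketch only: metric name, label, repo layout and textfile path are assumptions.
import subprocess
import time

from prometheus_client import CollectorRegistry, Gauge, write_to_textfile

def record_lead_time(repo_root, app_folder,
                     textfile_dir="/var/lib/node_exporter/textfile_collector"):
    # Committer timestamp (unix epoch) of the last commit touching this app's folder.
    commit_ts = int(subprocess.check_output(
        ["git", "log", "-1", "--format=%ct", "--", app_folder],
        cwd=repo_root).strip())

    # Date diff between the commit and "now" (i.e. just after the deploy to PRD).
    lead_time_seconds = time.time() - commit_ts

    registry = CollectorRegistry()
    gauge = Gauge(
        "deployment_lead_time_seconds",
        "Seconds between the last commit and the deployment to PRD",
        ["application"],
        registry=registry)
    gauge.labels(application=app_folder).set(lead_time_seconds)

    # node_exporter's textfile collector picks up *.prom files from this directory.
    safe_name = app_folder.replace("/", "_")
    write_to_textfile(f"{textfile_dir}/lead_time_{safe_name}.prom", registry)

# Example Grafana/PromQL query (one of the *_over_time functions mentioned above):
#   max_over_time(deployment_lead_time_seconds{application="billing"}[1d])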
I also want to note that it might be easier for you to store the deployment/lead time in a SQL database (or something else other than Prometheus) and use that as a data source in Grafana. For applications that do not deploy frequently, you would easily run into missing series when querying if you use Prometheus as a datastore, and the overhead of setting up the node_exporters and the logic to manage the textfiles might outweigh the benefits if you can just INSERT into a SQL table.

BigQuery API Connection in Sheets (Potential Caching)

My team is currently running a Flask API script from the Google Cloud Console that updates a BigQuery table every 15 minutes.
We are using Sheets to pull the results from that table and display them with visual tools from Sheets.
This full sheets ecosystem looks like this:
A Sheets document has a linked project containing a JS-like file that has been saved as a shared library.
All other sheets in the ecosystem that are used for displaying tables use the above-mentioned shared library to work with BigQuery.
The shared library facilitates the connection to BigQuery and uploads its statically formatted response to a static location in the sheet, which then updates the sheet's visuals.
Outside the shared library, each sheet has its own timezone, BigQuery dataset name, and BigQuery table name defined as variables.
The shared library uses these variables to grab the correct BigQuery table's information.
All of the sheets have been checked to confirm the BigQuery connection API is turned on, and verified to be linked to the same Cloud Console project where the BigQuery dataset is held.
This is the issue I am facing:
For a few weeks all was working as expected in this ecosystem. Then one day, inexplicably, the BigQuery table data in the sheet (which updates every minute) completely stopped updating. I first went to check the Cloud Console server where my Flask project is running (outside of Sheets) to update the BigQuery source table. The Flask server was on time and the BigQuery data was up to date.
The data tracks metrics throughout the day, so I can tell the BigQuery table is updated if it has data for the current hour. So if it's 4:36 PM, I can expect the table to have data for around 4:15 PM at the very least. In this case the BigQuery table was actually up to date, but the sheet was not (4-5 hours behind).
This prompted me to check the BigQuery connection in Sheets and log out the results. This is where things got very weird. I was getting a snapshot of the BigQuery table from 4 hours ago. As I said before, I had manually checked the table and the actual BigQuery table was up to date.
I fixed this error by copying and pasting my exact code into a new function and renaming it. When I ran the new function, which had identical code but a different function name, the table no longer pulled an old "cached" version of the BigQuery table but instead pulled the correct values.
I can only imagine that there is some sort of caching going on in the native Sheets-BigQuery API integration. I am also confused by the fact that I have 9 of these environments running in parallel and only 2 of them were affected. They are all running exactly the same code and all have been verified to be pulling from correctly updated BigQuery tables, but only 2 randomly fell behind, pulling the same table for 4 hours. This completely looks like a caching problem, but I have no way to actually test it as I have no insight into the BigQuery API library that is native to Sheets.
What I would like to know is: can I somehow add a cache buster or something to the BigQuery request to be sure this doesn't happen?
I have the table checking for updates on a cron trigger every 1 minute to make sure it updates as often as possible, as it is a realtime visual update application. Would lowering the cron help with this caching issue?
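A minimal sketch of one possible cache buster, assuming the shared library calls BigQuery through the Apps Script advanced service (BigQuery.Jobs.query); the project, dataset, table and column names are placeholders:

// Sketch only: assumes the shared library uses the BigQuery advanced service
// (enabled under Services in the Apps Script editor). All names are placeholders.
function runUncachedQuery() {
  var projectId = 'YOUR_PROJECT_ID';
  var request = {
    query: 'SELECT * FROM `YOUR_PROJECT_ID.your_dataset.your_table` ' +
           'ORDER BY event_timestamp DESC LIMIT 1000',   // assumed timestamp column
    useLegacySql: false,
    useQueryCache: false   // explicitly ask BigQuery not to serve cached results
  };
  var response = BigQuery.Jobs.query(request, projectId);
  if (!response.jobComplete) {
    throw new Error('Query did not complete within the default timeout.');
  }
  var rows = response.rows || [];
  return rows.map(function(row) {
    return row.f.map(function(cell) { return cell.v; });
  });
}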

Creating a structured Jenkins Failing Test Report

The situation right now:
Every Monday morning I manually check the JUnit results of Jenkins jobs that ran over the weekend; using the Project Health plugin I can filter on the timeboxed runs. I then copy-paste this table into Excel and go over each test case's output log to see what failed and note down the failure cause. Every weekend gets another tab in Excel. All this makes traceability a nightmare and causes time-consuming manual labor.
What I am looking for (and hoping that already exists to some degree):
A database that stores all failed tests for all jobs I specify. It parses the output log of a failed test case and based on some regex applies a 'tag' e.g. 'Audio' if a test regarding audio is failing. Since everything is in a database I could make or use a frontend that can apply filters at will.
For example, if I want to see all tests regarding audio failing over the weekend (over multiple jobs and multiple runs) I could run a query that returns all entries with the Audio tag.
I'm OK with manually tagging failed tests and the cause, as well as writing my own frontend. Is there a way (the Jenkins API perhaps?) to grab the failed tests (JUnit format, via the Jenkins plugin) and create such a system myself if it does not exist?
A good question. Unfortunately, it is very difficult in Jenkins to get such "meta statistics" that span several jobs. There is no existing solution for that.
Basically, I see two options for getting what you want:
Post-processing Jenkins-internal data to get the statistics that you need.
Feeding a database on-the-fly with build execution data.
The first option basically means automating the tasks that you do manually right now.
you can use external scripting (Python, Perl,...) to process Jenkins-internal data (via REST or CLI APIs, or directly reading on-disk data)
or you run Groovy scripts internally (which will be faster and more powerful)
It's the most direct way to go. However, depending on the statistics that you need and depending on your requirements regarding data persistence, you may want to go for...
The second option: more flexible and completely decoupled from Jenkins' internal data storage. You could implement it by
introducing a Groovy post-build step for all your jobs
that script parses job results and puts data of interest in a custom, external database
Statistics you'd get from querying that database.
Typically, you'd start with the first option. Once requirements grow, you'd slowly migrate to the second one (e.g., by collecting internal data via explicit post-processing scripts, putting that into a database, and then running queries on it). You'll want to keep this migration phase as short as possible, as it eventually requires the effort of implementing both options.
You may want to have a look at couchdb-statistics. It is far from a perfect fit, but it at least seems to do part of what you want to achieve.
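To the question about grabbing failed tests yourself: the JUnit plugin exposes per-build results through the Jenkins REST API under .../testReport/api/json, which is enough to feed a small database. Here is a rough Python sketch of the external-scripting route; the job names, tag regexes and SQLite schema are just examples, and the JSON shape differs somewhat for matrix or pipeline jobs:

# Sketch only: job names, tag regexes and the SQLite schema are examples.
import re
import sqlite3
import requests

JENKINS = "https://jenkins.example.com"          # placeholder URL
AUTH = ("user", "api-token")                     # Jenkins user + API token
JOBS = ["nightly-regression", "weekend-soak"]    # jobs you want to track

TAG_RULES = [                                    # regex -> tag, applied to the error text
    (re.compile(r"audio|alsa", re.I), "Audio"),
    (re.compile(r"timeout", re.I), "Timeout"),
]

def tag_for(text):
    for pattern, tag in TAG_RULES:
        if pattern.search(text or ""):
            return tag
    return "Untagged"

db = sqlite3.connect("failed_tests.db")
db.execute("""CREATE TABLE IF NOT EXISTS failures
              (job TEXT, build INTEGER, test TEXT, tag TEXT, details TEXT)""")

for job in JOBS:
    build = requests.get(
        f"{JENKINS}/job/{job}/lastCompletedBuild/api/json?tree=number",
        auth=AUTH).json()["number"]
    report = requests.get(
        f"{JENKINS}/job/{job}/lastCompletedBuild/testReport/api/json",
        auth=AUTH).json()
    for suite in report.get("suites", []):
        for case in suite.get("cases", []):
            if case["status"] in ("FAILED", "REGRESSION"):
                details = case.get("errorDetails") or case.get("errorStackTrace") or ""
                db.execute("INSERT INTO failures VALUES (?,?,?,?,?)",
                           (job, build, f'{case["className"]}.{case["name"]}',
                            tag_for(details), details))
db.commit()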

What is the Cloud Dataflow equivalent of BigQuery's table decorators?

We have a large table in BigQuery where the data is streaming in. Each night, we want to run a Cloud Dataflow pipeline which processes the last 24 hours of data.
In BigQuery, it's possible to do this using a 'Table Decorator', and specifying the range we want i.e. 24 hours.
Is the same functionality somehow possible in Dataflow when reading from a BQ table?
We've had a look at the 'Windows' documentation for Dataflow, but we can't quite figure out if that's what we need. We came up with this so far (we want the last 24 hours of data using FixedWindows), but it still tries to read the whole table:
pipeline.apply(BigQueryIO.Read
        .named("events-read-from-BQ")
        .from("projectid:datasetid.events"))
    .apply(Window.<TableRow>into(FixedWindows.of(Duration.standardHours(24))))
    .apply(ParDo.of(denormalizationParDo)
        .named("events-denormalize")
        .withSideInputs(getSideInputs()))
    .apply(BigQueryIO.Write
        .named("events-write-to-BQ")
        .to("projectid:datasetid.events")
        .withSchema(getBigQueryTableSchema())
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
Are we on the right track?
Thank you for your question.
At this time, BigQueryIO.Read expects table information in "project:dataset.table" format, so specifying decorators would not work.
Until support for this is in place, you can try the following approaches:
Run a batch stage which reads the whole BigQuery table, filters out the unnecessary data, and processes what remains (a rough sketch of such a filter follows after this list). If the table is really big, you may want to fork the data into a separate table if the amount of data you need is significantly smaller than the total amount of data.
Use streaming Dataflow. For example, you may publish the data onto Pub/Sub and create a streaming pipeline with a 24-hour window. The streaming pipeline runs continuously, but provides sliding windows rather than daily windows.
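For the first approach, here is a rough sketch of the filtering step, written in the same SDK style as the pipeline above. Note that the read still scans the whole table; the filter only drops old rows before further processing. The timestamp column name and its epoch-millisecond encoding are assumptions about your schema:

// Sketch only (Dataflow SDK 1.x style, to match the pipeline above). It assumes an
// epoch-millisecond column named "event_ts_millis"; adjust the name and parsing to your schema.
import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import org.joda.time.Duration;

class KeepLast24Hours extends DoFn<TableRow, TableRow> {
  @Override
  public void processElement(ProcessContext c) {
    TableRow row = c.element();
    long eventMillis = Long.parseLong(String.valueOf(row.get("event_ts_millis")));
    long cutoffMillis = System.currentTimeMillis() - Duration.standardHours(24).getMillis();
    if (eventMillis >= cutoffMillis) {
      c.output(row);   // keep only rows from the last 24 hours
    }
  }
}

// Then, in the pipeline above, add a step right after the BigQuery read:
//   .apply(ParDo.named("events-filter-last-24h").of(new KeepLast24Hours()))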
Hope this helps
