I'm using Google Colab and Google Sheets to automate some stuff. Would the process be faster if I split the script into parts that run in multiple Colab sessions in parallel (e.g. session 1 handles steps 1-100 while session 2 handles steps 101-200), or can the Sheets interface only handle one request at a time?
Tested this for myself. Results were:
1 session did 1.15 iterations per minute
3 sessions did 1.70 (0.48 + 0.65 + 0.57) iterations per minute
I have noticed that performance depends on some limits on Google's side that vary, so I cannot know for sure that this holds until I've done more tests. The first test looks promising, though.
I have a spreadsheet which imports stock prices from Google Finance and other sources, then calculates the portfolio value.
There is also a script which saves daily valuation data.
This has been running well for nearly two years, but since early May it seems to be saving the same data every day, as if it's not refreshing the stock prices.
Of course, if I open it manually and run the script, it all works OK.
If I don't open the sheet, the script now saves unrefreshed stock prices. What's the best way to force a refresh?
You can use a time-driven trigger in Apps Script to run your script function daily or hourly. Triggers are one of the most powerful features of Apps Script, and far easier to set up than RPA.
I have a project on Google Cloud Platform using the YouTube Data API v3. Everything was going well until earlier this month: after receiving several emails alerting me that I had to complete an audit, my daily query quota stopped completely. It was zeroed out.
I followed the link to perform the audit and successfully made all the changes that were requested in my application, strictly following the regulations. The audit went well; no further changes were required from me.
But the issue is that the queries per day remain at zero, and I can't edit the limit. It occurred to me that maybe activating the Google Cloud trial would change something. Negative. I'm still unable to increase the limits, not even using the credit they give you as a gift.
The project used roughly 25,000 to 300,000 queries per day. I have requested 500,000 queries per day by filling in the quota expansion form, to have a little more margin.
Meanwhile the project has been stopped for almost a month. If anyone knows anything about this, or how I should proceed, I'd appreciate it.
Thank you very much.
Have a nice day,
I have a standard ASP.NET MVC project and I need to calculate application availability to find out our SLA level. So, I need to get something like this for our web application.
Information from my hosting provider
System Availability: 99.9860%
Total Uptime: 30d 10h:22m:44s
Total Downtime: 0d 0h:6m:9s
Total Reboots: 3
Mean Time Between Reboots: 10.15 days
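For reference, that availability figure is simply uptime divided by total time; a quick sketch of the arithmetic with the numbers above hard-coded, just for illustration:

public class ProviderAvailability {
    public static void main(String[] args) {
        // Figures from the hosting provider report above
        long uptimeSeconds   = 30L * 86400 + 10 * 3600 + 22 * 60 + 44; // 30d 10h 22m 44s
        long downtimeSeconds = 6 * 60 + 9;                             // 0d 0h 6m 9s
        double availability  = 100.0 * uptimeSeconds / (uptimeSeconds + downtimeSeconds);
        System.out.printf("System availability: %.4f%%%n", availability); // prints ~99.9860%
    }
}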
But I need to calculate availability for the application. So, the question is:
How do I calculate ASP.NET MVC application availability properly?
Maybe someone has already implemented something like that; any suggestions on how to do it would be appreciated.
Where to start?
The first thing that comes to mind is Application Insights and its availability tests. The problem is that the minimum test frequency is 5 minutes, and I need more precise measurements.
Next, create some tool that calls my app every second and collects the results. Downside: a very large number of requests.
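For illustration, a rough sketch of what I mean by such a tool (the health-check URL is just a placeholder):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class AvailabilityProbe {
    private static final AtomicLong totalSamples = new AtomicLong();
    private static final AtomicLong okSamples = new AtomicLong();

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(900))
                .build();
        // Placeholder health endpoint of the MVC app
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://myapp.example.com/health"))
                .timeout(Duration.ofMillis(900))
                .GET()
                .build();

        // One probe per second; a timeout or non-200 response counts as "down" for that sample.
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            totalSamples.incrementAndGet();
            try {
                HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() == 200) {
                    okSamples.incrementAndGet();
                }
            } catch (Exception e) {
                // unreachable or too slow -> counts as down
            }
            System.out.printf("availability so far: %.4f%%%n",
                    100.0 * okSamples.get() / totalSamples.get());
        }, 0, 1, TimeUnit.SECONDS);
    }
}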
Also, I could get some performance counters from IIS or something like that; I need to investigate whether that is possible.
I know the question is possibly too broad, but I didn't find any info about implementing application availability measurement. What do you think?
It would take too long to explain all the parts that can be done, so I'll keep it short.
Usually you define all these details in a Service Level Agreement (SLA), where you also define the availability target (e.g. 99 %), which also covers planned downtime. A 99 % availability target means the app and its functionality, as described in that document, may be unavailable for at most approx. 87.6 hours per year. You can find SLA uptime calculators online for this.
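The arithmetic behind that 87.6 h figure, as a quick illustration:

public class DowntimeBudget {
    public static void main(String[] args) {
        double target = 0.99;                                   // 99 % availability target
        double allowedDowntimeHours = (1 - target) * 365 * 24;  // share of the 8,760 hours in a year
        System.out.printf("Allowed downtime: %.1f h/year%n", allowedDowntimeHours); // 87.6
    }
}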
The normal check interval is 5 minutes, as you say, but if you can prove, using an external site or service, that the supplier is not meeting the requirements, you calculate your loss (revenue loss, labor costs, etc.) and claim the money from them. I assume you already have a Business Impact Analysis (BIA); otherwise you should do one.
OK, now to the programming / DevOps part. I usually develop applications and services with this in mind and report their status to a third-party service like New Relic, Uptrends or similar. I also use a self-made service for this because of strict requirements to deliver data at least once a second with a hard deadline. In my solution I use WebSockets to send data in both directions, following a schedule, an event, or on demand. A benefit of this is that you can send a status (good or bad), say, every 500 ms, and you will know within one second whether the app has failed (≈ 499 ms + 500 ms).
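A minimal sketch of that heartbeat idea using the JDK 11 WebSocket client (the monitoring endpoint is hypothetical, not the actual service I use):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HeartbeatReporter {
    public static void main(String[] args) {
        // Hypothetical monitoring endpoint that records the last heartbeat per application
        WebSocket ws = HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(URI.create("wss://monitor.example.com/heartbeat"),
                            new WebSocket.Listener() {})
                .join();

        // Push a status message every 500 ms; if the monitor stops receiving them,
        // it can flag the app as down within roughly one second.
        // (Production code should chain sends so they never overlap.)
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(
                () -> ws.sendText("status:ok " + System.currentTimeMillis(), true),
                0, 500, TimeUnit.MILLISECONDS);
    }
}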
Using a service like this you can measure uptime, custom events of interest and possible errors to within a second, plus a ton of other metrics. Usually within 5-100 ms, but WCET/WCRT is hard to estimate.
To answer your question: you cannot calculate application availability with so few measuring points. Checking once every 5 minutes means roughly twelve one-second checks per hour, covering approx. 12 seconds out of every 3,600, and you cannot build any reliable calculation on that. You can assume everything was OK between the measuring points, but that is called guessing. I have made implementations with 14,400 measuring points per hour in order to provide 500 ms accuracy (banks).
I hope you got an answer that helps you with your problem.
I'm aware of the hugely trafficked sites built in Django or Ruby on Rails. I'm considering one of these frameworks for an application that will be deployed on ONE box and used internally by a company. I'm a noob, and I'm wondering how many concurrent users I can support with a response time of under 2 seconds.
Example box spec: Core i5, 8 GB RAM, 2.3 GHz. Apache web server. Postgres DB.
App overview: Simple CRUD operations. Small models of 10-20 fields probably <1K data per record. Relational database schema, around 20 tables.
Example usage scenario: 100 users making a CRUD request every 5 seconds (=20 requests per second). At the same time 2 users uploading a video and one background search process running to identify potentially related data entries.
1) Would a video upload process run outside the GIL once an initial request set it off uploading?
2) For the above system built in Django with the full stack deployed on the box described above, with the usage figures above, should I expect response times <2s? If so what would you estimate my maximum hits per second could be for response time <2s?
3) For the same scenario with Ruby on Rails, could I expect response times of <2s?
4) Specifically with regard to response times at the above usage levels, would it be significantly better built in Java (Play Framework), due to the JVM's support for concurrent processing?
Many thanks
Duncan
We have a large table in BigQuery where the data is streaming in. Each night, we want to run a Cloud Dataflow pipeline which processes the last 24 hours of data.
In BigQuery, it's possible to do this using a 'Table Decorator', and specifying the range we want i.e. 24 hours.
Is the same functionality somehow possible in Dataflow when reading from a BQ table?
We've had a look at the 'Windows' documentation for Dataflow, but we can't quite figure out if that's what we need. This is what we came up with so far (we want the last 24 hours of data using FixedWindows), but it still tries to read the whole table:
pipeline.apply(BigQueryIO.Read
.named("events-read-from-BQ")
.from("projectid:datasetid.events"))
.apply(Window.<TableRow>into(FixedWindows.of(Duration.standardHours(24))))
.apply(ParDo.of(denormalizationParDo)
.named("events-denormalize")
.withSideInputs(getSideInputs()))
.apply(BigQueryIO.Write
.named("events-write-to-BQ")
.to("projectid:datasetid.events")
.withSchema(getBigQueryTableSchema())
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED));
Are we on the right track?
Thank you for your question.
At this time, BigQueryIO.Read expects table information in "project:dataset.table" format, so specifying decorators does not work.
Until support for this is in place, you can try the following approaches:
Run a batch stage which reads the whole BigQuery table, filters out the unnecessary data, and processes the rest. If the table is really big, you may want to fork the filtered data into a separate table, provided the amount of data read is significantly smaller than the total amount of data.
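For the first approach, a rough sketch of the filtering step in the same SDK style as your pipeline. It assumes each row carries a timestamp column, here called "created_at" and holding epoch milliseconds; adjust this to your actual schema:

PCollection<TableRow> last24h = pipeline
    .apply(BigQueryIO.Read
        .named("events-read-from-BQ")
        .from("projectid:datasetid.events"))
    .apply(ParDo.of(new DoFn<TableRow, TableRow>() {
        @Override
        public void processElement(ProcessContext c) {
            TableRow row = c.element();
            // "created_at" holding epoch milliseconds is an assumption about your schema
            long createdAt = Long.parseLong(String.valueOf(row.get("created_at")));
            long cutoff = System.currentTimeMillis() - Duration.standardHours(24).getMillis();
            if (createdAt >= cutoff) {
                c.output(row);  // keep only rows from the last 24 hours
            }
        }
    }).named("events-filter-last-24h"));

The denormalization and write steps from your snippet could then be applied to last24h instead of the full read.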
Alternatively, use streaming Dataflow. For example, you may publish the data onto Pub/Sub and create a streaming pipeline with a 24-hour window. The streaming pipeline runs continuously, but provides sliding windows instead of daily windows.
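For the streaming approach, the read side could look roughly like this (the topic name, the parsing DoFn and the window/period choices are all placeholders, and the pipeline needs to run with streaming enabled):

pipeline
    .apply(PubsubIO.Read
        .named("events-read-from-pubsub")
        .topic("projects/projectid/topics/events"))
    .apply(ParDo.of(parseMessageToTableRowParDo).named("events-parse"))  // placeholder String -> TableRow DoFn
    .apply(Window.<TableRow>into(
        SlidingWindows.of(Duration.standardHours(24))
                      .every(Duration.standardHours(1))));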
Hope this helps