How to keep lambda warm using zappa python - serverless

I have an API function in Python, deployed on AWS Lambda using Zappa. When I hit my API after 15 minutes of inactivity, the first request takes at least 5 to 10 seconds to respond, which is too long for my API. I have come to know about the cold start issue in AWS Lambda. How can I keep the Lambda warm using Zappa?

Zappa has a default warmer that keeps invoking the Lambda to avoid cold starts; see https://github.com/Miserlou/Zappa#advanced-usage (make sure keep_warm is set to true).
You can verify this by checking that a CloudWatch Events rule with a scheduled event exists.
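As a rough illustration, the keep-warm settings can be inspected with a few lines of Python. This is a minimal sketch: the stage name "production" and the app_function value are made up, and the fallback values assume Zappa's documented defaults (keep_warm enabled, keep_warm_expression of rate(4 minutes)), so check them against the README of your Zappa version.

```python
import json

# Hypothetical zappa_settings.json content for a stage called "production".
# Only the two keep_warm keys matter for this check.
SETTINGS_JSON = """
{
    "production": {
        "app_function": "app.app",
        "keep_warm": true,
        "keep_warm_expression": "rate(4 minutes)"
    }
}
"""


def keep_warm_status(settings_text, stage):
    """Return (enabled, schedule) for one stage of a Zappa settings string."""
    stage_settings = json.loads(settings_text)[stage]
    # Assumption: a missing key falls back to Zappa's documented default.
    enabled = stage_settings.get("keep_warm", True)
    schedule = stage_settings.get("keep_warm_expression", "rate(4 minutes)")
    return enabled, schedule


enabled, schedule = keep_warm_status(SETTINGS_JSON, "production")
print(enabled, schedule)
```

If keep_warm is explicitly false for the stage, no scheduled rule is created and the first request after an idle period will hit a cold start.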

Related

Properly handle timeout on CloudRun

We use Google Cloud Run to wrap an analysis developed in R behind a web API. For this, we have a small Fastify app that launches an R script and uploads the results to Google Cloud Storage. The process' stdout and stderr are written to a file and are also uploaded at the end of the analysis.
However, we sometimes run into issues when a process takes longer to execute than expected. In these cases, we fail to upload anything and it's difficult to debug, because stdout and stderr are "lost" on the instance. The only thing we see in the Cloud Run logs is this message
The request has been terminated because it has reached the maximum request timeout
Is there a recommended way to handle a request timeout?
In App Engine there used to be a descriptive error: DeadlineExceededError for Python and DeadlineExceededException for Java.
We are currently evaluating the following approach:
Explicitly set Cloud Run's request timeout
Provide the same value as an environment variable, so it's available to the container
When receiving a request, we start a timer that calls a "cleanup" function just before the timeout is exceeded
The cleanup function stops the running analysis and uploads the current stdout and stderr files to Cloud Storage
This feels a little complicated, so any feedback is very much appreciated.
Since the default timeout is 5 minutes and can extend up to 60 minutes, I would simply start by increasing this to 10 minutes. Then observe over the course of a month how that affects your service.
Aside from that fix, I would start investigating why your process is taking longer than expected and if it's perhaps due to a forever-growing result set.
If there's no result set scalability concern, then bumping the default timeout up from 5 minutes seems to be the most reasonable and simple fix. It would only become a problem again if your script has to deal with more data in the future for some reason.
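The timer-based cleanup described in the question could be sketched as follows. This is only an illustration in Python (the original app uses Fastify), with assumptions stated up front: REQUEST_TIMEOUT_SECONDS is a hypothetical environment variable that mirrors the configured Cloud Run request timeout, the 5-second safety margin is arbitrary, and the upload to Cloud Storage is left as a comment.

```python
import os
import subprocess
import sys


def timeout_from_env(default=300.0):
    """Read the request timeout from a hypothetical environment variable."""
    return float(os.environ.get("REQUEST_TIMEOUT_SECONDS", default))


def run_with_deadline(cmd, log_path, request_timeout_s, safety_margin_s=5.0):
    """Run cmd with stdout/stderr written to log_path, stopping it early.

    Returns True if the process finished on its own, False if it had to be
    stopped shortly before the request timeout.
    """
    budget = max(request_timeout_s - safety_margin_s, 0.0)
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        try:
            proc.wait(timeout=budget)
            finished = True
        except subprocess.TimeoutExpired:
            # Out of time: stop the analysis so the logs can still be saved.
            proc.kill()
            proc.wait()
            finished = False
    # Placeholder: upload log_path to Cloud Storage here, in both cases.
    return finished


if __name__ == "__main__":
    ok = run_with_deadline(
        [sys.executable, "-c", "print('analysis done')"],
        "analysis.log",
        timeout_from_env(),
    )
    print("finished" if ok else "stopped before timeout")
```

The point of the margin is that the log file is closed and ready for upload while the request is still alive, so the output is no longer lost with the instance.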

aws lambda and api gateway timeout only in lambda

I uploaded some working code to Lambda.
Before uploading it, I checked that the code works well on my local server, which runs Django REST Framework, and it also works in a collab environment.
But on Lambda, some cases that need more computation time end in a timeout.
I know API Gateway has a 29-second time limit. But on my local server, even more complicated cases finish within 10 seconds.
I know Lambda has a cold start problem, but it takes much longer than when run on my local server. I want to know why, and whether there is any solution.

How can I see how long my Cloud Run deployed revision took to spin up?

I deployed a Vue.js app and a Kotlin server app. Cloud Run promises to put a service to sleep if no requests arrive for a certain time. I had not opened my app for a day. When I opened it, it was available almost immediately. Since I know how long it takes to spin up when started locally, I don't quite trust that Cloud Run really put the app to sleep and spun it up that fast.
I'd love to know how I can see how long the spin-up actually took, also so that I can improve the startup of the backend service.
After the service has been inactive for some time, note the current time and then request the service URL.
Then go to the logs for the Cloud Run service, and use this filter to see the logs for the service:
resource.type="cloud_run_revision"
resource.labels.service_name="$SERVICE_NAME"
Look for the log entry with the normal app output after your request, check its time and compare it with the recorded time.
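The comparison in the last step is plain timestamp arithmetic. A minimal sketch, assuming both times are available as ISO 8601 strings with an explicit UTC offset (the values below are made up, and a log timestamp ending in "Z" may need to be rewritten as "+00:00" on older Python versions):

```python
from datetime import datetime


def startup_delay_seconds(request_sent_at, first_log_at):
    """Seconds between sending the request and the first app log entry."""
    sent = datetime.fromisoformat(request_sent_at)
    logged = datetime.fromisoformat(first_log_at)
    return (logged - sent).total_seconds()


# Hypothetical values: the time you noted and the time of the log entry.
delay = startup_delay_seconds(
    "2021-03-01T10:15:00.120+00:00",
    "2021-03-01T10:15:00.450+00:00",
)
print(f"first log entry appeared {delay:.3f} s after the request")
```

Note that this figure includes network latency between your machine and the service, so it is an upper bound on the startup time rather than an exact measurement.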
You can't know when the instance will be evicted or whether it is kept in memory. It could happen quickly, or take hours or days before eviction. It's "serverless".
As for the startup time: when I test, I deploy a new revision and send it a request. In the logging service, the first log entry of the new revision gives me the cold start duration (usually 300+ ms, compared to the usual 20-50 ms with a warm start).
The billed instance time is the sum of all the containers' running times. A container is considered "running" while it is processing one or more requests.

cloud run is closing the container even if my script is still running

I want to run a long-running job on Cloud Run. This task may run for more than 30 minutes, and it mostly sends out API requests.
Cloud Run stops executing after about 20 minutes, and from the metrics it looks like it did not recognize that my task is still running. So it probably thinks the container is idle and closes it. I guess I could send requests to the server while the job runs to keep the container alive, but is there a way to signal from the container to Cloud Run that the job is still active, so that it does not close the container?
I can tell it is closing the container because the logs just stop. Then, on the next call I make to the Cloud Run endpoint, I see the "listening" log from the Node.js Express app again.
I want to run a long-running job on cloud run.
This is a red herring.
On Cloud Run, there’s no guarantee that the same container will be used. It’s a best effort.
While you don’t process requests, your CPU will be throttled to nearly 0, so what you’re trying to do right now (running a background task and trying to keep the container alive by sending it requests) is not a great idea. Most likely your app model is not a fit for Cloud Run. I recommend other compute products that would let you run long-running processes as well.
According to the documentation, Cloud Run will time out after 15 minutes, and that limit can't be increased. Therefore, Cloud Run is not a very good solution for long running tasks. If you have work that needs to run for a long amount of time, consider delegating the work to Compute Engine or some other product that doesn't have time limits.
Yes, you can. You can create a timer that calls your own API every 5 minutes, so that the container does not time out after 15 minutes. Whenever the timer fires, it sends a dummy request to your server.
Another option is to increase the container's request timeout from 5 minutes to 1 hour, provided your backend request completes within 1 hour.

Run process as Daemon on AWS Infrastructure

I would like to run a process as a daemon on AWS infrastructure that is responsible for reading from an AWS SQS queue and doing some processing.
My first approach is to use a Docker container deployed on the ECS container service. It would run in a while-true loop, sleeping for some seconds between iterations. This way I can control the sleep time between processing runs, so if my SQS queue is full, I can decrease the sleep time.
I know that it is possible to use AWS Lambda scheduled as a cron job, but then I have no control over the cron interval (decreasing or increasing it in response to the SQS queue size).
The AWS Lambda approach is simpler and needs no infrastructure of my own, but it is less flexible.
Does anyone know another approach?
Looking at the way AWS Lambda handles cron scheduling under the hood (see https://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html and http://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html), you should be able to find the CloudWatch event that is triggering your Lambda and modify it from within the Lambda itself. Documentation for the programmatic APIs for Lambda and CloudWatch Events is fairly scarce, though, so you'll have to figure a lot out for yourself.
All in all, doesn't really sound like an easier approach than just running your own container, although if you have a lot of these sorts of things to do it might not hurt to package it all in a library.
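For the container approach from the question, the adaptive sleep itself is a small calculation. The sketch below is one possible policy, not a recommendation from the answer: the function name, the linear scaling, and the thresholds (1 to 60 seconds, queue considered "busy" at 100 messages) are all illustrative assumptions, and fetching the actual queue depth from SQS is left out.

```python
def next_sleep_seconds(queue_depth, min_sleep=1.0, max_sleep=60.0, busy_depth=100):
    """Pick a sleep interval that shrinks linearly as the queue fills up."""
    if queue_depth <= 0:
        # Nothing to do: wait the longest allowed interval.
        return max_sleep
    # Fraction of the "busy" threshold reached, capped at 1.0.
    fill = min(queue_depth / busy_depth, 1.0)
    return max_sleep - fill * (max_sleep - min_sleep)


# Illustrative queue depths; in the daemon this would come from SQS.
for depth in (0, 10, 50, 100, 1000):
    print(depth, next_sleep_seconds(depth))
```

In the while-true loop, the daemon would process a batch, look up the approximate queue depth, and sleep for the returned number of seconds before polling again.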
