I am working on a project where it is required to create multiple instance of container dynamically based on the count received from the AWS Lambda function. Each container will execute its own task. I have done a lot of research but still not sure how to achieve this. Also how to delete the container instance when the task execution is completed?
You're describing the use case AWS Batch has been built for. It essentially allows you to submit tasks that are being processed in Docker Containers and manages the lifecycle of those containers for you. Since pre:invent 2020 it also supports Fargate.
An alternative would be using a Step Function that processes the output of the Lambda function and dynamically creates ECS tasks for that. Tasks without a service, so they just terminate when they're done processing. Depending on the amount of jobs you have I'd prefer AWS Batch.
Related
I have a Docker image that will execute different logic depending on the environment variables that it is run with. You can imagine that running it with VAR=A will produce slightly different logic compared to running it with VAR=B.
We have an application that is meant to allow users to kick off tasks that are defined within this Docker image. Depending on the user attributes, different environment variables will need to be passed into the Docker container when it is run. The task is meant to run each time an event is generated through user action and then the container should shut down/be removed.
I'm trying to determine if GCP has any container services that best match what I'm looking for. My understanding of some of the services is:
Cloud Functions - can work well for consuming events and taking specific actions each time an event is triggered, but it is not suited for containerized workloads.
Cloud Run - a serverless way of deploying containers. As I understand it, a deployment on cloud run spins up a "service", and the environment variables must be passed in as part of the service definition. Because we may have a large number of combinations of environment variables (many of which may need to be running at once), it seems that this would end up creating a large number of services, which feels potentially clunky. This approach seems better for deploying a single service with static environment variables that needs to be auto-scaled by GCP.
GKE - another container orchestration platform. This is what I'm considering at the moment. The idea is that we would define a single job definition that can vary according to environment variables that are passed into it. The problem is that these jobs would need to be kicked off dynamically via code. This is a fairly straightforward process with kubectl, but the Kubernetes REST API seems fairly underdeveloped (or at least not that well documented). And the lack of information online on how to start jobs on-demand through the Kubernetes API makes me question whether this is the best approach.
Are there any tools that I'm missing that would be useful in spinning up containers on-demand with dynamic sets of environment variables and removing them when done?
In docker we have used deploy: replicas: 3 for our microservice. We have some Cronjob & the problem is the system in running all cronjob is getting called 3 times which is not what we want. We want to run it only one time. Sample of cron in nest.js :
#Cron(CronExpression.EVERY_5_MINUTES)
async runBiEventProcessor() {
const calculationDate = new Date()
Logger.log(`Bi Event Processor started at ${calculationDate}`)
How can I run this cron only once without changing the replicas to 1?
This is quite a generic problem when cron or background job is part of the application having multiple instances running concurrently.
There are multiple ways to deal with this kind of scenario. Following are some of the workaround if you don't have a concrete solution:
Create a separate service only for the background processing and ensure only one instance is running at a time.
Expose the cron job as an API and trigger the API to start background processing. In this scenario, the load balancer will hand over the request to only one instance. This approach will ensure that only one instance will handle the job. You will still need an external entity to hit the API, which can be in-house or third-party.
Use repeatable jobs feature from Bull Queue or any other tool or library that provides similar features.
Bull will hand over the job to any active processor. That way, it ensures the job is processed only once by only one active processor.
Nest.js has wrapper for the same. Read more about the Bull queue repeatable job here.
Implement a custom locking mechanism
It is not difficult as it sounds. Many other schedulers in other frameworks work on similar principles to handle concurrency.
If you are using RDBMS, make use of transactions and locking. Create cron records in the database. Acquire the lock as soon as the first cron enters and processes. Other concurrent jobs will either fail or timeout as they will not be able to acquire the lock. But you will need to handle a few cases in this approach to make it bug-free and flawless.
If you are using MongoDB or any similar database that supports TTL (Time-to-live) setting and unique index. Insert the document in the database where one of the fields from the document has unique constraints that ensure another job will not be able to insert one more document as it will fail due to database-level unique constraints. Also, ensure TTL(Time-to-live index) on the document; this way document will be deleted after a configured time.
These are workaround if you don't have any other concrete options.
There are quite some options here on how you could solve this, but I would suggest to create a NestJS microservice (or plain nodeJS) to run only the cronjob and store it in a shared db for example to store the result in Redis.
Your microservice that runs the cronjob does not expose anything, it only starts your cronjob:
const app = await NestFactory.create(
WorkerModule,
);
await app.init();
Your WorkerModule imports the scheduler and configures the scheduler there. The result of the cronjob you can write to a shared db like Redis.
Now you can still use 3 replica's but prevent registering cron jobs in all replica's.
I have a docker-compose consisting of four containers, all of which perform a single function:
An nginx proxy that forwards UI and API requests to the corresponding containers (node container, flask container), as depicted in the image below.
There is also a separate container which executes long running python scripts and works independent of the other containers. I'd now like to create the ability to execute scripts in the "long running scripts" (LRS) container via the API:
What is the best way to do this?
I've seen a few other questions that are somewhat similar to this, but raise more questions than they answer. Amongst the suggestions I've seen are:
Pass docker.sock into the API container; from the API container, exec into LRS and execute the intended script
Doesn't this create serious security vulnerabilities?
Doesn't this require that docker be installed on the API container in order to exec, violating the separation of concerns principle of docker?
HTTP listener on the LRS container, listening for commands from the API in order to execute the script on LRS
Again, doesn't this violate separation of concerns, since I'll now essentially need a light weight API in the LRS container to listen to actions from the principal API?
None of this solutions seem ideal. Am I missing something? How do I achieve the intended functionality?
Generally the solution to run long-running scripts has been a pub-sub model. Your API would drop a message onto an execution Message-Queue. The worker instance would subscribe to that queue, and when messages appear, would execute your long-running script/query/etc. When the execution is complete, either a message will go back on a different queue, or results will be placed in a predetermined location (url).
This has a couple of advantages:
The two solutions are effectively isolated from each other
You can scale out the LRS (worker) solution if you need more capacity by adding additional workers
if the LRS instance goes down the API will not depend on it being up. Work will be queued for when an instance becomes available.
I would like to run a process as daemon on AWS Infrastructure that is responsible to read the AWS SQS Queue and make some process.
My first approach is to use a docker container deployed on ECS Container service. So I will be on while true loop, sleeping for some seconds. Using this, I can control the sleep time between processing, so If my SQS queue is full, I could decrease the sleep time. So
I know that is possible to use AWS Lambda scheduled as a cron job, but I have no control over the cron time (decrease or increase in response of sqs size).
The AWS Lambda approach is simpler and there is no need of "any" infrastructure, but it less flexible.
Does anyone know another approach?
Looking at the way AWS Lambda handles cron scheduling under the hood (see https://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html and http://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html) you should be able to find the Cloudwatch event that is triggering your lambda and modify it from within the lambda itself. Documentation for the programmatic APIs for Lambda and Cloudwatch Events is fairly scarce, though, so you'll have to figure a lot out for yourself.
All in all, doesn't really sound like an easier approach than just running your own container, although if you have a lot of these sorts of things to do it might not hurt to package it all in a library.
We have Ruby on Rails application running on EC2 and enabled autoscaling feature. We have been using whenever to manage cron. So new instances with an image of the main instance created automatically on spikes and dropped when low traffic. But this also copies cron jobs as well to newly created instances.
We have a specific requirement where we want to limit cron to a single instance.
I found a gem which looks like handing this specific requirement but I am skeptical about it, reason being it is for elastic beanstalk and no longer maintained.
as a workaround, you can have a condition within the cron specifying that the cron job should execute based on a condition that would elect a single instance among your autoscaling group. e.g have only the oldest instance running the cron, or only the instance having the "lowest" instance ID, or whatever you like as a condition.
you can achieve such a thing by having your instances calling the AWS API.
As a more proper solution, you maybe could use a single cronified lambda accessing your instances? this is now possible as per this page
Best is to set scale in protection. It prevents your instance being terminated during scaling events.
You can find more information here on AWS https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling/