AWS ECS restrict to only one container instance

I want only one instance of my container to run at any given time. For instance, say my ECS container is writing data to an EFS file system. The application I am running is built in such a way that multiple instances cannot write to this data at the same time. Is there a way to make sure that ECS never starts more than one instance? I am worried that while one container is being torn down or stood up, two containers may end up running simultaneously.
I was also thinking about falling back to EC2 so that I could meet this requirement, but I wanted to try this out in ECS first.
I have tried setting the desired number of instances to 1, but I am worried that this will not be enough.

Just set min, desired and maximum number of tasks to 1.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-configure-auto-scaling.html#:~:text=To%20configure%20basic%20Service%20Auto%20Scaling%20parameters
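Task-count limits alone may not cover the deployment case raised in the question: with a rolling update, ECS can start the replacement task before it stops the old one. Below is a minimal sketch, assuming the service uses the rolling update deployment type and is updated through boto3; the cluster and service names are hypothetical. Setting `maximumPercent` to 100 and `minimumHealthyPercent` to 0 allows ECS to stop the running task before starting the new one, at the cost of a short gap with no task running.

```python
def single_task_service_settings(cluster, service):
    """Build keyword arguments for boto3's ECS update_service call."""
    return {
        "cluster": cluster,
        "service": service,
        "desiredCount": 1,
        "deploymentConfiguration": {
            # Never run more than 100% of desiredCount (1 task) at once.
            "maximumPercent": 100,
            # Allow the count to drop to 0 so the old task can stop first.
            "minimumHealthyPercent": 0,
        },
    }


# "my-cluster" and "my-service" are placeholder names.
settings = single_task_service_settings("my-cluster", "my-service")

# To apply (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ecs").update_service(**settings)
```

This only constrains what the ECS scheduler does; if the application must never have two writers, an application-level lock is still worth considering.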

Related

On Demand Container Serving With GCP

I have a Docker image that will execute different logic depending on the environment variables that it is run with. You can imagine that running it with VAR=A will produce slightly different logic compared to running it with VAR=B.
We have an application that is meant to allow users to kick off tasks that are defined within this Docker image. Depending on the user attributes, different environment variables will need to be passed into the Docker container when it is run. The task is meant to run each time an event is generated through user action and then the container should shut down/be removed.
I'm trying to determine if GCP has any container services that best match what I'm looking for. My understanding of some of the services is:
Cloud Functions - can work well for consuming events and taking specific actions each time an event is triggered, but it is not suited for containerized workloads.
Cloud Run - a serverless way of deploying containers. As I understand it, a deployment on Cloud Run spins up a "service", and the environment variables must be passed in as part of the service definition. Because we may have a large number of combinations of environment variables (many of which may need to be running at once), it seems that this would end up creating a large number of services, which feels potentially clunky. This approach seems better for deploying a single service with static environment variables that needs to be auto-scaled by GCP.
GKE - another container orchestration platform. This is what I'm considering at the moment. The idea is that we would define a single job definition that can vary according to environment variables that are passed into it. The problem is that these jobs would need to be kicked off dynamically via code. This is a fairly straightforward process with kubectl, but the Kubernetes REST API seems fairly underdeveloped (or at least not that well documented). And the lack of information online on how to start jobs on-demand through the Kubernetes API makes me question whether this is the best approach.
Are there any tools that I'm missing that would be useful in spinning up containers on-demand with dynamic sets of environment variables and removing them when done?

How to create leases to avoid duplicate cron-jobs when deploying application across multiple instances?

I have a Dockerized Django application which has a number of cron jobs that need to be executed.
Right now I'm running it with the package Supercronic (which is recommended for running cron jobs inside containers). This will be deployed on two servers for redundancy purposes, i.e. if one goes down, the other one needs to take over and execute the cron jobs.
However, the issue is that without any configuration this will result in duplicate cron-jobs being executed, one for each server. I've read that you can set up something called a "lease" for the cron-jobs to retrieve, to avoid duplicates from different servers, but I haven't found any instructions on how to set this up.
Can someone maybe point me in the right direction here?

If you are running Supercronic on two different instances, neither Supercronic process knows whether the other has already triggered the job. It is up to the application to handle the consistency.
You can do this in several ways, for example by tracking the state in a file or in database entries, or any other mechanism that lets your Dockerized application check the status before it starts executing the actual process.
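The database approach can be sketched as a lease table: before running a job, each server tries to claim a row with a conditional UPDATE, and only the server whose update succeeds runs the job. This is a minimal sketch, not a Supercronic feature. It assumes both servers share one database; SQLite is used here only to keep the example self-contained, and the table name `cron_lease` and function names are hypothetical.

```python
import sqlite3


def init_lease_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cron_lease ("
        "job TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def try_acquire_lease(conn, job, owner, now, ttl):
    """Return True if `owner` obtained the lease for `job` until now + ttl."""
    with conn:  # commit on success, roll back on error
        # Make sure a row exists; a new row starts out already expired.
        conn.execute(
            "INSERT OR IGNORE INTO cron_lease (job, owner, expires_at) "
            "VALUES (?, '', 0)",
            (job,),
        )
        # Claim the lease only if the previous one has expired.
        cur = conn.execute(
            "UPDATE cron_lease SET owner = ?, expires_at = ? "
            "WHERE job = ? AND expires_at <= ?",
            (owner, now + ttl, job, now),
        )
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
init_lease_table(conn)
first = try_acquire_lease(conn, "nightly-report", "server-a", now=100.0, ttl=60)
second = try_acquire_lease(conn, "nightly-report", "server-b", now=110.0, ttl=60)
later = try_acquire_lease(conn, "nightly-report", "server-b", now=161.0, ttl=60)
```

In the cron job itself, the task would call `try_acquire_lease` first and exit immediately if it returns False. The lease duration should be shorter than the cron interval so that the next run can claim it, and if one server goes down the other simply acquires the next lease.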

How does Openwhisk decide how many runtime containers to create?

I am working on a project that is using OpenWhisk. I have created a Kubernetes cluster on Google Cloud with 5 nodes and installed OpenWhisk on it. My serverless function is written in Java. It does some processing based on arguments I pass to it. The processing can last up to 30 seconds, and I invoke the function multiple times during these 30 seconds, which means I want a greater number of runtime containers (pods) to be created without having to wait for the previous invocation to finish. Ideally, there should be a container for each invocation until the resources are exhausted.
What actually happens is that when I start invoking the function, the first container is created, and then after a few seconds another one, to serve the first two invocations. From that point on, I continue invoking the function (no more than 5 simultaneous invocations) but no containers are started. Then, after some time, a third container is created and sometimes, but rarely, a fourth one, but only after a long time. What is even weirder is that the containers are all started on a single cluster node or sometimes on two nodes (always the same two nodes). The other nodes are not used. I have set up the cluster carefully. Each node is labeled as an invoker. I have tried experimenting with the memory assigned to each container and the maximum number of containers, and I have increased the maximum number of invocations allowed per minute, but despite all this I haven't been able to increase the number of containers created. Additionally, I have tried different machine types for the cluster (different numbers of cores and amounts of memory), but it was in vain.
Since OpenWhisk is still a relatively young project, I unfortunately don't get enough information from the official documentation. Can someone explain how OpenWhisk decides when to start a new container? What parameters can I change in values.yaml to get a greater number of containers?

The reason why very few containers were created is that the worker nodes did not have the Java runtime Docker image, and it has to be downloaded on each node the first time this runtime is requested. The image weighs a few hundred MB and takes time to download (a couple of seconds in the Google cluster). I don't know why the OpenWhisk controller decided to wait for already created pods to become available instead of downloading the image on other nodes. Anyway, once I had downloaded the image manually on each of the nodes, using the same application with the same request rate, a new pod was created for each request that could not be served by an existing pod.

The OpenWhisk scheduler implements several heuristics to map an invocation to a container. This post by Markus Thömmes https://medium.com/openwhisk/squeezing-the-milliseconds-how-to-make-serverless-platforms-blazing-fast-aea0e9951bd0 explains how container reuse and caching work and may be applicable for what you are seeing.
When you inspect the activation record for the invokes in your experiment, check the annotations on the activation record to determine if the request was "warm" or "cold". Warm means container was reused for a new invoke. Cold means a container was freshly allocated to service the invoke.
See this document https://github.com/apache/openwhisk/blob/master/docs/annotations.md#annotations-specific-to-activations which explains the meaning of waitTime and initTime. When the latter is present among the annotations, the activation was "cold", meaning a fresh container was allocated.
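The cold/warm check described above can be sketched in a few lines. This is a minimal sketch, assuming the activation record has already been fetched as JSON and parsed into a dictionary whose `annotations` field is a list of key/value pairs; the two sample records are made up for illustration and are not real output.

```python
def is_cold_start(activation):
    """Return True if the activation record carries an initTime annotation."""
    annotations = {
        entry["key"]: entry["value"]
        for entry in activation.get("annotations", [])
    }
    # initTime is only recorded when a container had to be initialized.
    return "initTime" in annotations


# Made-up sample records, trimmed to the relevant field.
sample_cold = {
    "annotations": [
        {"key": "waitTime", "value": 35},
        {"key": "initTime", "value": 412},
    ]
}
sample_warm = {"annotations": [{"key": "waitTime", "value": 4}]}

cold_flags = [is_cold_start(a) for a in (sample_cold, sample_warm)]
```

Counting the cold activations over the course of an experiment gives a rough picture of how many containers were actually allocated.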
It's possible your activation rate is not fast enough to trigger new container allocations. That is, the scheduler decided to allocate your request to an invoker where the previous invoke finished and could accept the new request. Without more details about the arrival rate or think time, it is not possible to answer your question more precisely.
Lastly, OpenWhisk is a mature serverless function platform. It has been in production since 2016 as IBM Cloud Functions, and now powers multiple public serverless offerings including Adobe I/O Runtime and Naver Lambda service among others.

How to run multiple times a Docker container with different parameters in Kubernetes?

I have a Docker container that reads a variable which I provide to it at run time. I then thought that I would like to run many of these containers and pass a different value for the variable to each one. So I created a simple text file which contains all the values I want to pass in (there are about 20k different ones), and I am using gnu-parallel to spawn multiple Docker containers in parallel.
My question is how I could do something like that in a Kubernetes environment?

It sounds like what you want to do can be achieved using Kubernetes Jobs.
I would advise against using gnu-parallel on Kubernetes unless you can fit all the jobs on one node. If that is the case, I think it's OK; just set the CPU request in the job template.
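One way to picture the Jobs approach is to generate one Job manifest per value, each passing its value through an environment variable. This is a minimal sketch under stated assumptions: the image name, the `VALUE` variable name, the `worker-N` naming scheme, and the CPU request are all hypothetical placeholders.

```python
import json


def job_manifest(index, value, image="example.com/my-image:latest", cpu="500m"):
    """Build a batch/v1 Job manifest that runs the image once with VALUE set."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"worker-{index}"},
        "spec": {
            "backoffLimit": 2,  # retry a failed pod at most twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "worker",
                            "image": image,
                            "env": [{"name": "VALUE", "value": str(value)}],
                            "resources": {"requests": {"cpu": cpu}},
                        }
                    ],
                }
            },
        },
    }


# In practice the values would be read from the text file, one per line.
values = ["alpha", "beta"]
manifests = [job_manifest(i, v) for i, v in enumerate(values)]
print(json.dumps(manifests[0], indent=2))
```

Since kubectl accepts JSON as well as YAML, each manifest could be written to a file and submitted with `kubectl apply -f`. The scheduler then spreads the pods across nodes according to the CPU request.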

Single cron job across multiple AWS EC2 images

We have a Ruby on Rails application running on EC2 with autoscaling enabled. We have been using whenever to manage cron. New instances are created automatically from an image of the main instance on traffic spikes and dropped when traffic is low. But this also copies the cron jobs to the newly created instances.
We have a specific requirement where we want to limit cron to a single instance.
I found a gem which looks like it handles this specific requirement, but I am skeptical about it because it is for Elastic Beanstalk and is no longer maintained.

As a workaround, you can add a condition to the cron job so that it only executes on a single elected instance in your autoscaling group, e.g. have only the oldest instance run the cron, or only the instance with the "lowest" instance ID, or whatever you like as a condition.
You can achieve this by having your instances call the AWS API.
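The "lowest instance ID" condition can be sketched as follows. This is a minimal sketch that shows only the election step; it assumes the caller has already obtained its own instance ID (for example from the instance metadata service) and the IDs of the in-service instances in the group (for example through the Auto Scaling API). The function name and the sample IDs are hypothetical.

```python
def should_run_cron(my_instance_id, in_service_instance_ids):
    """Return True only on the instance with the lowest ID in the group."""
    if not in_service_instance_ids:
        # Without a membership list, stay on the safe side and skip the run.
        return False
    return my_instance_id == min(in_service_instance_ids)


# Made-up instance IDs for illustration.
group = ["i-0b2222222222", "i-0a1111111111", "i-0c3333333333"]
decisions = {instance: should_run_cron(instance, group) for instance in group}
```

Each cron entry would call this check first and exit if it returns False. Note that when group membership changes around the time a job fires, two instances could briefly disagree about who is elected, so this suits jobs that tolerate an occasional duplicate or missed run.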
As a more proper solution, you could perhaps use a single scheduled Lambda function that accesses your instances. This is now possible as per this page.

The best option is to set scale-in protection. It prevents your instance from being terminated during scale-in events.
You can find more information here on AWS https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling/
