Best way to run Docker containers in parallel - docker

I have a docker image that receives an argument and runs for roughly a minute.
From time to time I need to run it over a set of 10K-100K inputs.
I tried doing this using AWS Batch, but it ran very slowly, since at any given moment only a few containers were running.
Is there an easy alternative that allows configuring the number of containers to run simultaneously, and thus controlling the overall run time?

As of December 2020, you can now run your docker containers on AWS Lambda:
https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/
The maximum time a Lambda function can run is 15 minutes (up from the original 5-minute limit).
Since you indicate your process only runs for roughly 1 minute, Lambda could be an option for you.
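For example, once the image has been pushed to ECR and registered as a Lambda function, the 10K-100K inputs can be fanned out from a small driver script with bounded client-side concurrency. A minimal sketch using boto3; the function name and payload shape are assumptions, not from the question:

```python
# Sketch: fan many inputs out to a container-image Lambda function with
# bounded client-side concurrency. Function name and payload shape are
# hypothetical; adjust to your own setup.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3
from botocore.config import Config

# Raise the read timeout so ~1 minute synchronous invocations don't time out,
# and give botocore enough connections for the thread pool.
lambda_client = boto3.client(
    "lambda",
    config=Config(read_timeout=120, max_pool_connections=100),
)

def invoke(value: str) -> bytes:
    """Synchronously invoke the function for one input and return its payload."""
    response = lambda_client.invoke(
        FunctionName="process-input",                      # hypothetical name
        Payload=json.dumps({"input": value}).encode(),
    )
    return response["Payload"].read()

inputs = [f"item-{i}" for i in range(10_000)]              # your real inputs

# At most 100 invocations in flight at once; also mind the account-level
# Lambda concurrency limit (which can be raised or reserved per function).
with ThreadPoolExecutor(max_workers=100) as pool:
    results = list(pool.map(invoke, inputs))
```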

Related

Why does skaffold dev run much faster after computer restart?

I am developing an application using microservices for the first time and have noticed that the time it takes the skaffold dev command to start up all my containers increases the more times I run the command, or the longer my computer goes without restarting. Sometimes it can take up to half an hour to run the command, but if I then restart my computer and run it again, it completes in less than 5 minutes. I assume this might have something to do with containers not being terminated and consuming resources, but I would greatly appreciate it if anyone knows how I could keep optimal performance without having to constantly reboot.
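If leftover containers and images are indeed the cause, they can be reclaimed between sessions instead of rebooting. A minimal sketch using the Docker SDK for Python (the docker package), roughly equivalent to docker system prune:

```python
# Sketch: reclaim leftover Docker resources without rebooting, roughly what
# `docker system prune` does. Requires the "docker" package (Docker SDK for Python).
import docker

client = docker.from_env()

# Remove stopped containers, dangling images, and unused networks/volumes;
# each call returns a summary dict of what was removed.
print(client.containers.prune())
print(client.images.prune(filters={"dangling": True}))
print(client.networks.prune())
print(client.volumes.prune())
```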

improving Container loading time using AWS ECS/EKS

We are running heavy ephemeral containers: it takes about 20 seconds to initialize a container and less than 10 seconds to complete the task. I'm wondering if there is an option to manage the number of standby containers that are ready for job processing, to save the initialization time.
We are using the CMD instruction to specify which component is to be run by the image, with arguments in the following form: CMD ["executable", "param1", "param2", …].
using #aws-batch
I want to be able to reduce the container loading time.
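One pattern (an assumption about your setup, not an AWS Batch feature) is to keep a small pool of long-running worker containers that pull jobs from a queue, so the 20-second initialization is paid once per worker rather than once per job. A minimal sketch assuming an SQS queue; the queue URL and handler are illustrative:

```python
# Sketch: a long-running worker container that pulls jobs from a queue, so the
# ~20 s initialization is paid once per worker instead of once per job.
# The queue URL and handle() body are illustrative.
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/jobs"  # hypothetical

def handle(body: str) -> None:
    """Do the work that CMD ["executable", "param1", ...] currently does."""
    ...

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,   # long polling keeps the loop cheap while idle
    )
    for msg in resp.get("Messages", []):
        handle(msg["Body"])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```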

Run-one alternative for alpine linux

I'm trying to create a Docker image that runs a script every minute using cron. Most of the time it'll finish immediately, but sometimes it'll take 10 minutes. I need to make sure multiple instances of the script aren't run at the same time. On Ubuntu I've used the run-one package, but it seems to be missing from Alpine. How can this be fixed?
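The usual shell-level substitute is the flock utility from util-linux, which is installable on Alpine via apk. If the script is, or can be wrapped in, Python, the same run-one behaviour can be sketched with an exclusive non-blocking file lock; the lock path below is illustrative:

```python
# Sketch: run-one equivalent with an exclusive, non-blocking file lock.
# If a previous run still holds the lock, this invocation exits immediately.
import fcntl
import sys

lock_file = open("/tmp/myscript.lock", "w")      # hypothetical lock path

try:
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit(0)                                  # another instance is running

# ... actual work goes here; the lock is released when the process exits
```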

Is it possible to limit the maximum number of containers per docker image?

Problem:
I have a couple of Docker images on a hosting server. I start multiple containers from a bunch of Jenkins jobs. Due to the limited capacity of the host, I'd like to limit the maximum number of containers per image. Setting a limit on the number of Jenkins executors doesn't really solve the problem, since some jobs can spin up 16 containers. It is possible to split them into several threads of parallel execution, but this is still not ideal; I'd like to have one solution for all jobs.
Question #1 (main):
Is it possible to set the maximum limit of containers Docker runs on a single machine to 10, and queue the rest of them?
Question #2:
If there is no such functionality, or there are better options in this case, what is the workaround?
One way is to use Kubernetes, as mentioned above, but this is a very time-consuming route.
A simpler way is to set up a master job that spins up your containers. Your pipeline will be calling this job, e.g. 16 times, spinning up 16 containers. Then set the maximum number of executors on your Jenkins host to, for example, 6. When you kick off your job, it will be 1 executor plus 16 in the queue, 17 in total. It will start the first 6 and then wait until they have stopped; once any of the running containers is done, it will allow the next container to run.
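Docker itself has no built-in per-image container cap, but the same "run at most N, queue the rest" behaviour can be implemented in whatever script launches the containers. A minimal sketch using the Docker SDK for Python; the image name and arguments are illustrative:

```python
# Sketch: allow at most 10 containers to run at once and queue the rest,
# from the script that launches them. Image and arguments are illustrative;
# requires the "docker" package (Docker SDK for Python).
from concurrent.futures import ThreadPoolExecutor

import docker

client = docker.from_env()
MAX_CONTAINERS = 10

def run_job(arg: str) -> bytes:
    # With detach=False (the default) this blocks until the container exits
    # and returns its logs; remove=True cleans the container up afterwards.
    return client.containers.run(
        "my-image:latest",                 # hypothetical image
        command=["executable", arg],
        remove=True,
    )

jobs = [f"param-{i}" for i in range(16)]

# The pool size is the cap: at most 10 containers run simultaneously,
# the remaining jobs wait in the queue for a free slot.
with ThreadPoolExecutor(max_workers=MAX_CONTAINERS) as pool:
    outputs = list(pool.map(run_job, jobs))
```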
My workaround is to clean unused containers and images once in a while with a job.
Here it is:
https://gist.github.com/fredericrous/26e51ed936d710364fe1d1ab6572766e

AWS Fargate startup time

Currently I'm researching on how our dockerised microservices could be orchestrated on AWS.
The Fargate option of ECS looks promising eliminating the need of managing EC2 instances.
However, it takes a surprisingly long time to start a "task" in Fargate, even for a simple one-container setup. 60 to 90 seconds is typical for our Docker app images, and I've heard it may take even longer, minutes or so.
So the question is: while Docker containers typically may start in say seconds what is exactly a reason for such an overhead in Fargate case?
P.S. Searching related questions turns up explanations such as:
Docker image load/extract time
Load Balancer influence - registration, health check grace period, etc.
But even in the simplest possible configuration, with no Load Balancer deployed and assuming the Docker image is not cached in ECS, starting a task with a single Docker image in Fargate (~60 sec) is still at least about twice as slow as launching the same Docker image on a bare EC2 instance (~25 sec).
Yes, it takes a little longer, but we can't generalize the startup time for Fargate. You can reduce this time by tweaking some settings.
vCPU directly impacts the startup time, so keep in mind that on a bare EC2 instance you have the complete vCPU at your disposal, while with Fargate you may be assigning only a portion of it.
Since AWS manages the servers for you, it has to do a few things under the hood: placing the VM into your VPC, downloading and extracting the Docker images, assigning IPs, and running the container can take this much time.
The following is a nice blog post, and at the end of the article you can find good practices:
Analyzing AWS Fargate
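For reference, the vCPU and memory settings live in the Fargate task definition; a minimal boto3 sketch with illustrative names and ARNs (cpu/memory must be one of the combinations Fargate supports):

```python
# Sketch: where Fargate's vCPU/memory settings live. Giving the task more cpu
# speeds up image pull/extract and startup. Names and ARNs are illustrative.
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="my-app",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",        # 1 vCPU
    memory="2048",     # 2 GB
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "app",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "essential": True,
        }
    ],
)
```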
