By default, slave resources such as CPU and RAM are handled in a very basic way in Jenkins: by setting the number of executors, which is the number of allowed concurrent builds on this particular slave.
However, different jobs have different needs in terms of CPU and RAM, so this model quickly becomes suboptimal if you have jobs with different constraints: high RAM use, multi-threading, short and long jobs, etc.
Is there a way, for example, to say that a slave has 16GB of RAM, and then declare RAM consumption for each job?
After digging a bit more, it looks like the Heavy Job plugin [1] allows crude resource management, which is enough for my needs. I'd still like to know if people have better solutions to propose, though.
[1] https://wiki.jenkins.io/display/JENKINS/Heavy+Job+Plugin
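One way to picture the crude resource management described above is to let each executor stand for one unit of a resource and give each job a weight equal to the units it needs. The sketch below is only a model of that idea in Python, not the plugin's code: the `Node` class, the job names, and the "1 executor = 1 GB of RAM" convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A build node whose executors stand in for resource units.

    Assumption for this sketch: 1 executor = 1 GB of RAM.
    """
    name: str
    executors: int
    running: dict = field(default_factory=dict)  # job name -> weight in use

    def free(self) -> int:
        # Executors not currently claimed by a running job.
        return self.executors - sum(self.running.values())

    def try_start(self, job: str, weight: int) -> bool:
        # A job only starts if enough executors (resource units) are free.
        if weight > self.free():
            return False
        self.running[job] = weight
        return True

    def finish(self, job: str) -> None:
        # Release the executors held by the job, if it was running.
        self.running.pop(job, None)


node = Node("agent-1", executors=16)   # models a slave with 16 GB of RAM
started = [
    node.try_start("unit-tests", 2),   # needs about 2 GB
    node.try_start("integration", 8),  # needs about 8 GB
    node.try_start("big-link", 8),     # does not fit: only 6 units left
]
print(started, node.free())
```

The model is deliberately coarse: it tracks a single resource, so a job that is heavy on CPU but light on RAM cannot be expressed without picking one of the two as the unit.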
Related
I am new to Jenkins and have configured master and slave nodes as shown below, but I need help choosing the number of executors for each of the slave nodes.
Currently, I have configured 100 executors on each slave node.
How many executors can I configure on each slave node, and which factors (memory, RAM, etc.) do I need to consider when increasing the number of executors?
What is the maximum number of executors I can configure on each server?
Well, it depends entirely on your usage. Several factors matter: how much CPU and memory is available, how the builds execute, what kind of builds they are, how frequently they should run, etc. But I can clearly say that 100 is too large a number. I would suggest starting with 20 concurrent builds (if builds run frequently and the node has a fair amount of CPU and memory), observing whether that number causes any issues, and then increasing it accordingly.
Here is a very nice article on the topic: https://www.avantica.com/blog/jenkins-balance-load-master-slave-setup#:~:text=Jobs%20are%20built%20using%20executors,to%20build%20two%20different%20tasks.
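The factors listed above can be turned into a rough starting estimate. The following is a minimal sketch, assuming you can guess how many cores and how much RAM a typical build uses; the function name, the parameters, and the 1 GB reserved for the OS and the agent process are hypothetical choices, not Jenkins settings.

```python
def max_executors(cpu_cores, ram_gb, cores_per_build, ram_gb_per_build,
                  reserved_ram_gb=1.0):
    """Rough upper bound on concurrent builds for one node.

    reserved_ram_gb is an assumed allowance for the OS and the agent itself.
    """
    by_cpu = int(cpu_cores // cores_per_build)
    by_ram = int((ram_gb - reserved_ram_gb) // ram_gb_per_build)
    # The scarcer resource decides; never return a negative count.
    return max(0, min(by_cpu, by_ram))


# Hypothetical node: 8 cores, 16 GB RAM; builds use ~1 core and ~2 GB each.
estimate = max_executors(cpu_cores=8, ram_gb=16,
                         cores_per_build=1, ram_gb_per_build=2)
print(estimate)
```

Treat the result as a starting point to observe and adjust, as suggested above, since real builds rarely use a constant amount of CPU or memory.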
I have set up a Jenkins master (linux) and slave (Windows) on Amazon EC2. My build is running on the slave and it is considerably slower than what I was seeing on a similar desktop machine (c4.large).
Checking into the load on the slave, neither the CPU cores, nor the memory is fully used (CPU at 60%, 30% each, memory stable at about 2.5GB)
However, I'm starting to think the bottleneck is in the network traffic. I can see that there seems to be a lot of network traffic going on between master and slave. It averages at about 500Kbs going out and 250Kbs coming in, but there are regular spikes going into the multiple Mbs.
I've searched around, but I can't really figure out what data Jenkins is sending.
The job is doing quite a bit of logging, but this is not the only source. There's a console log of about 15MB for a 2 hour build, which would translate to roughly (15*1024*8)/(2*60*60) = 17Kbs
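The estimate above can be checked with a few lines of Python. This sketch assumes, as the calculation does, that 1 MB = 1024 KB and that "Kbs" means kilobits per second; the function name is made up for illustration.

```python
def average_kbps(size_mb, hours):
    """Average bitrate in kilobits per second for size_mb sent over hours."""
    kilobits = size_mb * 1024 * 8   # MB -> KB -> kilobits
    seconds = hours * 60 * 60
    return kilobits / seconds


log_rate = average_kbps(15, 2)   # 15 MB console log over a 2 hour build
observed_out = 500               # outgoing average reported above, in Kbs
print(round(log_rate, 1))                  # about 17.1
print(round(observed_out / log_rate, 1))   # observed traffic vs. raw log size
```

On these numbers the observed outgoing traffic is roughly 29 times the raw size of the console log.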
So my questions:
What else is Jenkins communicating between master and slaves?
How can I reduce this?
Update: Just to be sure, I disabled all logging during the build. And it turns out that was the source of the traffic. I don't understand why there's so much traffic given my calculations above. Maybe there's some kind of polling/acknowledging going on that's causing a lot of overhead, I still don't know.
It's apparently also not the main source of the slowdown, but still, I would like to find a way to reduce this network traffic as this is going to impact the AWS bill.
We're building a web app where users will be uploading potentially large files that need to be processed in the background. The task involves calling 3rd-party APIs, so each job can take several hours to complete. We're using DelayedJob to run the background jobs. With every user kicking off a background job, each of which takes a few hours to finish, that will add up to a lot of background jobs very quickly. I am wondering what the best way to set up the deployment for this would be. We're currently hosted on DigitalOcean. I've kicked off 10 DelayedJob workers. Each one takes up 157 MB when idle and around 900 MB when actively running. Our user base is pretty small right now, so it's not an issue yet, but it will be one soon. On a 4 GB droplet, I can probably run 2 or 3 workers at a time. How should we approach this issue? Should we be looking at using DigitalOcean's API to auto-spin cheap droplets on demand? Should we subscribe to high-memory droplets on a monthly basis instead? If we go with auto-spinning droplets, should we stick with DigitalOcean, or would Heroku make more sense? Or is the entire approach wrong, and should we be approaching it from an entirely different direction? Any help/advice would be very much appreciated.
Thanks!
It sounds like you are limited by memory on the number of workers that you can run on your DigitalOcean host.
If you are worried about scaling, I would focus on making the workers as efficient as possible. Have you done any benchmarking to understand where the 900MB of memory is being allocated? I'm not sure what the nature of these jobs is, but you mentioned large files. Are you reading the contents of these files into memory, or are you streaming them? Are you using a database with SQL you can tune? Are you making many small API calls when you could be using a batch endpoint? Are you assigning intermediary variables that must then be garbage collected? Can you compress the files before you send them?
Look at the job structure itself. I've found that background jobs work best with many smaller jobs rather than one larger job. This allows execution to happen in parallel, and be more load balanced across all workers. You could even have a job that generates other jobs. If you need a job to orchestrate callbacks when a group of jobs finishes there is a DelayedJobGroup plugin at https://github.com/salsify/delayed_job_groups_plugin that allows you to invoke a final job only after the sibling jobs complete. I would aim for an execution time of a single job to be under 30 seconds. This is arbitrary but it illustrates what I mean by smaller jobs.
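To illustrate the shape of "many smaller jobs plus one final job", here is a minimal Python sketch. It is not DelayedJob or the plugin's API: it uses a thread pool from the standard library as a stand-in for the worker pool, and `split_into_chunks`, `process_chunk`, and `run_job_group` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """The 'job that generates other jobs': cut the work into small pieces."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    # Stand-in for the real work, e.g. one third-party API call per chunk.
    return sum(chunk)


def run_job_group(items, chunk_size, workers=4):
    chunks = split_into_chunks(items, chunk_size)
    # Small jobs are spread across the workers and run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    # The 'final job' runs only after every sibling job has completed.
    return sum(partials)


print(split_into_chunks(list(range(10)), 4))
print(run_job_group(list(range(100)), chunk_size=10))
```

With a real queue, each chunk would be enqueued as its own job, so a failed chunk can be retried without redoing hours of work.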
Some hosting providers like Amazon provide spot instances where you can pay a lower price on servers that do not have guaranteed availability. These pair well with the many smaller jobs approach I mentioned earlier.
Also, Ruby might not be the right tool for the job. There are faster languages, and if you are limited by memory or CPU, you might consider writing these jobs and their workers in another language like JavaScript, Go or Rust. These can pair well with a Ruby stack while offloading computationally expensive subroutines to faster languages.
Finally, like many scaling issues, if you have more money than time, you can always throw more hardware at it. At least for a while.
I think memory and time are the main problems for you. Consider using the Sidekiq gem for this process, because it takes less time and memory to do the same job, as it uses Redis, a key-value store, as its database. If the problem continues, go with JavaScript.
Is there any difference between creating two slaves and creating one slave with two executors on the same Windows server?
Yes, there is a difference: It's about memory consumption and effort of maintenance/administration.
Starting a slave on a system starts a (main) process. This process costs (private) main memory to run and connects to the master.
Each executor is a sub-process of the main process.
It is therefore apparent that running two executors on one slave costs less memory in total compared to running two slaves (with one executor each), as there would be the memory consumption of the main process twice:
2 * Main Processes + 2 * Executors > 1 * Main Process + 2 * Executors
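The inequality can be made concrete with made-up numbers. In this sketch the 200 MB per main process and 500 MB per executor are purely hypothetical figures chosen for illustration, and the function name is not part of Jenkins.

```python
def total_memory_mb(slaves, executors_per_slave, main_process_mb, executor_mb):
    """Total memory for a setup: each slave pays for one main process
    plus the memory used by each of its executors."""
    return slaves * (main_process_mb + executors_per_slave * executor_mb)


# Hypothetical figures: 200 MB per main process, 500 MB per executor.
two_slaves = total_memory_mb(slaves=2, executors_per_slave=1,
                             main_process_mb=200, executor_mb=500)
one_slave = total_memory_mb(slaves=1, executors_per_slave=2,
                            main_process_mb=200, executor_mb=500)
print(two_slaves, one_slave, two_slaves - one_slave)
```

Whatever the actual figures are, the difference is always the cost of the one extra main process.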
Moreover, administering a slave takes more effort than administering an executor: whilst an executor has virtually nothing to worry about, there are numerous things to configure for a slave. Additionally, the capabilities of the two slaves are the same anyhow (they are running on the same OS, as you said), so there is little added value in assigning them different labels.
In short, unless other boundary conditions force me to do it differently, I would always prefer running two executors on one slave, as this is easier to administer and saves some memory.
A slave is a "machine". An executor is an "OS Process" in the slave.
So ideally we always add executors - they do the work and can run in parallel, and the simple theoretical answer to your question is "2 executors on one slave".
In practice we need to add slaves in several use cases:
We need more resources (more cpu, more memory, more "machines")
We need a different setting (Different OSes, Different hardware)
We have global resources that would create a conflict for executors on same machine (shared browser for a UI testing process)
Make the decision based on your use case.
One benefit of running 1 executor on a given node that immediately comes to mind is preventing conflicts between processes that run at the same time.
On the other hand, you could prevent job conflicts using existing Jenkins plugins, e.g. Heavy Job or Build Blocker.
How many Remote Nodes can Jenkins manage ? Are there any limitations/memory issues?
What is more effective:
1) 100 Nodes 1 executor per node ?
2) 5 Nodes with 20 executors per node ?
Tx.
As far as I know, there is no limit on the number of nodes one can have, although your system might feel like saying enough is enough! You can run into limits such as the number of processes per user or the total number of open files. (We hit the former recently, not with Jenkins but with another application: RAM and disk space were fine, but the system stopped responding and we started getting "system cannot fork()" errors.) A few such limits might be configurable, but changing them may not be allowed or feasible.
If resource (in your case, nodes) is not a constraint, which process wouldn't like to run wild? :) In practical cases, generally you wouldn't have the flexibility to opt for first option. In second case where you have 5 nodes with 20 executors, all you have to make sure is not to tie up jobs to a particular node unless you have a compelling reason.
Some slaves are faster, while others are slow. Some slaves are closer (network wise) to a master, others are far away. So doing a good build distribution is a challenge. Currently, Jenkins employs the following strategy:
If a project is configured to stick to one computer, that's always honored.
Jenkins tries to build a project on the same computer that it was previously built.
Jenkins tries to move long builds to slaves, because the amount of network interaction between a master and a slave tends to be logarithmic to the duration of a build (IOW, even if project A takes twice as long to build as project B, it won't require double network transfer.) So this strategy reduces the network overhead.
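The three rules above can be sketched as a small selection function. This is only an illustration of the described strategy in Python, not Jenkins' actual scheduler: the function name, the parameters, the node names, and the 10-minute threshold for a "long" build are all assumptions.

```python
def choose_node(pinned, last_node, expected_minutes, idle_nodes,
                long_build_minutes=10.0, master="master"):
    """Pick a node for a build following the three rules described above."""
    # Rule 1: a project tied to one computer always goes there.
    if pinned is not None:
        return pinned
    # Rule 2: prefer the computer the project was previously built on.
    if last_node in idle_nodes:
        return last_node
    # Rule 3: push long builds to slaves rather than the master.
    slaves = [n for n in idle_nodes if n != master]
    if expected_minutes >= long_build_minutes and slaves:
        return slaves[0]
    # Otherwise take the first idle node, or None if nothing is idle.
    return idle_nodes[0] if idle_nodes else None


idle = ["master", "linux-2"]
print(choose_node("win-1", "linux-1", 5, idle))   # pinned project
print(choose_node(None, "linux-2", 5, idle))      # previous node is idle
print(choose_node(None, "linux-1", 30, idle))     # long build, goes to a slave
print(choose_node(None, "linux-1", 1, idle))      # short build, first idle node
```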
You should also have a look at these links:
https://wiki.jenkins-ci.org/display/JENKINS/Least+Load+Plugin
https://wiki.jenkins-ci.org/display/JENKINS/Gearman+Plugin