Difference between agents and worker threads - jenkins

I'm working on running NUnit console runners using Jenkins. These tests connect to a Selenium Grid (which is also run by Jenkins), so I want to limit their level of parallelism to keep agents from starving while they wait for a free node on the grid.
So far I haven't managed to figure out exactly what the difference is between an agent and a worker thread in NUnit. I suspect the agent can manage threads, but that's only a guess. Thanks :)

An agent is a separate process running tests for an assembly. A worker is a thread, within a process, running the tests for a particular assembly.
Theoretically, an agent process could have multiple appdomains, each domain could have multiple assemblies and each assembly could have multiple worker threads.
Practically, however, the normal thing to do is to have one process per assembly, so that there is no need for multiple domains, and each process will run some specified number of worker threads to run tests for the assembly. In some contexts, you may prefer to only run processes in parallel and not have any parallelism within the assembly - it's the approach that is most likely to work without any change to your tests, which you may not have designed with parallelism in mind.
Agents do not "manage" threads. They simply run the framework in a process and the framework decides how many threads to use depending on the attributes you have applied.
Using multiple agents is the only way to run NUnit v2 tests in parallel, since the v2 framework has no notion of parallelism.
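To make the sizing concrete for the Selenium Grid case in the question, here is a minimal sketch of the arithmetic. It assumes one process per assembly, the same worker count in every agent, and that each running test holds at most one grid session. The function names and the `grid_slots` parameter are hypothetical and not part of NUnit; the two inputs correspond to the limits you would set with the console runner's `--agents` and `--workers` options.

```python
def max_concurrent_tests(agents: int, workers_per_agent: int) -> int:
    """Upper bound on tests running at once.

    Each agent is its own process, and every process runs its own
    set of worker threads, so the two limits multiply.
    """
    if agents < 1 or workers_per_agent < 1:
        raise ValueError("agents and workers_per_agent must be at least 1")
    return agents * workers_per_agent


def max_agents_for_grid(grid_slots: int, workers_per_agent: int) -> int:
    """Largest agent count whose combined workers fit the grid.

    grid_slots is a hypothetical name for the number of browser
    sessions the Selenium Grid can serve at the same time.
    A result of 0 means even one agent would oversubscribe the grid,
    so the worker count has to come down first.
    """
    if grid_slots < 1 or workers_per_agent < 1:
        raise ValueError("grid_slots and workers_per_agent must be at least 1")
    return grid_slots // workers_per_agent
```

With 3 agents and 4 workers each, up to 12 tests could be asking the grid for a node at once, so a grid with 10 slots would only comfortably serve 2 such agents.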

Related

What's the right strategy for when to create jobs and sub jobs in Sidekiq?

So I have a system that receives messages from devices. Each message then goes through 3 different servers, and countless services are run for each job. From an architecture perspective, what are the considerations in using Sidekiq to make my program async? Are there downsides to running sub-processes with Sidekiq? Any advice?
Architecture (system design) should be based on the problems you are trying to solve. If your services are designed around distinct business domains and are async-compatible, then you can spawn a sub job for each service. If not, or if you need flexible transactions across services, then one job per request is the right choice. You may well have both of these implementations in your system, depending on the requirements.
The upside of making your program async with Sidekiq is that it is easy and gives good reporting when an error occurs. The downside is that creating and executing jobs carries a lot of overhead. This can become such a problem that the overhead accounts for the majority of the resources used.

Massive-Distributed Parallel Execution of tasks

We are currently struggling with the following task. We need to run a Windows application (which only works as a single instance) 1000 times with different input parameters. One run of this application can take up to multiple hours. It feels like the same problem any video rendering farm has (each frame of a video should be calculated independently and in parallel), except that it is not rendering.
So far we have tried to execute it with Jenkins and Pipeline jobs. We used the parallel steps in Pipeline and let Jenkins queue and execute the application. We use Jenkins label expressions to let Jenkins choose which job can run on which node.
The current limitation in Jenkins is with massively parallel jobs (https://issues.jenkins-ci.org/browse/JENKINS-47724). When the queue contains several hundred jobs, adding new jobs takes much longer, and it gets even worse as the queue grows. The main problem: Jenkins starts executing the parallel pipeline part-jobs only after it has finished adding all of them to the queue.
We have already investigated some ideas for solving this problem:
Python Distributed: https://distributed.readthedocs.io/en/latest/
a. For single functions it looks great, but for a complete run like the one we have in Jenkins, deploying and collecting results looks complex
b. Bidirectional client-server communication is needed, so there is no chance of bringing it online through a NAT (VM server)
BOINC: https://boinc.berkeley.edu/
a. As far as we understand, we would have to extend the backend massively to get our jobs working; to configure the jobs in BOINC we would have to write a lot of new automation code
b. We currently need a predeployed application, which can differ between inputs => no equivalent of the Jenkins label expression
Any ideas how to solve it?
Thanks in advance

Maximum number of TFS agents connected to a TFS instance

On Team Foundation Server (TFS 2017), what is the maximum number of build agents that you can have connected to your TFS instance?
There is currently no official documentation stating a limit on the number of build agents in TFS. I also haven't seen any related message, such as a warning that the build agents have reached a maximum.
Across multiple machines, you can configure as many as you require; there is no evident limitation.
On a single machine, it depends on the hardware. If your agent server is virtual, it is already slower than a physical one, and you also need to allocate sufficient RAM to it.
Can I install multiple private agents on the same machine?
Yes. This approach can work well for agents that run jobs that don't
consume a lot of shared resources.
You might find that in other cases you don't gain much efficiency
by running multiple agents on the same machine. For example, it might
not be worthwhile for agents that run builds that consume a lot of
disk and I/O resources.
You might also run into problems if concurrent build processes are
using the same singleton tool deployment, such as NPM packages. For
example, one build might update a dependency while another build is in
the middle of using it, which could cause unreliable results and
errors.
Source Link
It depends on how many cores the agent server has. One agent will take up one core.

How do I guarantee two delayed_job jobs aren't run concurrently while still allowing concurrency for other jobs?

I have a scenario where I have long-running jobs that I need to move to a background process. Delayed job with a single worker would be very simple to implement, but would run very, very slowly as jobs mount up. Much of the work is slow because the thread has to sleep to wait on various remote API calls, so running multiple workers concurrently is a very obvious choice.
Unfortunately, some of these jobs are dependent on each other. I can't run two jobs belonging to the same identifier simultaneously. Order doesn't matter, only that exactly one worker can be working on a given ID's work.
My first thought was named queues, and name the queue for the identifiers, but the identifiers are dynamic data. We could be running ID 1 today, 5 tomorrow, 365849 and 645609 the next, so on and so forth. That's too many named queues. Not only would giving each one a single worker probably exceed available system resources (as well as being incredibly wasteful since most of them won't be active at any given time), but since workers aren't configured from inside the code but rather as environment variables, I'd wind up with some insane config files. And creating a sane pool of N generic workers could wind up with all N workers running on the same queue if that's the only queue with work to do.
So what I need is a way to prevent two jobs sharing a given ID from running at the same time, while allowing any number of jobs not sharing IDs to run concurrently.
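The requirement can be illustrated in isolation with a small sketch. This is Python with in-process threads, not delayed_job, and every name in it (`KeyedLocks`, `run_jobs`, `demo`) is hypothetical: a generic pool of N workers takes any job, but each job first acquires a lock keyed by its ID, so two jobs sharing an ID never overlap while jobs with different IDs run side by side.

```python
import threading
import time
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class KeyedLocks:
    """Hands out one lock per job ID so jobs sharing an ID serialize."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dictionary itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, key):
        with self._guard:
            return self._locks[key]


def run_jobs(jobs, worker_count=4):
    """Run (job_id, callable) pairs on a generic pool of workers.

    Returns the callables' results in submission order.
    """
    locks = KeyedLocks()

    def run(job_id, work):
        # Only one worker at a time may hold the lock for a given ID.
        with locks.lock_for(job_id):
            return work()

    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        futures = [pool.submit(run, job_id, work) for job_id, work in jobs]
        return [future.result() for future in futures]


def demo():
    """Run 12 jobs over 3 IDs and record any same-ID overlap."""
    active = defaultdict(int)
    overlaps = []
    state_lock = threading.Lock()

    def make_job(job_id):
        def work():
            with state_lock:
                active[job_id] += 1
                if active[job_id] > 1:
                    overlaps.append(job_id)
            time.sleep(0.01)  # stands in for a slow remote API call
            with state_lock:
                active[job_id] -= 1
            return job_id
        return work

    jobs = [(n % 3, make_job(n % 3)) for n in range(12)]
    return run_jobs(jobs), overlaps
```

Two caveats about this sketch: a worker that is waiting for a busy ID sits idle instead of picking up other work, and the per-ID locks are never discarded. With separate worker processes the lock would have to live somewhere shared, such as the database, rather than in memory.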

Correct way to implement standalone Grails batch processes?

I want to implement the following:
Web application in Grails going to MongoDB database
Long-running batch processes populating and updating that database in the background
I would like for both of them to reuse the same Grails services and same GORM domain classes (using mongodb plugin for Grails).
For the Web application everything should work fine, including the dynamic GORM finder methods.
But I cannot figure out how to implement the batch processes.
a. If I implement them as Grails service methods, their long-running nature will be a problem. Even wrapping them in some async executors will unnecessarily complicate everything, as I'd like them each to be a separate Java process so they can be monitored and stopped easily and separately.
b. If I implement them as src/groovy scripts and try to launch from command line, I cannot inject the Grails services properly (ApplicationHolder method throws NPE) or get the GORM finder methods to work. The standalone GORM guides all have Hibernate in mind and overall it seems not the right route to pursue.
c. I considered the 'batch-launcher' Grails plugin but it failed to install and seems a bit abandoned.
d. I considered the 'run-script' Grails command to run the scripts from src/groovy and it seems it might actually work in development, but seems not the right thing to do in production.
I cannot be the only person with such a problem - so how is it generally solved?
How do people run standalone scripts sharing the code base and DB with their Grails applications?
Since you want the job processing to be in a separate JVM from your front-end application, the easiest way to do that is to have two instances of Grails running, one for the front-end that serves web requests, and the other to deal with job processing.
Thankfully, the rich ecosystem of plugins for Grails makes this sort of thing quite easy, though perhaps not the most efficient, since running an entire Grails application just for processing is a bit overkill.
The way I tend to go about it is to write my application as one app, with services that take care of the job processing. These services are tied to the RabbitMQ plugin, so the general flow is that the web requests (or Quartz scheduled jobs) put jobs into a work queue, and then the worker services take care of processing them.
The advantage with this is that, since it's one application, I have full access to all of the domain objects, etc., and I can leverage the disconnected nature of a message queue to scale out my front- and back-ends separately without needing more than one application. Instead, I can just install the same application multiple times and configure the number of threads dedicated to processing jobs and/or the queues that the job processors are looking at.
So, with this setup, for development, I will usually just set the number of job processing threads to whatever makes sense for the development work I'm doing, and then just a simple grails run-app, and I have a fully functional system (assuming I have a RabbitMQ server running to use as well).
Then, when I go to deploy into production, I deploy 2 instances of the application, one for the front-end work and the other for the back-end work. I just configure the front-end instance to have 1 or 0 threads for processing jobs, and the back-end instance I give many more threads. This lets me update either portion as needed or spin up more instances if I need to scale one part or the other.
I'm sure there are other ways to do this, but I've found this to be both really easy to develop (since it's all one application), and also really easy to deploy, scale, and maintain.
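The shape of that setup can be sketched without Grails or RabbitMQ. Below is a minimal Python illustration, not the actual plugin API: an in-process `queue.Queue` stands in for the RabbitMQ work queue, and the hypothetical `thread_count` parameter plays the role of the per-instance setting for job-processing threads (0 or 1 on a front-end instance, many on a back-end instance).

```python
import queue
import threading


def start_workers(work_queue, handler, thread_count):
    """Start thread_count consumers; 0 gives a front-end-only instance."""
    results = []
    results_lock = threading.Lock()

    def consume():
        while True:
            job = work_queue.get()
            if job is None:  # sentinel: this worker should shut down
                work_queue.task_done()
                return
            try:
                outcome = handler(job)
                with results_lock:
                    results.append(outcome)
            finally:
                work_queue.task_done()

    threads = [threading.Thread(target=consume, daemon=True)
               for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    return threads, results


def stop_workers(work_queue, threads):
    """Send one sentinel per worker and wait for all of them to exit."""
    for _ in threads:
        work_queue.put(None)
    for thread in threads:
        thread.join()


def demo():
    """Producer side enqueues jobs; three worker threads process them."""
    jobs = queue.Queue()
    threads, results = start_workers(jobs, lambda n: n * n, thread_count=3)
    for n in range(5):  # what a web request or scheduled job would do
        jobs.put(n)
    jobs.join()  # block until every queued job has been handled
    stop_workers(jobs, threads)
    return sorted(results)
```

The producer never needs to know how many consumers exist, which is what lets the same code run as a front-end or a back-end purely by changing the thread count.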

Resources