We are using a third-party service to create and use vouchers. There are 80k+ vouchers already made. One of our cronjobs checks the status (used/unused) of each voucher one by one, synchronously, and updates it in our server database. It takes 2 hours to complete one pass, then it continues from the first voucher for the next pass.
Constraints:
the third-party service allows at most 6 queries per second (QPS).
We have only a primary Jenkins server and no agent nodes.
With one Jenkins server, can we improve the execution time?
Can we set up multiple jobs executing in parallel on the primary Jenkins server for the same cronjob? For example, the first 50k records would be processed by one job and the rest by another.
If you have room to vertically scale your VM in case you hit a resource (CPU, memory) bottleneck, you should be able to achieve the performance you need. In my view, the best option is using parallel stages in your Pipeline. If you know the batch sizes beforehand, you can hardcode them within each stage. If you instead want some logic that determines how many records you have and allocates records based on that, you can create a Pipeline with dynamic stages, something like below.
pipeline {
    agent any
    stages {
        stage('Parallel') {
            steps {
                script {
                    parallel parallelJobs()
                }
            }
        }
    }
}

def getBatches() {
    // Have some logic to allocate batches; ranges should not overlap
    return ["1-20000", "20001-30000", "30001-50000"]
}

def parallelJobs() {
    def jobs = [:]
    for (batch in getBatches()) {
        def b = batch // local copy so each closure captures its own value
        jobs[b] = {
            stage(b) {
                echo "Processing Batch $b"
            }
        }
    }
    return jobs
}
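Each stage body would then call something like the worker sketched below for its allocated range. Keep in mind that the 6 QPS limit applies to all branches combined, so every branch must throttle itself. This is only a rough sketch; checkVoucherStatus and updateVoucher are hypothetical placeholders for your third-party call and database update, not real APIs:

def processBatch(String batch, int requestsPerSecond) {
    def parts = batch.split('-')
    int first = parts[0] as int
    int last = parts[1] as int
    // pause between calls so this branch stays within its share of the quota
    int delayMs = 1000.intdiv(requestsPerSecond)
    for (int id = first; id <= last; id++) {
        def status = checkVoucherStatus(id) // hypothetical third-party call
        updateVoucher(id, status)           // hypothetical DB update
        sleep(time: delayMs, unit: 'MILLISECONDS')
    }
}

With three parallel branches, passing requestsPerSecond = 2 keeps the combined rate within the 6 QPS cap.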
I have a script that acts as a "test driver" (TD). That is, it drives test operations on a "system under test" (SUT). When I run my test framework script (tfs.sh) on my TD, it takes a SUT as an argument. The manual workflow looks like this:
TD ~ $ ./tfs.sh --sut=<IP of SUT>
I want to have a cluster of SUTs (they will have different OSes, and each will be repeated a few times) and a few TDs (4 or 5, so that driving tests won't be the bottleneck; actually executing them will be).
I don't know the Jenkins primitive with which to accomplish this. I would like it if a Jenkins stage could simply be invoked with 2 agents. One would obviously be the TD; that's what would actually run the script. The other would be the SUT. Jenkins would then manage locking and resource contention.
As a workaround, I could simply have all my SUTs entirely unmanaged by Jenkins, and manually implement locking of the SUTs so 2 different TDs don't try to grab the same one. But why re-invent the wheel? And besides, I'd rather work on a Jenkins plugin to accomplish this than on a manual solution.
How can I run a single Jenkins stage on 2 (or more) agents?
If I understand your requirement correctly, you have a static list of SUTs and you want Jenkins to start the TDs by allocating SUTs for each TD. I'm assuming TDs and SUTs have a one-to-one relationship. Following is a very simple example of how you can achieve what you need.
pipeline {
    agent any
    stages {
        stage('parallel-run') {
            steps {
                script {
                    try {
                        def tests = getTestExecutionMap()
                        parallel tests
                    } catch (e) {
                        currentBuild.result = "FAILURE"
                    }
                }
            }
        }
    }
}

def getTestExecutionMap() {
    def tests = [:]
    def sutList = ["IP1", "IP2", "IP3"]
    int count = 0
    for (String ip : sutList) {
        def sut = ip // local copy so each closure captures its own value
        tests["TEST${count}"] = {
            node {
                stage("TD with SUT ${sut}") {
                    script {
                        sh "./tfs.sh --sut=${sut}"
                    }
                }
            }
        }
        count++
    }
    return tests
}
The above pipeline will start one parallel branch per SUT, each running the TD script against its allocated SUT.
Further, if you want to select the agent on which the TD runs, you can specify the name of the agent in the node block: node(NAME) {...}. You can improve the agent selection criteria accordingly. For example, you can check how many Jenkins executors are idle on a given agent and then decide how many TDs you will start there.
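For example, a rough sketch of that idea (an assumption on my part, not something the answer above spelled out) using the Jenkins model API; note that calls like this typically require script approval on a sandboxed controller:

import jenkins.model.Jenkins

@NonCPS
def idleExecutors(String agentName) {
    // Computer.countIdle() returns the number of executors that are not busy
    def computer = Jenkins.get().getComputer(agentName)
    return computer ? computer.countIdle() : 0
}

You could then target node(agentName) only when idleExecutors(agentName) > 0, or use the counts to decide how many TD branches to put into the map.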
I'm creating a Build Monitor view with a DSL script, but there is no detail on how to set the number of columns.
I'm using the https://jenkinsci.github.io/job-dsl-plugin/#path/buildMonitorView documentation for some insight. I think the configure block may allow it, but I still have the same question of how to do it.
I assumed it might work like a list view, where you add a column to it, but this does not work.
My current code so far:
buildMonitorView('Automation Wall') {
    description('All QA Test Suites')
    recurse(true)
    configure()
    columns(1)
    jobs {
        regex(".*.Tests.*")
    }
}
The configure block can be used to set the columns element directly in the generated view XML:

buildMonitorView('Automation Wall') {
    description('All QA Test Suites')
    recurse(true)
    configure { project ->
        (project / columns).value = 1
    }
    jobs {
        regex(".*.Tests.*")
    }
}
After running tests in parallel, I need to immediately send out notifications. Currently, the parallel nodes are run, then each node is given up, and the send-notifications step sometimes waits for the next available node.
// List of tasks, one for each marker/label type.
def farmTasks = ['ac', 'dc']

// Create a number of agent tasks that matches the marker/label type.
// Deploys the image to the board, then checks out the code on the agent and
// runs the tests against the board.
timestamps {
    stage('Test') {
        def test_tasks = [:]
        for (int i = 0; i < farmTasks.size(); i++) {
            String farmTask = farmTasks[i]
            test_tasks["${farmTask}"] = {
                node("linux && ${farmTask}") {
                    stage("${farmTask}: checkout on ${NODE_NAME}") {
                        // Checkout without clean
                        doCheckout(false)
                    }
                    stage("${farmTask} tests") {
                        <code>
                    }
                } // end of node
            } // end of test_tasks
        } // end of for
        parallel test_tasks
        node('linux') {
            sendMyNotifications();
        }
    } // end of Test stage
} // end of timestamps
Frankly, this code seems totally fine. I have yet to understand why the notification step needs to wait for a node (do you have more pipelines that use these agents? are there multiple instances of this pipeline running concurrently?), but the workaround for this issue is simple:
Set up another agent (it can reside on a machine that already hosts existing agents) and give it a unique label (e.g. notifications), so that its sole use will be to send notifications.
It's not perfect, because you get a single point of failure, but it helps remedy the situation while you figure out what causes the "real" agents to be unavailable after the parallel steps.
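With such a dedicated agent in place, only the notification step in the pipeline above needs to change:

parallel test_tasks
node('notifications') { // label of the dedicated notification agent
    sendMyNotifications();
}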
I'm new to Jenkins and configuring its scripts, so please forgive me if I say anything stupid.
I have a scripted Jenkins pipeline that distributes building the codebase across multiple nodes, implemented using node blocks wrapped in a parallel block. Now, the catch is that after the build, I would like to perform a certain action on the files that were just built, on one of the nodes that built the code, but only after all of the nodes are done. Essentially, what I would like to have is something similar to a barrier, but between Jenkins' nodes.
Simplified, my Jenkinsfile looks like this:
def buildConf = ["debug", "release"]

parallel buildConf.collectEntries { conf ->
    [ conf, {
        node {
            sh "./checkout_and_build.sh"
            // and here I need a barrier
            if (conf == "debug") {
                // I cannot do this outside this node block,
                // because execution may be redirected to a node
                // that doesn't have my files checked out and built
                sh "./post_build.sh"
            }
        }
    }]
}
Is there any way I can achieve this?
What you can do is add a global counter that counts the number of completed tasks. Each task that has a post-build part must wait until the counter equals the total number of tasks; only then can it run its post-build part. Like this:
def buildConf = ["debug", "release"]
def doneCounter = 0

parallel buildConf.collectEntries { conf ->
    [ conf, {
        node {
            sh "./checkout_and_build.sh"
            doneCounter++
            // and here I need a barrier
            if (conf == "debug") {
                waitUntil { doneCounter == buildConf.size() }
                // I cannot do this outside this node block,
                // because execution may be redirected to a node
                // that doesn't have my files checked out and built
                sh "./post_build.sh"
            }
        }
    }]
}
Please note, each task that has a post part will block its executor until all other parallel tasks are done and the post part can be executed. If you have loads of executors or the tasks are fairly short, this is probably not a problem. But if you have few executors, it could lead to congestion. If the number of executors is less than or equal to the total number of parallel tasks that need post work, you can run into a deadlock!
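If executor starvation is a concern, one alternative (not part of the answer above) is to avoid waiting inside the parallel branches at all: stash what the post step needs in each branch, let all branches finish, and then run the post step on a fresh node. A minimal sketch, assuming post_build.sh only needs the files produced by the debug build:

def buildConf = ["debug", "release"]

parallel buildConf.collectEntries { conf ->
    [ conf, {
        node {
            sh "./checkout_and_build.sh"
            if (conf == "debug") {
                stash(name: 'debug-output', includes: '**') // keep the built files
            }
        }
    }]
}

// all branches are done here; no executor was blocked waiting
node {
    unstash('debug-output')
    sh "./post_build.sh"
}

The trade-off is copying the workspace through the controller, which can be slow for large build outputs.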
I have a requirement to run a set of tasks for a build in parallel. The tasks for a build are dynamic and may change. I need some help with the implementation; below are the details.
The task details for a build will be generated dynamically in an XML file, which will contain information on which tasks have to be executed in parallel and which serially.
For example:
Say there is a build A, which has the tasks below in the following order of execution: first task1 is executed, then task2 and task3 are executed in parallel, and then task4:
task1
task2,task3
task4
These details will be in a dynamically generated XML file. How can I parse that XML and schedule the tasks accordingly using the Pipeline plugin? I need some idea to start off with.
You can use Groovy to read the file from the workspace (readFile) and then generate a map containing the different closures, similar to the following:
parallel(
    task2: {
        node {
            unstash('my-workspace')
            sh('...')
        }
    },
    task3: {
        node {
            unstash('my-workspace')
            sh('...')
        }
    }
)
In order to generate such a data structure, you iterate over the task data, parsing the XML in Groovy from the file contents you read previously.
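Here is a sketch of that idea. The tasks.xml schema and the run_task.sh script are assumptions for illustration, not from the question: assume the XML looks like <build><step>task1</step><step>task2,task3</step><step>task4</step></build>, where comma-separated names within one <step> run in parallel:

@NonCPS
def parseSteps(String xmlText) {
    // Parse outside CPS so the non-serializable XmlSlurper never leaks into pipeline state.
    // Returns e.g. [['task1'], ['task2', 'task3'], ['task4']]
    new XmlSlurper().parseText(xmlText).step.collect { it.text().split(',')*.trim() }
}

def stepGroups
node {
    stepGroups = parseSteps(readFile('tasks.xml'))
}
for (group in stepGroups) {
    def branches = [:]
    for (name in group) {
        def taskName = name // local copy so each closure captures its own value
        branches[taskName] = {
            node {
                unstash('my-workspace')
                sh "./run_task.sh ${taskName}" // hypothetical per-task script
            }
        }
    }
    parallel branches // a one-entry map simply runs that task serially
}

Doing the parsing inside a @NonCPS method keeps the non-serializable XmlSlurper objects out of the CPS-transformed pipeline code.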
As it happens, I gave a talk about pipelines yesterday that included a very similar example (presentation, slide 34ff.). In contrast, I read the list of "tasks" from another command's output. The complete code can be found here (I avoid pasting all of it and instead refer to this off-site resource).
The key bit is the following:
def parallelConverge(ArrayList<String> instanceNames) {
    def parallelNodes = [:]
    for (int i = 0; i < instanceNames.size(); i++) {
        def instanceName = instanceNames.get(i)
        parallelNodes[instanceName] = this.getNodeForInstance(instanceName)
    }
    parallel parallelNodes
}

def Closure getNodeForInstance(String instanceName) {
    return {
        // this node (one per instance) is later executed in parallel
        node {
            // restore workspace
            unstash('my-workspace')
            sh('kitchen test --destroy always ' + instanceName)
        }
    }
}
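It would be called with the instance names gathered earlier, for example (the names here are hypothetical Test Kitchen instances):

parallelConverge(['default-ubuntu-1604', 'default-centos-7'])

Each name becomes its own parallel branch, and the local variable inside the loop ensures each closure captures its own instanceName rather than the final loop value.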