Jenkins - Running a single job addressing two nodes simultaneously - or intercommunicating between two jobs

Jenkins newbie, but have other build-server experience.
I'm in the process of setting up a test job where software on two nodes needs to ping-pong with each other.
I have a pool of labeled nodes (let's call them A, running Windows 7) to run the testing software, and another pool of labeled nodes (let's call these B, running lubuntu 14.10).
The testing is performed through TCP/IP and needs various command line stimuli on both A and B nodes throughout the test. In the end I need to gather artifacts from both the A and B nodes.
I imagine the need to control multiple nodes simultaneously isn't so rare, but I'm really having a hard time locating info on this on the web.
Might there be a plugin for this that I've missed?
Below are my thoughts on what needs to be performed, should a single plugin not exist to help me out.
My preferred solution would be a single job, but then I need to find out how to perform the following:
1. Check out from SVN to Node A.
2. Check out from SVN to Node B.
3. Execute a Windows script on Node A.
4. Execute a Linux script on Node B.
5. Collect artifacts from Node A.
6. Collect artifacts from Node B.
An alternative to all the even-numbered bullets above might be to perform those actions over SSH, from either the master or the A node, to control the B node. But that leaves the following questions:
How to select one B node out of the B node pool - and mark it in use?
How to use the Jenkins SSH/slave credentials?
A totally different alternative could be to set up two jobs, one for the A nodes and one for the B nodes. But then I need to find out how to perform the following:
Associate one Node A job with a Node B job, so they are both aware of the association.
Perform two-way inter-communication, allowing the Node A job to wait for a signal from a Node B job and vice versa.
Eagerly looking forward to your answers!

In a similar scenario of ours, we use two jobs (I'm not aware of a way to run a single job on two nodes).
Job A runs on node A, sets up a server, and then triggers job B (as a build step), which is set to run on node B.
Job B starts its client app, which is configured to work with the server installed by A (an IP configuration in my case).
Job A (the server) then goes into a wait loop (a bash while loop) that checks whether the client running on B has connected (I use a flag file in a shared location).
Once connected, both jobs do a few more steps and complete. Each job ends with its own reporting.
I hope this helps.
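For what it's worth, on Jenkins versions that have the Pipeline plugin, a single scripted Pipeline job can drive both pools in parallel. The following is only a minimal, hypothetical sketch: the labels 'A' and 'B', the SVN URL, the script names, and the shared flag path are all placeholders.

def FLAG = '/mnt/shared/clientB.connected'   // hypothetical shared location

parallel(
    serverOnA: {
        node('A') {                                    // any free Windows 7 node
            svn 'https://svn.example.org/tests/trunk'  // check out to node A
            bat 'start_test_server.bat'                // Windows-side stimulus
            waitUntil { fileExists FLAG }              // wait for B's client to connect
            bat 'run_test_sequence.bat'
            archiveArtifacts 'results/**'              // collect node A's artifacts
        }
    },
    clientOnB: {
        node('B') {                                    // any free lubuntu node
            svn 'https://svn.example.org/tests/trunk'  // check out to node B
            sh './start_test_client.sh'                // connects to A's server
            sh "touch ${FLAG}"                         // signal the server branch
            sh './run_client_sequence.sh'
            archiveArtifacts 'results/**'              // collect node B's artifacts
        }
    }
)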

In my case I've used a Multi-configuration project with a Slave axis. You can then synchronize execution between nodes using your own code, or mix in a (system) Groovy script to communicate between the different configurations, e.g.:
def siblings = []
// Iterate over the sibling configurations of this matrix build.
// (A for loop is used rather than .each so that 'break' is valid Groovy.)
for (run in build.getParentBuild().getExactRuns()) {
    if (build.is(run)) continue   // skip our own configuration
    if (run.hasntStartedYet()) {
        println "Build " + run.getDisplayName() + " hasn't started yet!"
    } else {
        if (!run.isBuilding()) {
            println "Build " + run.getDisplayName() + " has already completed!"
            break
        }
        def executor = run.getExecutor()
        if (executor) {
            def hostname = executor.getOwner().getHostName()
            // In case the hostname could not be found explicitly,
            // set it based on the node's name.
            if (!hostname) hostname = executor.getOwner().getName()
            if (hostname) {
                siblings.add(hostname)
            }
        }
    }
}
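With the sibling hostnames collected this way, each configuration can coordinate with its peers however the test requires, for example over TCP or via flag files as in the previous answer.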

Related

Are labels of Jenkins build slaves checked in a case-sensitive manner for job scripts?

When I have two build clients, where one has the label "Windows" (first character capitalized) and the other has the label "windows" (all lower case), will I need to write a job label expression of "(Windows || windows)" (assuming the case of the label is respected), or is either "Windows" or "windows" alone sufficient (assuming the comparison is case-insensitive) to freely run the job on whichever of the two machines is free first?
I have to ask because I was unable to determine from the docs how this is handled. (Some docs even indicate that certain other matching operations are configurable with respect to case.)
Node labels are case-sensitive in Jenkins, so "Windows" and "windows" are two distinct labels. When you write (Windows || windows) as the target, the job becomes eligible to run on agents carrying either label, and Jenkins runs it on whichever eligible agent has a free executor. If you want to run a job freely on any of the available agents, there are two ways to accomplish that:
Use a label expression that ORs the labels together with the || symbol (for example "Windows || windows"), which you already have.
Put the same label (for example "windows") on both agents and have the job target that label. In this case Jenkins queues the job for the label "windows" and runs it on whichever matching agent has a free executor first.
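For illustration, the same expression works anywhere a label is accepted, e.g. in a freestyle job's "Restrict where this project can be run" field or, as a hypothetical Pipeline sketch:

// Either label satisfies the OR expression, so the build lands on
// whichever matching agent has a free executor first.
node('Windows || windows') {
    bat 'echo Running on %NODE_NAME%'
}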

Jenkins - only run a job on a node with 2+ executors free

We have a number of multi-jobs that run a parent job and multiple sub-jobs. The parent does some preprocessing, then kicks off the first sub-job.
Example:
Parent - checks out git repos and preps the code
Build the code
Unit Tests
Upload to HockeyApp
Since the parent is running the entire time the sub-jobs are running, the process starts out with one executor, then picks up a second whenever a sub-job starts, drops it when that sub-job ends, and picks it back up when the next one starts.
We have 4 nodes with 3-4 executors on each of them. We also don't have a networked drive, so each sub-job has to stay on the same node as its parent to avoid having to pass the entire workspace between jobs.
The problem is that if one job is running and holding two executors, then another gets kicked off and then another right after that, there's a chance they'll all end up on the same node and something like the below happens:
Node 1
Executor 1 - Parent1
Executor 2 - Child1
Executor 3 - Parent2
Executor 4 - Parent3
Now Parent2 and Parent3 just sit around waiting for a free executor. Eventually the child job of Parent1 ends, then Parent2 or Parent3 grabs the executor, and all of them fight over it.
Is there a way to tell Jenkins to only kick off that parent on a node with at least 2 executors free? I know that if someone started enough jobs quickly enough we could still end up with the same issue, but it would significantly reduce the chance of it.
I think you can use the Heavy Job Plugin (https://wiki.jenkins.io/display/JENKINS/Heavy+Job+Plugin) and define, for each step, the number of free executors you need.

How to use "Scoring Load Balancer" plugin in Jenkins

I want to run a job on different slaves, but in a predefined order.
i.e. I would like the job to run on the second machine only when all the executors of the first slave are busy.
I am trying to use the "Scoring Load Balancer" plugin to allocate scores to different slaves: I have 3 nodes, NodeA, NodeB, and NodeC, with preferences of 9, 5, and 1 respectively, and 10 executors on each node.
The nodes have a common label "WindowSlave".
Also, I have defined a job, "ProjectX", with a project preference score of 10 and the label "WindowSlave" as the preferred label.
I had expected that if I ran 100 concurrent builds of "ProjectX", the execution would happen in the order:
NodeA (10 builds) -> NodeB (10 builds) -> NodeC (10 builds) -> NodeA (10 builds) -> NodeB (10 builds) -> ... and so on.
From my observations it is still not clear whether the above scenario is always achieved.
It also happens that some random slave starts behaving as the main slave and coordinates with the other slaves, such that all the build workspaces are created on that particular slave.
What am I missing here?

The impact of a distributed application configuration on node discovery via net_adm:ping/1

I am experiencing different behavior with respect to net_adm:ping/1 when it is done in the context of a distributed application.
I have an application that pings a well-known node on start-up and in that way discovers all nodes in a mesh of connected nodes.
When I start this application on a single node (non-distributed configuration), the net_adm:ping/1 followed by a nodes/0 reports 4 other nodes (this is correct). The 4 nodes are on 2 different physical machines, so what is returned is the following: n1@machine_1, n2@machine_2, n3@machine_2, n4@machine_1 (IP addresses are actually returned, not machine_x).
When part of a two-node distributed application, on the node where the application starts, the net_adm:ping/1 followed by a nodes/0 reports 2 nodes, one from each machine (n1@machine1, n2@machine2). A second call to nodes/0 after about a 750 ms delay results in the correct 5 nodes being found. Two of the three missing nodes are required for my application to work and so, not finding them, the application dies.
I am using R15B02.
Is the latency of the transitive node-discovery process known to be different when some of the nodes in the mesh are participating in a distributed application configuration?
The kernel application documentation describes a way to synchronize nodes, holding up the boot phase until everything is in place and ready to move forward. Here are the options:
sync_nodes_mandatory = [NodeName]
Specifies which other nodes must be alive in order for this node to start properly. If some node in the list does not start within the specified time, this node will not start either. If this parameter is undefined, it defaults to [].
sync_nodes_optional = [NodeName]
Specifies which other nodes can be alive in order for this node to start properly. If some node in this list does not start within the specified time, this node starts anyway. If this parameter is undefined, it defaults to the empty list.
A config file using them could look as follows:
[{kernel,
  [{sync_nodes_mandatory, ['b@ferdmbp', 'c@ferdmbp']},
   {sync_nodes_timeout, 30000}]
}].
You would then start the node a@ferdmbp by calling erl -sname a -config <the config file above>. The downside of this approach is that each node needs its own config file.

Your advice on a Hadoop MapReduce job

I have 2 files stored in HDFS:
tbl_userlog: <website url (non canonical)> <tab> <username> <tab> <timestamp>
example: www.website.com, foobar87, 201101251456
tbl_websites: <website url (canonical)> <tab> <total hits>
example: website.com, 25889
I have written a Hadoop sequence of jobs which joins the 2 files on the website, filters on total hits > n per website, and then counts, for each user, the number of websites he has visited which have > n total hits. The details of the sequence are as follows:
A Map-only job which canonicalizes the URL in tbl_userlog (i.e. removes www, http://, and https:// from the URL field)
A Map-only job which sorts tbl_websites on the URL
An identity Map-Reduce job which takes the output of the 2 previous jobs as KeyValueTextInput and feeds it to CompositeInputFormat, in order to make use of Hadoop's native joining feature, defined with jobConf.set("mapred.join.expr", CompositeInputFormat.compose("inner" (...))
A Map and Reduce job which filters the result of the previous job on total hits > n in its Map phase, groups the results on the user in the shuffle phase, and counts the number of websites for each user in the Reduce phase.
In order to chain these steps, I just call the jobs sequentially in the order described. Each individual job writes its results into HDFS, which the next job in the chain then reads and processes in turn.
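For illustration, the driver for such a chain could look like the following Groovy sketch (Groovy for consistency with the earlier snippet; the job classes, the paths, and the use of the old 0.20 mapred API are assumptions, not the actual code):

import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapred.FileInputFormat
import org.apache.hadoop.mapred.FileOutputFormat
import org.apache.hadoop.mapred.JobClient
import org.apache.hadoop.mapred.JobConf

// Runs one configured job and blocks until it completes, so the next
// job in the chain can safely read this job's HDFS output.
def runStep(Class jobClass, String inDirs, String outDir) {
    def conf = new JobConf(jobClass)   // mapper/reducer are set up inside jobClass
    FileInputFormat.setInputPaths(conf, inDirs)
    FileOutputFormat.setOutputPath(conf, new Path(outDir))
    JobClient.runJob(conf)             // submit and wait for completion
}

// Hypothetical job classes for the four steps described above. The join
// step would additionally set mapred.join.expr inside JoinJob itself.
runStep(CanonicalizeUrlsJob, '/data/tbl_userlog',     '/tmp/step1')
runStep(SortWebsitesJob,     '/data/tbl_websites',    '/tmp/step2')
runStep(JoinJob,             '/tmp/step1,/tmp/step2', '/tmp/step3')
runStep(FilterAndCountJob,   '/tmp/step3',            '/out/user_counts')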
As I am new to Hadoop, I would like to ask for your advice:
Is there a better way to chain these jobs? In this configuration all intermediate results are written to HDFS and then read back.
Do you see any design flaw in this job, or could it be written more elegantly by making use of some Hadoop feature that I have missed?
I am using Apache Hadoop 0.20.2 and using higher-level frameworks such as Pig or Hive is not possible in the scope of the project.
Thanks in advance for your replies!
I think what you have will work, with a couple of caveats. Before I start listing them, I want to make two definitions clear. A map-only job is a job that has a defined Mapper and runs with 0 reducers. If the job runs with > 0 IdentityReducers, then it is not a map-only job. A reduce-only job is a job that has a defined Reducer and runs with an IdentityMapper.
Your first job can be a map-only job, since all you're doing is canonicalizing URLs. But if you want to use CompositeInputFormat afterwards, you should run it with an IdentityReducer and more than 0 reducers.
For your second job, I don't know what you mean by a map-only job that sorts. Sorting is by its very nature a reduce-side task. You probably mean that it has a defined Mapper but no Reducer; but in order for the URLs to be sorted, you should run it with an IdentityReducer and more than 0 reducers.
Your third job is an interesting idea, but you have to be careful with CompositeInputFormat. There are two conditions that must be met for you to be able to use this input format. The first is that there has to be the same number of files in both input directories; this can be achieved by setting the same number of reducers for Job1 and Job2, as sketched below. The second is that the input files CANNOT be splittable; this can be achieved by using a non-splittable compression such as bzip2.
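As a sketch of that first condition (same assumed old-API style as the driver above), it just means giving both preparatory jobs an identical reducer count:

import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.mapred.lib.IdentityReducer

// The same reducer count in Job1 and Job2 yields the same number of
// sorted, identically partitioned output files, as CompositeInputFormat needs.
def conf = new JobConf()
conf.setReducerClass(IdentityReducer)
conf.setNumReduceTasks(8)   // arbitrary example; must match in both jobs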
The fourth job sounds good, although you could already filter out websites that have < n hits in the reducer of the previous job and save yourself some I/O.
There's obviously more than one solution to a problem in software, so while your solution would work, I wouldn't recommend it. Having 4 MapReduce jobs for this task is a bit expensive IMHO. The implementation I have in mind is an M-R-R workflow that uses secondary sort.
As far as chaining jobs is concerned, you should have a look at Oozie, which is a workflow manager. I have yet to use it, but that's where I'd start.
