How to create a complex value stream with multiple pipelines with Jenkins Workflow

How do you implement a complex value stream with multiple pipelines in Jenkins Workflow? Similar to what you can do with GoCD: How do I do CD with Go?: Part 2: Pipelines and Value Streams.
For a distributed system I would like each dev team and operations team to start with their own delivery pipeline. One change needs to trigger only the pipeline of the team that made the change. It then needs to trigger a new pipeline that takes the latest successful artifacts from each of the teams' pipelines and moves on from there. This means that the artifacts from the other teams are not rebuilt or retested, since they were not changed. And after the fan-in we can run a set of automated tests to verify the correct behaviour of the distributed system with the change.
In the documentation I only find that you can pull from multiple VCSs, but I assume everything is then built and tested with every change, which is something I want to avoid.
If each delivery pipeline is in its own Jenkins job, how can I visualize the complete pipeline, and what is the best way to pull in the last successful artifacts or version from the other pipelines?

There is no direct equivalent in Jenkins for value streams, and Workflow jobs do not behave any differently in that respect: you can have upstream jobs and downstream jobs correlated with triggers (in this case the build step, or the core ReverseBuildTrigger), and use (for example) the Copy Artifact plugin to transfer artifacts to downstream builds. Similarly, you could use an external repository manager as the “source of truth” and define job triggers based on snapshots pushed to the repository.
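For illustration, the fan-in job could be a small Workflow script that pulls the last stable artifacts of each team pipeline via the Copy Artifact plugin and then runs the system tests; the job names, target directories and test script below are made up, and the exact step syntax depends on your plugin versions:
node {
    // Grab the last stable artifacts from each team's pipeline (hypothetical job names).
    step([$class: 'CopyArtifact', projectName: 'team-a-pipeline', target: 'deps/team-a',
          selector: [$class: 'StatusBuildSelector', stable: true]])
    step([$class: 'CopyArtifact', projectName: 'team-b-pipeline', target: 'deps/team-b',
          selector: [$class: 'StatusBuildSelector', stable: true]])
    // Fan-in: exercise the combined system with the freshly changed component.
    sh './run-system-tests.sh deps'
}
Each team pipeline would then end with something like build job: 'system-tests' (or the fan-in job could watch the team jobs with ReverseBuildTrigger), so only the changed team's pipeline actually rebuilds its own artifacts.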
That said, part of the purpose of Workflow is to avoid the need for complex job chains in most situations¹, since it is usually easier to reason about, debug, and customize a single script with standard control flow operators and local variables than to manage a set of interdependent jobs. If the main problem with a single flow is that you need to avoid rebuilding unmodified parts, one solution would be to use something like JENKINS-30412 to check the changelog of particular repository checkouts and skip build steps if empty. I think there would be more features needed to make such a system work in the general case that workspaces are clobbered or discarded by other builds.
¹One case where you definitely need separate jobs is that for security reasons the teams contributing to different projects must not be able to see one another’s sources or build logs.
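As a rough sketch of the changelog-based skipping mentioned above (this assumes a newer Pipeline than existed when the issue was filed, since it relies on currentBuild.changeSets and may need script approvals; the paths and Gradle module are hypothetical):
def componentChanged(String prefix) {
    // True if any commit in this build touched a file under the given directory.
    for (changeSet in currentBuild.changeSets) {
        for (entry in changeSet.items) {
            for (path in entry.affectedPaths) {
                if (path.startsWith(prefix)) {
                    return true
                }
            }
        }
    }
    return false
}

node {
    checkout scm
    if (componentChanged('team-a/')) {
        sh './gradlew :team-a:build :team-a:test'
    } else {
        echo 'team-a unchanged - reusing its last successful artifacts'
    }
}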

Assuming that each of your dev teams works on a different module of your project and „One change needs to trigger only the pipeline of the team that made the change“ I'd use Git Submodules:
Submodules allow you to keep a Git repository as a subdirectory of another Git repository.
with one repo per team that becomes a submodule of a main-module repo. This will be transparent to the teams, since they just work on their designated repos only.
The main module is also the aggregator project for your module projects in terms of the build tool. So, you have the options:
to build each repo/pipeline individually or
to build the whole (main) project at once.
A build pipeline comprising one or more build jobs is associated with every team/repo/module.
The main pipeline is merely a collection of downstream jobs which represent the starting points of the team/repo/module pipelines.
The build triggers can be any of manually, timed or on source changes.
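For example, the main pipeline's Workflow script could check out the main-module repo together with all team submodules before building the whole project; the repository URL, extension options and build command are placeholders and depend on your Git plugin version and build tool:
node {
    checkout([$class: 'GitSCM',
              branches: [[name: '*/master']],
              userRemoteConfigs: [[url: 'https://example.com/main-module.git']],
              extensions: [[$class: 'SubmoduleOption', recursiveSubmodules: true]]])
    // or, after a plain clone, update the submodules explicitly:
    // sh 'git submodule update --init --recursive'
    sh 'mvn -B verify'
}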
A decision also has to be made:
whether you version your modules individually, such that other modules depend on release versions only.
Advantage:
Others rely on released, usually more stable versions.
Modules can decide which version of a dependency they want to use.
Disadvantages:
Releases have to be prepared for each module.
It may take longer until the latest changes are available to others.
Modules have to decide which version of a dependency they want to use. And they have to adapt it every time they need functionality added in a newer version.
or whether you use one version for the entire project (which is then inherited by the modules): ...-SNAPSHOT during the development cycle, a release version when releasing the project.
In this case, if there are modules that are essential for others, e.g. a core module, a successful build of it should trigger a build of the dependent modules, as well, so that incompatibilities are recognized as early as possible.
Advantages:
Latest changes are immediately available to others.
A release is prepared for the whole project only once it is to be delivered.
Disadvantages:
Latest changes immediately available to others may introduce not so stable (snapshot) code.
Re „How can I visualize the complete pipeline“
I'm not aware of any plugin that can do this with Workflows at the moment.
There's the Build Graph View Plugin, which was originally created for Build Flows, but it's more than two years old now:
Downstream builds are identified by DownStreamRunDeclarer extension point.
Default one is using Jenkins dependencyGraph and UpstreamCause and as such can detect common build chain.
build-flow plugin is contributing one to render flow execution as a graph
some Jenkins plugins may later contribute dedicated solutions.
(You know, „may“ and „later“ often become will not and never in development. ;)
There's the Build Pipeline Plugin but it apparently is also not suitable for Workflows:
This plugin provides a Build Pipeline View of upstream and downstream connected jobs [...]
Re „way to pull in the last successful artifacts“
Apparently it's not that smooth with Gradle:
By default, Gradle does not define any repositories.
I'm using Maven and there exist local and remote repositories where the latter can also be:
[...] internal repositories set up on a file or HTTP server within your company, used to share private artifacts between development teams and for releases.
Have you considered using a binary repository manager like Artifactory or Nexus?
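With such a repository manager in place, a team's Workflow job could resolve the newest released artifact of another team instead of rebuilding it, for instance with Maven (the coordinates are invented, and how the RELEASE meta-version is resolved depends on your Maven and repository manager setup):
node {
    // Copy the other team's latest released artifact into ./deps instead of rebuilding it.
    sh 'mvn -B dependency:copy -Dartifact=com.example.teama:service:RELEASE -DoutputDirectory=deps'
}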

From what I have seen, people are moving towards smaller, independent pieces of code delivery rather than monolithic deployments. But clearly, there will still be dependencies between different components. At the very least, for example, if you had one script that provisioned your infrastructure and another that built and deployed your app, you would want to be sure your infrastructure update script was run before your app deployment. On the other hand, your infrastructure does not depend on deploying your app code - it can be updated at its own pace, so long as it ideally passes some testing.
As mentioned in another post, you really have two options to accomplish this dependency:
Have a single pipeline (workflow script) that checks out code from both repos and puts them through the same pipeline simultaneously. Any change to one requires running the full pipeline for everything.
Have two pipelines and this would allow each to go at its own pace independent of what the other does. This isn't a problem for the infrastructure code, but it very well could be for the app code. If you pushed your app code to production without the infrastructure update having happened first, the results may not be pleasant.
What I've started to do with Jenkins Workflow is establish a dependency between my flows. Basically, I declare that one flow is dependent on a particular version (in this case, simply BUILD_NUM), and so before I do a production deploy I verify that the last successful build of the other pipeline has completed first. I'm able to do this using the Jenkins API as part of my flow script, which waits for that build (or a later one) to succeed, like so:
import hudson.EnvVars
import hudson.model.*

int independentBuildNum = 16

// Block this flow until the other pipeline's last successful build has reached
// the required build number.
waitUntil {
    verifyDependentPipelineCompletion("FLDR_CM/WorkflowDepedencyTester2", independentBuildNum)
}

boolean verifyDependentPipelineCompletion(String jobName, int buildNum) {
    def hi = jenkins.model.Jenkins.instance
    Item dep2 = hi.getItemByFullName(jobName)
    hi = null
    def jobs = dep2.getAllJobs().toArray()
    def onlyJob = jobs[0] // always 1 job...I think?
    def targetedBuild = onlyJob.getLastSuccessfulBuild()
    // Read the build number of that last successful build from its BUILD_ID env var.
    EnvVars me = targetedBuild.getCharacteristicEnvVars()
    def es = me.entrySet()
    int targetBuildNum = 0
    def vars = es.iterator()
    while (vars.hasNext()) {
        def envVar = vars.next()
        if (envVar.getKey().equals("BUILD_ID")) {
            targetBuildNum = Integer.parseInt(envVar.getValue())
        }
    }
    // Satisfied once the dependent pipeline has successfully built at least buildNum.
    return targetBuildNum >= buildNum
}
Disclaimer that I am just beginning this process so I do not have much real-world experience with this yet, but will update this thread if I have more relevant information. Any feedback welcome.

Related

Best route to take for a Jenkins job with hundreds of sub jobs

Currently, at my organization we have a few repositories which contain ~500+ projects that need to be built to satisfy unit testing (really integration testing), and I am trying to think of a new way of approaching the situation.
Currently, the pipeline for building the projects is templatized and is stored on our Bitbucket server. All the projects get built in parallel, so once the jobs are queued, they all go to the master node to do an SCM checkout of the pipeline.
This creates stress on the master node, and for some reason it is not able to utilize every available node and executor on that node to its fullest potential. Conversely, if the pipeline is not stored in SCM, it does the complete opposite and DOES use every possible node with any available executor on that node.
Is there something I am missing about the SCM checkout version that makes it different from storing the pipeline locally on Jenkins? I understand that you need to do an SCM poll, and I am assuming only the master can do the SCM poll for the original Jenkinsfile.
I've tried:
Looking to see if I am potentially throttling the build, but I do not see anything
Disable concurrent builds is not enabled within the pipeline
Lightweight checkout seems to work when I do it with the Git plugin, but not the Bitbucket Server Integration plugin; however, Atlassian mentioned this will never be a feature, so this doesn't really matter.
I am trying to see if there is a possible way to change the infrastructure since I don't have much of a choice in how certain programs are setup since they are very tightly coupled.
I could in theory just have the pipeline locally on Jenkins and use that as a template rather than checking it into SCM; however, making changes locally to the template does not change the sub-jobs that use it (I could implement this feature, but SCM already does it). Plus, having the template pipeline checked into Bitbucket allows better control, so I am trying to avoid that option.

Jenkins CI workflow with separate build and automated test both in source control

I am trying to improve our continuous integration process using Jenkins and our source control system (currently svn, but git soon).
Maybe I am thinking about this overly complicated, or maybe I have not yet seen the right hints.
The process I envisioned has three steps and associated roles:
one or more developers would do their job and ultimately submit the code changes for the actual software ("main software") as well as unit tests into source control (git, or something else). Jenkins shall build the software, run unit tests and perhaps some other steps (e.g. static code analysis). If none of this fails, the work of the developers is done. As part of the build, the build number is baked into the main software itself as part of the version number.
one or more test engineers will subsequently pickup the build and perform tests. Some of them may be manual, most of them are desired to be automated/scripted tests. These shall ultimately be submitted into source control as well and be executed through the build server. However, this shall not trigger a new build of the main software (since there is nothing changed). If none of this fails, the test engineers are done. Note that our automated tests currently take several hours to complete.
As a last step, a project manager authorizes release of the software, which executes whatever delivery/deployment steps are needed. Also, the source of the main software, unit tests, and automated test scripts, the Jenkins build script - and ideally all build artifacts ("binaries") - are archived (tagged) in the source control system.
Ideally, developers are able to also manually trigger execution of the automated tests to "preview" the outcome of their build.
I have been unable to figure out how to do this with Jenkins and Git - or any other source control system.
Jenkins pipelines seem to assume that all steps are carried out in sequence automatically. They also seem to assume that committing code into source control always starts the process at the beginning (which I believe should not be the case if the commit was "merely" automated test scripts). Triggering an unnecessary build of the main software really hurts our process, as it basically invalidates any manual testing and documentation, since it results in a new build number baked into the software.
If my approach is so uncommon, please direct me how to do this correctly. Otherwise I would appreciate pointers how to get this done (conceptually).
I will try to reply with some points. This is indeed a conceptual approach, as there are a lot of details and different approaches too; this is only one.
You need git :)
You need to set up a Git branching strategy that allows multiple developers to work simultaneously, pushing code and validating it against the static code analysis. I would suggest that you start with Git Flow; it is widely used and can be adapted to whatever reality you have - you do not need to use it in its pure state, so give some thought to how to adapt it. Fundamentally, it will allow each feature to be tested. Then, each developer can merge it into the develop branch - from this point on, you have your features validated and you can start to deploy and test.
Have a look at multibranch pipelines. This will allow you to test the several feature branches that you might have and to have different flows for the other branches - develop, release and master - depending on your deployment needs. So, when you have a merge on develop branch, you can trigger testing or just use it to run static code analysis.
To overcome the problem that you mention in your second point, there are ways to read the change sets in the pipeline, and if the changes only touch test scripts, you can skip building your software - check here how to read changes, and here an example of how to read changes and also prevent your pipeline from building all the stages, depending on the changes being pushed to Git.
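A rough sketch of that last point in today's declarative syntax (the directory layout and script names are assumptions): the build stage is skipped when nothing under the application sources changed, while the tests still run:
pipeline {
    agent any
    stages {
        stage('Build main software') {
            // Skipped when the pushed commits touched nothing under src/ (e.g. test-only changes).
            when { changeset 'src/**' }
            steps { sh './build.sh' }
        }
        stage('Automated tests') {
            steps { sh './run-tests.sh' }
        }
    }
}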
In case you still have manual testing taking place, pipelines are pausable, which means that you can pause the pipeline to ask for approval to proceed. Before approving, testers do whatever they have to, and whenever they are ready, they just approve the build so it proceeds to the next steps.
Regarding deployment authorization, it is done the same way as in the last point, with approvals, but in this case you can specify which users/roles are allowed to approve that step.
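Both of those pauses can be expressed with the input step; a minimal sketch, where the stage names and the release-managers group are only illustrative:
stage('Manual testing sign-off') {
    // The pipeline waits here until a tester confirms the manual tests passed.
    input message: 'Manual testing finished - OK to continue?'
}
stage('Release approval') {
    // Only members of this (hypothetical) group may approve the production deployment.
    input message: 'Deploy to production?', submitter: 'release-managers'
}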
Whatever you need to keep from your builds, Jenkins has an archive artifacts utility. Let me just note that ideally you would look into a proper artefact repository such as Nexus.
To trigger a set of tests manually, you can have a manually triggered job in Jenkins, apart from your CI/CD pipeline, that only executes the automated tests. You can even trigger this same job as one pipeline stage - how to trigger other jobs.
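For example, a stage in the pipeline (or a developer using "Build with Parameters" on the test job itself) could kick it off like this; the job and parameter names are made up, and older installations may need the [$class: 'StringParameterValue', ...] form instead of string(...):
stage('Automated tests') {
    // Hand the version under test to the standalone test job and wait for its result.
    build job: 'automated-ui-tests',
          parameters: [string(name: 'APP_VERSION', value: env.BUILD_NUMBER)]
}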
Lastly let me say that the branching strategy is the starting point.
Think about your big picture, what SDLC flows you need to have, and set up those flows in your multibranch pipeline. Different Git branches will facilitate whatever flows you need within the same Jenkinsfile - your pipeline definition. It really depends on how many environments you have to deploy to and what kind of steps you need.

Jenkins: a heavily branched chain of build jobs

We would like to set up Continuous Integration and Continuous Deployment processes on the base of Jenkins ecosystem. Currently we're trying to put together all the Jenkins build jobs we have (from sources to several endpoint processes launched on the testing server). There are three kinds of build/deployment processes in our case:
Building deb packages from C++ projects (some of them are dependent, others are dependencies);
Building images from Docker containers;
Launching some processes in the endpoint;
As you can see, we are faced with a heavily branched chain of jobs triggered by each other, and every update to any of the upstream projects must go through the chain of jobs and trigger the final job (process I). So it would be nice to use some kind of Jenkins plugin that will:
Control such a complicated structure of jobs (I tried the Build Pipeline Plugin and got the impression that this tool is only suitable for a "linear" job chain);
Provide clean way of passing the parameters between job environments.
As #slav mentioned, the Workflow plugin should be able to handle this kind of complex control flow, including parallel handling of subtasks, simple handling of variables throughout the process (would just be Groovy local variables), and Docker support.
You could of course arrange this whole process in a single build.gradle (or Makefile for that matter). That would be suitable if you did not mind running all the steps on the same Jenkins slave, and did not otherwise need to interact with or report to Jenkins in any particular way in the middle of the build.
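A rough sketch of what that single Workflow script might look like, with the independent deb packages built in parallel and plain Groovy variables carrying values between the steps (package, image and script names are invented):
node {
    stage('Packages') {
        parallel(
            libfoo: { sh 'make -C libfoo deb' },
            libbar: { sh 'make -C libbar deb' }
        )
    }
    stage('Images') {
        sh 'docker build -t example/app .'
    }
    stage('Endpoint') {
        // No Parameterized Trigger needed: values are ordinary local variables.
        def imageTag = 'example/app'
        sh "./start-endpoint-processes.sh ${imageTag}"
    }
}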
Well, for passing parameters, you should be using the Parameterized Trigger Plugin.
For a more asynchronous passing of parameters, you can use the EnvInject plugin (it's extremely useful and flexible for all sorts of things, and considering your complexity, it might prove useful regardless of whether you use it for passing parameters or not).
As for control, look into the Workflow plugin. It allows you to write the whole execution flow in its own Groovy script, with fine-grained control. More links:
Official - https://jenkins-ci.org/content/workflow-plugin-10
Tutorial - https://github.com/jenkinsci/workflow-plugin/blob/c15589f/TUTORIAL.md#pausing-flyweight-vs-heavyweight-executors

Jenkins continuous integration and nightly builds

I’m new to Jenkins and I’d like some help (reassurance) about how I think I should set up my jobs.
The end goal is fairly simple.
Objective 1: When a developer commits code to a mercurial repo Jenkins pulls the changes, builds the project and runs the unit tests. This happens continuously throughout the day so developers get the earliest possible feedback if they break something.
Objective 2: Nightly, Jenkins pulls the last stable build from above and runs automated UI tests. If those tests pass it publishes the nightly build somewhere.
I have a job configured that achieves objective 1 but I’m struggling with objective 2.
(Not the publishing part, the idea of seeding this job with the last stable build of objective 1).
At the moment, I’m planning to use branches in the HG repo to implement this.
My branches would look something like Main >> Int >> Dev.
The job in objective 1 would work on the tip of the Dev branch.
If the build succeeds and the tests pass it would commit to the Int branch.
The job in objective 2 could then simply work on the tip of the Int branch.
Is this how it’s generally done?
I’ve also been looking at/considering:
- plugins like Promoted Builds and Copy Artifacts
- parameterised builds
- downstream jobs
IMO my objectives are fairly common but I can’t find many examples of this approach online. Perhaps it’s so obvious there was no need but I just wanted to check.
In the past I've stored generated artifacts like this in an artifact repository. You could use something like Nexus or Artifactory for this, but I've also just used a flat file system.
You could put the build artifacts in source control, like you said, but there usually isn't a reason to have version control on compiled builds (you should be able to re-create them based on rev numbers) - they usually just take up a lot of space in your repo.
If your version numbers are incremental in nature your nightly job should be able to pull the latest one fairly easily.
Maybe you can capture the last good revision ID and post it somewhere. Then the nightly build can use that last known good revision. The method of doing this can vary, but it's the concept of using the revision ID that I want to communicate here. This would prevent you from having to create a separate branch.
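One way this could be wired up in two Workflow jobs with the Copy Artifact plugin (the Mercurial commands, job names and file names are all assumptions, not a prescription):
// In the CI job, after the build and unit tests have passed:
node {
    sh 'hg id -i > last-good-rev.txt'
    step([$class: 'ArtifactArchiver', artifacts: 'last-good-rev.txt'])
}

// In the nightly job:
node {
    step([$class: 'CopyArtifact', projectName: 'ci-build',
          selector: [$class: 'StatusBuildSelector', stable: true]])
    def rev = readFile('last-good-rev.txt').trim()
    sh "hg update -r ${rev}"
    sh './run-ui-tests.sh'
}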

Jenkins - Running instances of single build concurrently

I'd like to be able to run several builds of the same Jenkins job simultaneously.
Example:
Build [*jenkins_job_1*]: calls an ant script with parameter 'A'
Build [*jenkins_job_1*]: calls an ant script with parameter 'B'
repeat as necessary
each instance of the job runs simultaneously, rather than through a queue.
The reason I'd like to do this is to avoid having to create several jobs that are nearly identical, all of which would need to be maintained.
Is there a way to do this, or maybe another solution (i.e. dynamically create a job from a base job and remove it after it's finished)?
Jenkins has a check box: "Execute concurrent builds if necessary"
If you check this, then it'll start multiple builds for a job.
This works with the "This build is parameterized" checkbox.
You would still trigger the builds, passing your A or B as parameters. You can use another job to trigger them or you could do it manually via a script.
You can select Build a Multi-configuration project (Matrix build) when you create the job. Then, under the job's configuration, you can define the Configuration Matrix which lets you specify one or more parameters (axes) for different builds. Regarding running simultaneously, you should be able to run as many simultaneous builds as you have executors (with the appropriate label).
Unfortunately, the Jenkins wiki lacks documentation about this setup. There are a couple previous SO questions, here and here, that might provide a little guidance. There was a "recent" blog post about setting up a multi-configuration job to perform builds on various platforms.
A newer (and better) solution is the Jenkins Job DSL Plugin.
We've been using it with great success. Our job configurations are now disposable... we can set up a huge stack of complicated jobs from some groovy files and a couple template jobs. It's great.
I'm liking it a lot more than the matrix builds, which were complicated and harder to understand.
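For the scenario in the question (the same Ant script called with parameter 'A' or 'B'), a Job DSL seed script could generate the near-identical jobs from one template; the job names, Ant target and property name here are invented:
['A', 'B'].each { value ->
    job("ant-build-${value}") {
        steps {
            // Each generated job runs the same Ant script with a different parameter.
            shell("ant -Dparam=${value} build")
        }
    }
}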
Nothing is stopping you from doing this using the Jenkins pipeline DSL.
We have the same pipeline running in parallel in order to model combined loads for an application that exposes web services, provides a database to several external applications, receives data via several work queues and has a GUI front end. The business gives us non-functional requirements (NFRs) which our application must meet that guarantees its responsiveness even at busy times.
The different instances of the pipeline are run with different parameters. The first instance might be WS_Load, the second GUI_Load and the third Daily_Update_Load, modelling a large data queue that needs processing within a certain time-frame. More can be added depending on which combination of loads we're wanting to test.
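Roughly, the shared pipeline script declares a parameter and branches on it, so each concurrently running instance drives a different load profile; the parameter and script names are assumptions:
properties([
    parameters([
        string(name: 'LOAD_TYPE', defaultValue: 'WS_Load',
               description: 'WS_Load, GUI_Load or Daily_Update_Load')
    ])
])

node {
    // Each concurrent run of this pipeline exercises one load profile.
    sh "./run-load-test.sh ${params.LOAD_TYPE}"
}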
Other answers have talked about the checkboxes for concurrent builds, but I wanted to mention another issue: resource contention.
If your pipeline uses temporary files or stashes files between pipeline stages, the instances can end up pulling the rug out from under each other's feet. For example, you can end up overwriting a file in one concurrent instance while another instance expects to find the pre-overwritten version of the same stash. We use the following code to ensure stashes and temporary filenames are unique per concurrent instance:
def concurrentStash(stashName, String includes) {
    /* make a stash unique to this pipeline and build
       that can be unstashed using concurrentUnstash() */
    echo "Safe stashing $includes in ${concurrentSafeName(stashName)}..."
    stash name: concurrentSafeName(stashName), includes: includes
}

def concurrentSafeName(name) {
    /* make a name or name component unique to this pipeline and build
     * guards against contention caused by two or more builds from the same
     * Jenkinsfile trying to:
     * - read/write/delete the same file
     * - stash/unstash under the same name
     */
    "${name}-${BUILD_NUMBER}-${JOB_NAME}"
}

def concurrentUnstash(stashName) {
    echo "Safe unstashing ${concurrentSafeName(stashName)}..."
    unstash name: concurrentSafeName(stashName)
}
We can then use concurrentStash stashName and concurrentUnstash stashName and the concurrent instances will have no conflict.
If, say, the two pipelines both need to store stats, we can do something like this for filenames:
def statsDir = concurrentSafeName('stats')
and then the instances will each use a unique filename to store their output.
You can create a build and configure it with parameters. Click the This build is parameterized checkbox and add your desired param(s) in the Configuration of the build. You can then fire off simultaneous builds using different parameters.
Side note: The "Bulk Builder" in Jenkins might push it into a queue, but there's also a This bulk build is parameterized checkbox.
I had a pretty large build queue, and I performed the steps below to run jobs in parallel in Jenkins and reduce the number of jobs waiting in the queue:
For each job, navigate to Configure and select the checkbox "Execute concurrent builds if necessary".
Navigate to Manage Jenkins -> Configure System, look for "# of executors" and set the number of parallel executors you want (in my case it was set to 0 and I updated it to 2).
