I am working under some constraints that I currently cannot change, and while I can work around the constraints, I feel like all my solutions are suboptimal, and what I am looking for is suggestions on the most frictionless way to resolve the issue. I suspect that my lack of experience with Jenkins/groovy is holding me back.
The constraints:
When a pr is created, some background tasks are triggered, and these are shared amongst teams. Furthermore the PR is changed: Dependencies resolved, version bumping... and more stuff like that. All further work must be on the modified commit, and the commit must be tested before the pipeline creating it finishes.
The effort is changing the pipeline to a declarative flow, and the solution should deviate as little from that as possible. Optimally it is also entirely in the scripts, as access to Jenkins is limited. All steps subsequent to the one creating the commit are dockized, and run on different agents physical machines without a shared file system.
The goal:
With as little spam as possible, and as little code (that can later be forgotten). How is the rest of the pipeline best executed on the new commit?
The attempts so far:
At each stage, check out the new commit (disabling or cleaning the checkout by the declarative pipeline). This results in code bloat, and if someone should forget it, the failure is silent as it just tests the original commit.
Launch a new pipeline/build. This is promising until I realized I have no idea how to launch a build on a separate commit without falling back on 1.
Possible silver bullets
A method to specify in a step, what commit the following steps should checkout/work on.
A method to launch a new build/pipeline on a commit. I know how to propagate any errors.
I apologize if I am missing something obvious, or if the constraints seems messy. I can guarantee you, they frustrate me more than they do you :)
Related
I am trying to improve our continuous integration process using Jenkins and our source control system (currently svn, but git soon).
Maybe I am thinking about this overly complicated, or maybe I have not yet seen the right hints.
The process I envisioned has three steps and associated roles:
one or more developers would do their job and ultimately submit the code changes for the actual software ("main software") as well as unit tests into source control (git, or something else). Jenkins shall build the software, run unit tests and perhaps some other steps (e.g. static code analysis). If none of this fails, the work of the developers is done. As part of the build, the build number is baked into the main software itself as part of the version number.
one or more test engineers will subsequently pickup the build and perform tests. Some of them may be manual, most of them are desired to be automated/scripted tests. These shall ultimately be submitted into source control as well and be executed through the build server. However, this shall not trigger a new build of the main software (since there is nothing changed). If none of this fails, the test engineers are done. Note that our automated tests currently take several hours to complete.
As a last step, a project manager authorizes release of the software, which executes whatever delivery/deployment steps are needed. Also, the source of the main software, unit tests, and automated test scripts, the jenkins build script - and ideally all build artifacts ("binaries") - are archived (tagged) in the source control system.
Ideally, developers are able to also manually trigger execution of the automated tests to "preview" the outcome of their build.
I have been unable to figure out how to do this with Jenkins and Git - or any other source control system.
Jenkin's pipelines seem to assume that all steps are carried out in sequence automatically. It also seems to assume that committing code into source control starts at the beginning (which I believe is not true if the commit was "merely" automated test scripts). Triggering an unnecessary build of the main software really hurts our process, as it basically invalidates and manual testing and documentation, as it results in a new build number baked into the software.
If my approach is so uncommon, please direct me how to do this correctly. Otherwise I would appreciate pointers how to get this done (conceptually).
I will try to reply with some points. This is indeed conceptually approach as there are a lot of details and different approaches too, this is only one.
You need git :)
You need to setup a git branching strategy which will allow to have multiple developers to work simultaneously, pushing code and validating it agains the static code analysis. I would suggest that you start with Git Flow, it is widely used and can be adapted to whatever reality you do have - you do not need to use it in its pure state, so give some thought how to adapt it. Fundamentally, it will allow for each feature to be tested. Then, each developer can merge it on the develop branch - from this point on, you have your features validated and you can start to deploy and test.
Have a look at multibranch pipelines. This will allow you to test the several feature branches that you might have and to have different flows for the other branches - develop, release and master - depending on your deployment needs. So, when you have a merge on develop branch, you can trigger testing or just use it to run static code analysis.
To overcome the problem that you mention on your second point, there are ways to read your change sets on the pipeline, and in case the changes are only taken on testing scripts, you should not build your SW - check here how to read changes, and here an example of how to read changes and also to prevent your pipeline to build all the stages according to the changes being pushed to git.
In case you still have manual testing taking place, pipelines are pausable which means that you can pause the pipeline asking for approval to proceed. Before approving, testers should do whatever they have to, and whenever they are ready to proceed, just approve the build to proceed for the next steps.
Regarding deployments authorization, it is done the same way that I mention on the last point, with the approvals, but in this case, you can specify which users/roles are allowed to approve that step.
Whatever you need to keep from your builds, Jenkins has an archive artifacts utility. Let me just note that ideally you would look into a proper artefact repository such as Nexus.
To trigger manually a set of tests... You can have a manually triggered job on Jenkins apart from your CI/CD pipeline, that will only execute the automated tests. You can even trigger this same job as one pipeline stage - how to trigger another jobs
Lastly let me say that the branching strategy is the starting point.
Think on your big picture, what SDLC flows you need to have and setup those flows on your multibranch pipeline. Different git branches will facilitate whatever flows you need within the same Jenkinsfile - your pipeline definition. It really depends on how many environments you have to deploy to and what kind of steps you need.
What is a sane development workflow for writing jenkins global pipeline libraries and jenkinsFiles? It's kind of a pain to check in my changes to the global pipeline library and then run a build w/ retry to modify the jenkinsFile, then save the diff if it takes a couple iterations.
Anybody have any recommendations? What do you do?
There is a 3rd-party unit testing framework for Jenkins pipelines: lesfurets/JenkinsPipelineUnit. This also covers shared libraries and allows you to verify the call stack of your pipeline scripts.
Just based on the small amount of context from your question, I can share what I've learned. YMMV.
Source-control the library and make your changes on a branch. You might need another repo to act as a guinea pig to test the new changes - add #branchname to the library declaration in its Jenkinsfile.
Run and test your code on a Jenkins instance running on your local machine. It may not match exactly your production instance, but it's much closer, and therefore faster feedback, to your local. Also, you don't have to push your changes to test - commit locally, test, un-commit, change, commit, test, repeat. When you are done or need to test on the production instance, push.
Use the Script Console as a Groovy scratchpad to test code in. It's missing a lot of plugins/features, but it's great for throwing together some test code to make sure the basics work.
Create small throwaway pipelines to test bits of functionality that you want to iterate quickly on before putting it in a real pipeline, and that won't work in the script console. This lets you focus on the functionality you're building and not worry about the other bits.
Important note: I know of at least one function that doesn't work in the script console, but works fine in a real pipeline: readJSON. It throws an error as if you're doing something wrong but it's just broken in the console. I'm sure there are others.
I'll come back and add more as i think of it.
Folks,
I'm actually having loads of fun since switching to multibranch pipelines in Jenkins which i use in combination with GitLab.
But something i still do not wrap my head around is how to build merge request that originates from a fork - the ones coming from the same remote triggers a build but not the ones from my fork !
I'd be really happy to hear any idea about this.
Thanks a lot community !
Maybe you could try a different approach:
A multibranch pipeline is for continuous build (with only compilation and testing). This is basically a quick feedback on our current work.
Building a MR is altogether another business. It is supposed to provide all necessary elements to the reviewer to determine wether a MR can be merged on the master branch or not. Therefore, it could imply code quality analysis, security analysis, gating, dashboards...
With this respect, it should be separated in different jobs.
Having two seperate jobs for the two functionalities is not only cleaner, but I believe, it would solve your trigger from fork problem.
Although this question specifically involves Gradle and Bamboo, it really is a question about any build system (Ant/Maven/Gradle/etc.) and any CI tool (Bamboo/Jenkins/Hudson/etc.).
I was always under the impression that the purpose of a CI build is to:
Check out code from VCS
Run a buildscript (Gradle, etc.)
Deploy a binary (WAR, etc.) to an environment
Hence, all the guts and heavy-lifting (running automated tests, code analysis, test coverage, compiling, Javadocs, packaging, etc.) was all to be done from inside the buildscript.
But Bamboo seems to allow you to break this heavy-lifting out of the buildscript and into Bamboo itself. In Bamboo, you can add build stages and decompose the stages into tasks. Each task is something just as atomic/fundamental as an Ant task.
So it got me thinking: how much should one empower the CI tool? What typical buildscript functionality should be transferred over to Bambooo/CI? For instance, should I be compiling from a Gradle task, or from a Bamboo task? Same goes for all tasks/stages.
For some reason, I view this as the same problem as to whether or not to use stored procedures or put the data processing all at the application layer. What are the pros/cons of each approach?
TL;DR at the bottom
My experience is with Jenkins, so examples will relate to that.
One thing with any build system (be it CI server or a buildscript), is that it should be stable, simple and self-contained so that an untrained receptionist (with printed instructions and proper credentials) could do it.
Ease of use and re-use
Based on the above, one would think that a buildscript wins. Not always. As with the receptionist example, it's about easy of use and easy of reproducibility.
If a buildscript has interdependent build targets that only work in correct order, dependence on pre-supplied property files that have to be adjusted for the correct branch ahead of build, reliance on environment variables that no-one remembers who created in the first place, and a supply of SCM revision numbers that have to be obtained by looking at the log of the commits for the last month... This is in no way better than a Jenkins job that can be triggered with a single button.
Likewise, a Jenkins workflow could be reliant on multiple dependant jobs, each being manually pre-configured before the build, and need artifacts uploaded from one place to another... which no receptionist will do.
So, at this point, a self-contained good buildscript that only requires ant build command to do everything from beginning to end, is just as good as a Jenkins job that only required build now... button to be pressed.
Self-contained
It is easy to think that since Jenkins will (at some point) end up calling at least a portion of a buildscript (say ant compile), that Jenkins is "compartmentalizing" the buildscript into multiple steps, thus breaking away from being self-contained.
However, instead you should zoom out by one level, and treat the whole Jenkins job configuration as a single XML file (which, by the way, can be stored and versioned through an SCM just like the buildscript)
So, at this point, it doesn't matter if the whole build logic is inside a single buildfile, or a single XML job configuration file. Both can be self-contained when done right.
The devil you know
In majority of cases, it comes down to what you know.
Some people find it easier to use Jenkins UI to visually arrange their build workflow, reporting, emailing, and archiving (and for anything that doesn't fit as wanted, find a plugin). For them, figuring out a build script language is more time consuming then simply trying it in UI.
Others prefer to know exactly what every single line of their build script does, and don't like giving control to some piece of foreign code obfuscated by UI.
Both points have merits from all sides Quality-Time-Budget triangle
The presentation
So far, things have been more or less balanced. However:
My Jenkins will email a detailed HTML report with a link to a job page and send it straight up to the (non tech-savvy) CEO. He can look at the list of latest builds, along with SCM changes for each build, linking him to JIRA issues fixed for each build (all hyperlinks to relevant places). He can select the build with the set of changes that he wants, and click "install iOS package" right off his iPad that he just used to view all this information. Meanwhile I can go to the same job page, and review the build logs and artifacts of each log, check the build time trends and compare the parameters that were used between the failing and succeeding jobs (and I didn't have to write any echos to display that, it's just all there, cause Jenkins does that for you)
With a buildscript, even if you piped the output to a file, would you send that to your (non tech-savvy) CEO? Unlikely. But wait, you know this devil very well. A few quick changes and hacks, couple Red Bulls... and months of thankless work (mostly after-hours) later... you've created a buildscript that will create and start a webserver, prepare HTML reports, collect statistics and history, email all the relevant people, and publish everything on a webpage, just like Jenkins did. (Ohh, if people could only see all the magic you did escaping and sanitizing all that HTML content in a buildscript). But wait... this only works for a single project.
So, a full case of Red Bulls later, you've managed to make it general enough to build any project, and you've created...
Another Jenkins/Bamboo/CI-server
Congratulations. Come up with a name, market it, and make some cash of it, cause this ultimate buildscript just became another CI solution a la Jenkins.
TL;DR:
Provided the CI-server can be configured simply and intuitively so that a receptionist could run the build, and provided the configuration can be self-contained (through whatever storage method the CI-server uses) and versioned in SCM, it all comes down to the Quality-Time-Budget triangle.
If you have little time and budget to learn the CI server, you can still greatly increase the quality (at least of the presentation) by embracing the CI-server's way of organizing stuff.
If you have unlimited time and budget, by all means, make your own Jenkins with the buildscript.
But considering the "unlimited" part is rather unrealistic, I would embrace the CI-server as much as possible. Yes, it's a change. However a little time invested in learning the CI-server and how it compartmentalizes or breaks into tasks the different parts of the build flow, this time spent can go a long way to increasing the quality.
Likewise, if you have no time and/or budget, figuring out the quirks of all the plugins/tasks/etc and how it all comes together will only bring your overall quality down, or even drag the time/budget down with it. In such cases, use the CI-server for bare minimum needed to trigger your existing buildscripts. However, in some cases, the "bare minimum" is no better than not using the CI-server in the first place. And when you are at this place... ask yourself:
Why do you want a CI-server in the first place?
Personally (and with today's tools), I'd take a pragmatic approach. I'd do as much as feasible on the build side (clearly better from an automation perspective), and the rest (e.g. distribution of work across machines) on the CI server. Anything that a developer might want to do on his own machine should definitely be automated on the build level. As to the concrete steps you gave, I'd generally check out code from the CI server, and deploy binaries from the build. I'd try to make every CI job look the same, invoking the build tool in the same way (e.g. gradlew ciBuild).
In Bamboo, you can add build stages and decompose the stages into tasks. Each task is something just as atomic/fundamental as an Ant task.
To some extent, this overlap in functionality is natural, as neither build tool nor CI server can assume existence of the other, and both want to provide as complete a solution as possible.
For some reason, I view this as the same problem as to whether or not to use stored procedures or put the data processing all at the application layer.
It's not an unfair comparison, and hence opinions will be as diverse, contextual, and nuanced.
Disclaimer: I'm a Gradle(ware) developer.
Jenkins says a build succeeded or failed, but can it identify the exact commit (and author!) that caused a build to fail?
This issue would seem to indicate no.
Edit: From my exchange with Pace:
What I see is "include culprits", which is everyone since the last
build. I don't want that. I want THE culprit, with Jenkins doing the
binary search. If Jenkins does two builds 10 commits apart, I don't
want 10 possible culprits, I want it to find the one.
I haven't yet heard how to do that.
That page was talking about the "find bugs" plugin, not the normal build cycle. Depending on how things are setup Jenkins can identify the exact commit and author that caused a failure. If Jenkins has the appropriate source control plugins installed and is configured to know about the repository the build is tied to then for every build it will list the changes since the last build.
In addition, Jenkins has the capability in many of its reporting plugins to blame the faulty committer. It can, for example, send an e-mail notification on a failed build to the developer that made the faulty commit.
However, many setups make it difficult for Jenkins to know. For example, if Jenkins is configured for daily builds then there are likely many commits which could have caused the issue. It's also possible that Jenkins isn't configured to know about the source control repository, or there is no source control repository. All of these issues could cause Jenkins to be unable to identify the build breaker.
Specifically for e-mailing faulty committers you can use the email-ext plugin which has options to send e-mails to everyone that committed since the last successful build.
For a humorous take on this subject check out this approach.
I think what you're asking for is impossible in some cases. Determining who the culprit is requires insight into conflict resolution that only a human can decide. Even still, sometimes a manager has to be involved in order to arbitrate. Say for instance you get 3 commits (A,B,C) that depend on a preexisting definition. However, another commit (D) modifies the behavior of that function. Which do you revert? Perhaps it's the business plan to keep A,B,C as is and return D to its original state. The opposite, modifying A,B,C to adapt to the changes of D, is also possible.
In the cases where a machine can handle the arbitration, it is the responsibility of unit tests, and static analyzers, to determine the culprit (although still imperfect). Static analyzers sometimes have built in features that email the person who committed a violation. Unit tests can be written that notify teams or team members responsible for a failed test. Both could work in the same way that identifies who was the last committer on a particular line that failed. Still, if it is a problem with linking, then perhaps some members should be associated with the particular makefile.