How do I structure Jobs in Jenkins? - jenkins

I have been tasked with setting up automated deployment and, after some research, settled on Jenkins to get the job done. Prior to this I had approximately zero knowledge of Jenkins, beyond hearing the name. I have no real knowledge of Devops beyond what I have learnt in the last couple of weeks; no formal training, no actual books, just Google searches.
We are not running a full blown/classic CI/CD process; this is a business decision. The basic requirements are:
Source code will be stored in GitHub.
Pull requests must be peer approved.
Pull requests must pass build/unit/db deploy tests.
Commits to specific branches must trigger a deployment to a related specific environment (Production, Staging or Development).
The basic functionality that I am attempting to support covers (what I currently see as) two separate processes:
On creation of a pull request, application is built, unit tests run, and db deploy tested. Status info must be passed to GitHub.
On commit to one of three specific branches (master, staging and dev) the application should be built, and deployed to one of three environments (production, staging and dev).
I have managed to cobble together a pipeline that does the first task rather well. I am using the generic web hook trigger, and manually handling all steps using a declarative pipeline stored in source control. This works rather well so far and, after much hacking, I am quite happy with the shape of it.
I am now starting work on the next bit, automated deployment.
On to my actual question(s).
In short, how do I split this up into Jobs in Jenkins?
To my mind, there are 1, 2 or 4 Jobs to be created:
One Job to Rule them All
This seems sub-optimal to me, as the pipeline will include relatively complex conditional logic and, depending on whether the Job is triggered by a Pull Request or a Commit, different stages will be run. The historical data will be so polluted as to be near useless.
OR
One job for handling pull requests
One job for handling commits
Historical data for deployments across all environments will be intermixed. I am a little concerned that I will end up with >1 Jenkinsfile in my repository. Although I see no technical reason why I can't have >1 Jenkinsfile, every example I see uses a single file. Is it OK to have >1 Jenkinsfile (Jenkinsfile_Test & Jenkinsfile_Deploy) in the repository?
OR
One job for handling pull requests
One job for handling commits to Development
One job for handling commits to Staging
One job for handling commits to Production
This seems to have some benefit over the previous option, because historical data for deployments into each environment will not be cross polluting each other. But now we're well over the >1 Jenkinsfile (perceived) limit, and I will end up with (Jenkinsfile_Test, Jenkinsfile_Deploy_Development, Jenkinsfile_Deploy_Staging and Jenkinsfile_Deploy_Production). This method also brings either extra complexity (common code in a shared library) or copy/paste code reuse, which I certainly want to avoid.
My primary objective is for this to be maintainable by someone other than myself, because Bus Factor. A real Devops/Jenkins person will have to update/manage all of this one day, and I would strongly prefer them not to suffer from my ignorance.
I have done countless searches, but I haven't found anything that provides the direction I need here. Searches for best practices make no mention on handling >1 Jenkinsfile, instead focusing on the contents of a single pipeline.

After further research, I have found an answer to my core question. This might not be the absolute correct answer, but it makes sense to me, and serves my needs.
While it is technically possible to have >1 Jenkinsfile in a project, that does not appear to align with best practices.
The best practice appears to be to create a separate repository for each Jenkinsfile, which maps 1:1 with a Job in Jenkins.
To support my specific use case I have removed the Jenkinsfile from my main source code repository. I then create 4 new repositories:
Project_Jenkinsfile_Test
Project_Jenkinsfile_Deploy_Development
Project_Jenkinsfile_Deploy_Staging
Project_Jenkinsfile_Deploy_Production
Each repository contains a single Jenkinsfile and a readme.md that, in theory, contains useful information.
This separation gives me a nice view of the historical success/failure of the Test runs as a whole, and Deployments to each environment separately.
It is highly likely that I will later create a fifth repository:
Project_Jenkinsfile_Deploy_SharedLibrary
This last repository would contain pipeline code that is shared amongst the four 'core' pipelines. Once I have the 'core' pipelines up and running properly, I will consider refactoring what I can into this shared library.
I will not accept my own answer at this point, in the hope that more answers are forthcoming.

Here's a proposal I would try for your requirements based on the experience at my last job.
Job1: builds and runs unit tests on every commit on master or whatever your main dev branch is (checks every 20 minutes or whatever suits you); this job usually finds compile and unit test issues very fast
Job2 (optional): run integration tests and various static code checks (e.g. clang-tidy, valgrind, cppcheck, etc.) every night, if the last run of Job1 was successful; this job finds usually lots of things, but probably takes lots of time, so let it run only at night
Job3: builds and tests every pull request for release branches; so you get some info in your pull requests, if its mature enough to be merged into the release branches
Job4: deploys to the appropriate environment on every commit on a release branch; on dev and staging you could probably trigger some more tests, if you have them
So Job1, Job2 and Job3 should run all the time. If pull requests to your release branches are approved by QA (i.e. reviews OK and tests successful) and merged to release branches, the deployment is done by Job4 automatically.
It depends on your requirements and your dev process, if you want to trigger Job4 only manually instead.

Related

CI strategy for embedded development - on-hardware or manual-intervention-required tests

We have a bunch of tests and are implementing CI according to git flow, using Jenkins.
Some of these tests require hardware. However, some of those tests can take 4+ hours (or even 24+ hours) to run, and require hardware that we only have 1 or 2 copies of. Some also need to be run at night.
Furthermore, a minority of tests require some limited manual intervention every few hours to swap a chip out.
I know that a common strategy is to make a test slave for the hardware tests. However, if the job takes a day or more, every time something is pushed to a pull request, that will be prohibitively expensive.
Is there a common solution to this problem? Is GitHub Flow possible within these constraints, or am I going to require release branches, and the understanding that master is not guaranteed to be release-ready at any point since it won't have these tests run?
Is there some way to trigger a specific job through GitHub to launch these expensive jobs, so that they are run only if required?
reviewing your branching strategy can be part of the solution.
I would also review my test strategy, I would execute a small set of fully automated and quick tests on master for instance and the whole batch of tests on release branch.

Complex/Orchestrated CD with AWS CodePipeline or others

Building a AWS serverless solution (lambda, s3, cloudformation etc) I need an automated build solution. The application should be stored in a Git repository (pref. Bitbucket or Codecommit). I looked at BitBucket pipelines, AWS CodePipeline, CodeDeploy , hosted CI/CD solutions but it seems that all of these do something static as in receiving a dumb signal that something changed to rebuild the whole environment.... like it is 1 app, not a distributed application.
I want to define ordered steps of what to do depending on the filetype per change.
E.g.
1. every updated .js file containing lambda code should first be used to update the existing lambda
2. after that, every new or changed cloudformation file/stack shoud be used to update or create existing ones, there may be a needed order (importing values from each other)
3. after that, code for new lambda's in .js files should be used to update the created lambda's (prev step) code.
Non updated resources should NOT be updated or recreated!
It seems that my pipelines should be ordered AND have the ability to filter input (e.g. only .js files from a certain path) and receive as input also what the name of the changed resource(s) is(are).
I dont seem to find this functionality withing AWS or hosted git solutions like BitBucket or CI/CD pipelines like CircleCI or Codeship, aws CodePipeline, CodeDeploy etc.
How come? Doesn't anyone need this? Seems like a basic requirement in my eyes....
I looked again at available AWS tooling and got to the following conclusion:
When coupling CodePipeline to CodeCommit repositry, every commit puts a whole package of the repositry on S3 as input for CodeCommit. So not only the changes but everything.
In CodePipeline there is the orchestration functionality i was looking for. You can have actions for every component like create-change-set for SAM component and execute-chage-set etc and have control over the order of all.
But:
Since all code is given as input I assume all actions in CodeCommit will be triggered even for a small change in code which does not affect 99% of the resources. Underwater SAM or CF will determine themself what did or did not change. But it is not very efficient. See my post here.
I cannot see in the pipeline overview which one was run the last time and its status...
I cannot temporary disable a pipeline or trigger with custom input
In the end I think to make a main pipeline with custom lambda code determining what actually changed using CodeCommit API and splitting all actions in sub pipelines. From the main pipeline I will push their needed input to S3 and execute them.
(i'm not allow to comment, so i'll try and provide an answer instead - probably not the one you were hoping for :) )
There is definitely a need and at Codeship we're looking into how best to support FaaS/Serverless workflows. It's been a bit of a moving target over the last years, but more common practices etc. are starting to emerge/mature to a point where it makes more sense to start codifying them.
For now, it seems most people working in this space have resorted to scripting (either the Serverless framework, or directly against the FaaS providers) but everyone's struggling with the issue of just deploying what's changes vs. deploying everything as you point to. Adding further complexity with sequencing is obviously just making things harder.
Most services (Codeship included) will allow you some form of sequenced/stepped approach to deploying, but you'll have to do all the heavy lifting of working out what has changed etc.
As to your question of How come? i think it's purely down to how fast the tooling has been changing lately combined with how few are really doing it. There's a huge push for larger companies to move to K8s and i think they've basically just drowned out the FaaS adopters. Not that it should be like that, or that we at Codeship don't want to change that; it's just how i personally see things.

Multiple pipeline jobs versus single large pipeline job

I am fairly new to Jenkins pipeline and am considering migrating an existing Jenkins batch to use pipeline script.
This may be an obvious question to those in the know but I have not been able to find any discussion of it anywhere. If you have a fairly complex set of jobs, say a few hundred, is it best practice to end up with one job with a fairly large script or a small number of jobs, probably parameterized, say 5 to 10, with smaller pipeline scripts that call each other.
Having one huge job has the severe disadvantage that you cannot easily execute the single stages anymore. On the other hand, splitting everything into different jobs has the disadvantage that many of the nice pipeline features (shared variables, shared code) cannot be used anymore. I do not think that there is a unique answer to this.
Have a look at the following two related questions:
Jenkins Build Pipeline - Restart At Stage
Run Parts of a Pipeline as Separate Job

Custom Jenkins scheduler

We're seeing a problem with Jenkins and the scheduling of builds. Specifically, we trigger Jenkins to build a pipeline of work with every push to every branch of our git repo. On its own, the whole pipeline can take from 10 to 20 minutes to build. This can cause a problem if multiple pushes to a branch happened faster than the builds are completing. Multiplied by the twenty or thirty branches that are in development.
So, I'd like to be able to automatically deprioritise any scheduled builds on Jenkins if they are triggered on a Git commit sha that is no longer the tip of its branch. This is just one example of a factor that might indicate a desired priority. Others would be that branches with open pull requests should have higher priority than those without; or manual input in order to prioritise a PR or branch that needs feedback immediately.
Is there anyway to programmatically interact with the Queue of jobs on Jenkins and reorder it?
There is the Priority Sorter Plugin, but as far as I know this assigns each build a static priority. I would like to dynamically reprioritise items in the queue based on external info (e.g. from git).
I've found reference to two other plugins whose names indicate that they might do what I want, but I can't find any meaningful documentation on them. The former doesn't provide the options it claims to, and the latter doesn't even exist in the plugins repository. Neither seems to be maintained.
My alternatives seem to be
write my own implementation of hudson.model.Queue, which seems like overkill
maintain a separate queueing service that triggers individual jobs on Jenkins, in which case what is Jenkins even for?
Am I missing something obvious? I can't be the only person who wants more fine-grained control of Jenkins build ordering.

When do you have to make a new job in Jenkins

So I want to make a build-test-deploy environment in Jenkins:
I want to do:
- a build
- a karma test
- a protractor test
- a deploy
Now a very simple but important question: Do I have to do everything in one job (what's possible) or do I have to split it up in several jobs (and cd (going) to the build directory?). So it isn't clear when I have to make a new job.
It is really a matter of taste and your exact needs.
If you do not plan running build steps individually time after time (that is, if you only care about the build as an atomic piece), or if your build flow is simple and linear, it would make more sense to stick to a single job - this way you will keep all the configuration in one place and have a good overview of build results.
If, however, there are different paths that the build process may take, or the steps themselves involve more complex logic, or, for instance, there is a need for collecting statistics on each of them, then it might be more beneficial to extract some of the steps to separate jobs and chain them together according to your rules. Jenkins is super-flexible and does not enforce any particular approach upon you.

Resources