Our Jenkinsfile keeps growing and has reached a point where it’s hard to keep track of everything. We would like to “split” some of the parts into their own subfiles (see “include for make” https://www.gnu.org/software/make/manual/html_node/Include.html for instance).
While looking for a solution, “shared libraries” https://jenkins.io/doc/book/pipeline/shared-libraries/ keep coming up. That’s basically what I want, but it seems to address a different problem (providing common/shared functionality across different Jenkinsfiles) and also adds a lot of complexity (additional git repository). We’re simply looking for a way to decrease the complexity of one big Jenkinsfile by splitting it into multiple smaller ones.
Did I overlook the solution in my research? Or is there currently no way to move tasks/stages/functions out of the Jenkinsfile into separate modules in the same repo?
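For illustration, something along these lines is what I'd like to end up with. This is only a sketch: I don't know whether the built-in `load` step is the intended mechanism for this, and the file names are made up.

    // Jenkinsfile (scripted pipeline) -- file names are made up
    node {
        checkout scm

        // pull helper steps out of a separate file in the same repo;
        // the helper script has to end with `return this`
        def deploy = load 'ci/deploy.groovy'

        stage('Build') {
            sh 'make build'
        }

        stage('Deploy') {
            deploy.toStaging()
        }
    }

    // ci/deploy.groovy -- made-up helper file
    def toStaging() {
        sh 'make deploy ENV=staging'
    }
    return this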
I'm trying to implement a couple of services using Terraform and I am not quite sure how to handle variables efficiently (ideally the proper Terraform way).
Let's say I want to spin up a couple of VMs in a couple of datacenters, one in each, and every datacenter differs slightly (think AWS regions, VPC IDs, security group IDs, etc.).
Currently (in Ansible) I have a dict that contains a dict per region with the configuration specific to that region.
I would like to be able to deploy each datacenter on its own.
I have read through a lot of documentation and I came up with a couple of ways I could use to realise this.
1. use vars-files
have one vars-file per datacenter containing exactly the config for that DC and call terraform with -var-file ${file} (see the sketch after this list)
That doesn't seem very elegant, but I'd rethink it if there were a way to dynamically load the vars-file according to the datacenter name I set.
2. use maps
have loads of maps in an auto-loaded vars-file and reference them by datacenter name.
I've looked at this, and it doesn't look like it will stay readable in the future. It could work if I create separate workspaces per datacenter, but since maps are string -> string only, I can't use lists.
3. use an external source
Somehow that sounds good, but since the docs already label the external data source as an 'escape hatch for exceptional situations' it's probably not what I'm looking for.
4. use modules and vars in .tf-file
Set up a module that does the work, set up one directory per datacenter, and set up one .tf-file per datacenter directory that contains the appropriate variables and uses the module.
Seems the most elegant, but then I don't have one central config but lots of them to keep track of.
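For reference, option 1 would look roughly like this (file, variable and datacenter names are made up):

    # dc-frankfurt.tfvars -- one file like this per datacenter
    region            = "eu-central-1"
    vpc_id            = "vpc-0123456789abcdef0"
    security_group_id = "sg-0123456789abcdef0"

    # pick the datacenter at plan/apply time
    terraform plan  -var-file="dc-frankfurt.tfvars"
    terraform apply -var-file="dc-frankfurt.tfvars"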
Which way is the 'proper' way to tackle this?
To at least provide an answer to anyone else that's got the same problem:
I went ahead with option 4.
That means I've set up modules that take care of orchestrating the services, defaulting all variables I use to reflect the testing-environment (as in: If you don't specify anything extra you're setting up testing, if you want anything else you've got to override the defaults).
Then I set up three 'branches' in my directory tree, testing, staging and production, and added subdirectories for every datacenter/region.
Every region directory contains a main.tf that sources the modules; all but testing also contain a terraform.tfvars that defines the overrides. I also have a backend.tf in each of those directories that defines the backend for state storage and locking.
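To make the layout concrete, here is a minimal sketch. The directory, module and variable names are made up, and it uses current HCL syntax rather than exactly what we started with:

    environments/
    ├── testing/
    │   └── eu-central-1/
    │       ├── main.tf          # sources the module with no arguments,
    │       └── backend.tf       #   relying on the module's testing defaults
    └── production/
        └── eu-central-1/
            ├── main.tf
            ├── terraform.tfvars # the production overrides
            └── backend.tf

    # environments/production/eu-central-1/main.tf
    variable "vpc_id" {}
    variable "security_group_id" {}

    module "service" {
      source            = "../../../modules/service"
      region            = "eu-central-1"
      vpc_id            = var.vpc_id
      security_group_id = var.security_group_id
    }

    # environments/production/eu-central-1/terraform.tfvars
    vpc_id            = "vpc-0aa11bb22cc33dd44"
    security_group_id = "sg-0aa11bb22cc33dd44"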
I initially thought that doing it this way was a bit too complex and that I might be overengineering the problem, but it turned out that this solution is easier to understand and maintain.
I'm having difficulty convincing others in my organization to stop indiscriminately locking files on checkout. Any ideas where I can find an "official" document explaining why a checkout lock should be used sparingly? Microsoft recommends:
"As a best practice, use the Lock type option with discretion and notify your teammates why you are locking an item, and when you plan to remove the lock."
but does not go into any details.
Anything that I could point to would be very helpful.
Although I don't have an official Microsoft source, I'm an MVP in Application Lifecycle Management, so hopefully that's enough to make this compelling. :)
Locking text files (i.e. code) on check-out can be a massive impediment to productivity. I've seen it myself: I was working at a time when a co-worker wasn't, and they had an exclusive lock on a file. All of a sudden, it's thumb-twiddling time. It's even worse when you're trying to troubleshoot or fix a time-critical issue.
The most common reason people want to lock a file for exclusive editing is that they don't want to have to perform a messy merge later on.
That is usually symptomatic of one or more things:
The files being exclusively locked are too big (one file with lots of classes in it, a "god class" that does too many things, etc). The resolution for this problem is to refactor code into smaller, more isolated classes according to the Single Responsibility Principle. Or, if you absolutely must, and you're working in the .NET world, abuse the partial keyword to split the same class up across multiple files, although I want to go on the record and state that every time I see this in a codebase it makes me cry a single tear of infinite sorrow.
The files being exclusively locked are in the midst of major, long-term refactoring. The solution here is to isolate major changes within branches, with frequent merges of changes from the trunk back into the feature branch to keep it up to date.
The person doing the change just doesn't like merges. I can't help you with that one. If you're holding onto code without committing it for a long enough time that a merge is going to be painful, you're not committing your code often enough. If you're not committing your code because it's not done yet, but the change is ongoing and you don't want to interfere with others' work, then you're not using branches properly.
Can there be times when exclusive locks against code files are good and useful? Probably, but I can't think of a problem that it addresses that can't be addressed by using other, more appropriate source control features.
Use Local workspaces if you can, since they don't enforce exclusive locks.
For me, exclusive locks have become useful when I check in changes to *.sln or *.csproj files. Otherwise problems arise when concurrent check-ins are performed, since VS appears to cache these files in memory without saving to disk.
I'm a little bit confused about these two options (writing Dockerfiles vs. committing images). They appear to be related, yet they're not really compatible.
For example, it seems that using Dockerfiles means that you shouldn't really be committing to images, because you should really just track the Dockerfile in git and make changes to that. Then there's no ambiguity about what is authoritative.
However, image commits seem really nice. It's so great that you could just modify a container directly and commit the changes to create another image. I understand that you can even get something like a filesystem diff from an image's commit history. Awesome. But then you shouldn't use Dockerfiles. Otherwise, if you made an image commit, you'd have to go back to your Dockerfile and make some change which represents what you did.
So I'm torn. I love the idea of image commits: that you don't have to represent your image state in a Dockerfile -- you can just track it directly. But I'm uneasy about giving up the idea of some kind of manifest file which gives you a quick overview of what's in an image. It's also disconcerting to see two features in the same software package which seem to be incompatible.
Does anyone have any thoughts on this? Is it considered bad practice to use image commits? Or should I just let go of my attachment to manifest files from my Puppet days? What should I do?
Update:
To all those who think this is an opinion-based question, I'm not so sure. There are some subjective qualities to it, but I think it's mostly an objective question. Furthermore, I believe a good discussion on this topic will be informative.
In the end, I hope that anyone reading this post will come away with a better understanding of how Dockerfiles and image commits relate to each other.
Update - 2017/7/18:
I just recently discovered a legitimate use for image commits. We just set up a CI pipeline at our company and, during one stage of the pipeline, our app tests are run inside of a container. We need to retrieve the coverage results from the exited container after the test runner process has generated them (in the container's file system) and the container has stopped running. We use image commits to do this by committing the stopped container to create a new image and then running commands which display and dump the coverage file to stdout. So it's handy to have this. Apart from this very specific case, we use Dockerfiles to define our environments.
A Dockerfile is a tool for creating images.
The result of running docker build . is an image made of commits, so it's not possible to use a Dockerfile without creating commits. The question is: should you instead update the image by hand each time anything changes, and thus doom yourself to the curse of the golden image?
The curse of the golden image is a terrible curse cast upon people who must continue living with a buggy, security-hole-ridden base image to run their software on, because the person who created it was long ago devoured by the ancient ones (or moved on to a new job) and nobody knows where they got the version of ImageMagick that went into that image, which is the only thing that will link against the C++ module provided by that consultant the boss's son hired three years ago. And anyway it doesn't matter, because even if you figured out where ImageMagick came from, the version of libstdc++ used by the JNI calls in the support tool that intern with the long hair created only exists in an unsupported version of Ubuntu.
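The way out of the curse is to keep the recipe under version control so the image can always be rebuilt from scratch. A minimal, entirely made-up example:

    # Dockerfile -- everything the image contains is spelled out here
    FROM ubuntu:22.04
    RUN apt-get update \
        && apt-get install -y --no-install-recommends imagemagick \
        && rm -rf /var/lib/apt/lists/*
    COPY app/ /app/
    CMD ["/app/run.sh"]

Rebuilding is then just docker build -t myapp:1.0 . and nobody has to remember where ImageMagick came from.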
Knowing both solutions' advantages and drawbacks is a good start, because a mix of the two is probably a valid way to go.
Con: avoid the golden-image dead end:
Using only commits is bad if you lose track of how to rebuild your image. You don't want to be in a state where you can't rebuild the image. That end state is what I'm calling the golden image here: the image becomes your only reference, starting point and ending point at each stage. If you lose it, you're in a lot of trouble, since you can't rebuild it. The fatal dead end is that one day you'll need to rebuild a new one (because all the system libraries are obsolete, for instance), and you'll have no idea what to install... ending in a big loss of time.
As a side note, layering commits on top of commits would probably be nicer if the history log were easily usable (consulting diffs and replaying them on other images) as it is in git: you'll notice that git doesn't have this dilemma.
Pro: slick upgrades to distribute
On the other hand, layering commits has some considerable advantages in terms of distributed upgrades, and thus in bandwidth and deploy time. If you start to handle Docker images the way a baker handles pancakes (which is precisely what Docker permits), or want to deploy test versions instantly, you'll be happier sending just a small update in the form of a small commit rather than a whole new image. This is especially true when you have continuous integration for your customers, where bug fixes should be deployed soon and often.
Try to get the best of both worlds:
In this type of scenario, you'll probably want to tag major versions of your images, and those should come from Dockerfiles. On top of the tagged versions you can provide continuous-integration versions as commits. This balances the advantages and drawbacks of the Dockerfile and layered-commits approaches. The key point here is that you never stop keeping track of your images, by limiting the number of commits you allow on top of them.
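As a sketch of that mixed workflow (image names, tags and the hotfix script are made up):

    # major versions are always rebuilt from a Dockerfile
    docker build -t myapp:2.0 .

    # short-lived CI/test versions are layered on top as small commits
    container=$(docker run -d myapp:2.0 /bin/sh -c "/opt/apply-hotfix.sh")
    docker wait "$container"
    docker commit "$container" myapp:2.0-ci42

    # only the thin extra layer has to travel to the test environment
    # (assumes a registry/namespace is already configured)
    docker push myapp:2.0-ci42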
So I guess it depends on your scenario, and you probably shouldn't try to find a single rule. However, there are some real dead ends you should avoid (such as ending up in a "golden image" scenario) whatever solution you choose.
I have a problem: I have several versions of the same application, but the process of duplicating and managing several duplicate applications is becoming very complex, and each copy gets unique features on client demand.
What methods are used to simplify this process?
Do I need to have detailed documentation about every App?
I'm trying to separate the code into modules and add them according to the client's demands. Am I on the correct path?
Sorry for the bad English; if you have any questions, just ask, I'm always online.
This can be managed in your code revision system. Git and Mercurial allow you to manage code as "change sets". You could have a branch for each client, and have a main branch (trunk) where you add features for everybody. In the client branches, you add feature sets for individual clients. If you want to merge them back to the trunk, you can. You can also merge from the trunk to branches.
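A minimal sketch of that layout in git (branch and client names are made up):

    # a long-lived branch per client, all cut from the trunk
    git checkout -b client-acme main

    # shared features land on main; pull them into the client branch as needed
    git checkout client-acme
    git merge main

    # a client-specific feature that turns out to be useful for everybody
    # can be merged back the other way
    git checkout main
    git merge client-acme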
Of course, it's important to develop in a modular way in order to facilitate this approach. Also, unit tests speed things along when you have to merge.
I am pretty new to TFS and source control, and I am unable to understand the advantage of branching. I can do the same stuff by creating two folders, main and development; when I am done with development, I can merge the code into the main branch using any diff tool.
So what's the point of having branches? I know there must be a huge advantage, but I am unable to see it.
(UPDATE: TFS now supports git for version control so the rest of this answer no longer applies)
I would google branch-per-feature.
The main advantage of branching is that you can work on a feature and not be interrupted by anyone else's work. When you are ready, you can merge and see if many features work well together or not. This is usually done as the feature is developed but for small features can be done once the feature is complete.
The advantage is that you have a clear history of what you did to implement something. Without branches, you would have a whole lot of commits mixed together with other features' commits. If QA does not pass a certain feature, you have your work cut out for you to put together another build using just the commits for the other features. The other alternative is to try and fix your feature so that QA passes. This may not be doable on a Friday afternoon.
Feature toggles are another way to omit unfinished work from a build, but they increase the complexity of the code, and the toggles themselves may have bugs in them. This is something to be very wary of, and it's worth questioning how this became an "acceptable" workaround.
Branches are also used to track changes to multiple released versions. Products that are consumed by multiple customers may be in a situation where one set of customers is using 1.0 of the product while others are already on 2.0. If you support both, you should track changes to each in branches designated for them. The previous points still apply when developing on these branches.
Having said that, TFS is not ideal at branch-per-feature for a number of reasons. The biggest is that it does not support 3-way merges - it only has what is called a baseless merge. The way history is tracked, TFS cannot show you a common ancestor between the feature branch and where you are trying to merge it to. This leaves you potentially solving a lot of conflicts. In general, a lot of people that use TFS shy away from branching for this reason.
3-way merges are great because they will show you what the common ancestor is, what your changes are and what the changes in the other branch are. This will allow you to make a very educated decision on how to resolve a conflict.
If you have to use TFS, I would suggest using git-tfs to be able to take advantage of 3-way merges and many other features. Some of them include: rerere, rebasing, disconnected model, local history, bisect, and many many more.
Rebase is very useful as it allows you to alter a feature to be based off of another starting point, omit commits, squash commits together, split commits, etc. Once ready, you can then merge into an integration or release branch, depending on the workflow you decide upon.
Mercurial is another option that may be easier to use, but it will not be as powerful in the long run.
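For example (branch names are made up):

    # re-base the feature branch onto the current tip of the trunk
    git checkout feature/report-export
    git rebase main

    # or clean it up interactively: squash, reword, drop or reorder commits
    git rebase -i main

    # once it's tidy, merge it into the integration branch
    git checkout integration
    git merge --no-ff feature/report-export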
If you have the opportunity, I would highly recommend moving away from TFS for source control due to a lot of limitations when compared to modern day DVCS.
Here is a nice set of guidelines to follow if you want to effectively manage branching/merging:
http://dymitruk.com/blog/2012/02/05/branch-per-feature/
Hope this helps.
There is a lot of information to read through, but there is TFS Branching Guidance located here if it helps at all - http://tfsbranchingguideiii.codeplex.com/