TFS 2015 (and Octopus): Continuous integration, but wait a bit

When a pull request is completed, our (TFS 2015/Octopus-based) build system is set to do a build and deploy. The problem is that we typically have a bunch of pull requests queued up, and approving each of them triggers a build and deployment, with unnecessary packages being created and saved and emails going out to QA that a deployment is ready. Not a critical problem perhaps, but an annoyance to be sure.
We are using vNext build definitions. I have "Batch changes" enabled, but it isn't enough: the builds take less than a minute, while reviewing and approving a pull request can take anywhere from 1 to 30 minutes. What I would like to do is have continuous integration, but wait, say, 15 minutes after the first merge to see if any other changes are coming.
Alternatively, a scheduled build every hour, but ONLY if something has changed, would suffice.
Alternatively, building every time but having Octopus wait a bit before deploying would work too.
Aside from writing my own Windows service that uses the TFS REST API to trigger builds every x minutes only if something has changed, I'm not seeing a good solution. I've also thought about saving the build packages off somewhere and writing a service that sends them to Octopus only if no new packages have arrived in x minutes.
Does anyone have something like this working?
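For reference, the "trigger a build only if something has changed" service I mention boils down to something like the sketch below against the TFS 2015 build REST API. This is only an illustration: the collection URL, project, repository, branch and definition id 42 are placeholders, the every-x-minutes scheduling is left to the hosting service or Task Scheduler, and the exact query parameters should be double-checked against the TFS 2015 REST API reference.
// Rough sketch of the "queue a build only when something changed" poller described above.
// Collection URL, project, repository name and build definition id (42) are placeholders.
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class BuildPoller
{
    static async Task Main()
    {
        var baseUrl = "http://tfs:8080/tfs/DefaultCollection/MyProject";

        // Uses the caller's Windows credentials against on-prem TFS.
        using var client = new HttpClient(new HttpClientHandler { UseDefaultCredentials = true });

        // Latest vNext build of the definition (api-version 2.0 corresponds to TFS 2015).
        using var builds = JsonDocument.Parse(await client.GetStringAsync(
            $"{baseUrl}/_apis/build/builds?definitions=42&$top=1&api-version=2.0"));
        var buildValues = builds.RootElement.GetProperty("value");
        var lastBuiltCommit = buildValues.GetArrayLength() > 0
            ? buildValues[0].GetProperty("sourceVersion").GetString()
            : null;

        // Latest commit on the branch that the definition builds.
        using var commits = JsonDocument.Parse(await client.GetStringAsync(
            $"{baseUrl}/_apis/git/repositories/MyRepo/commits?branch=master&$top=1&api-version=1.0"));
        var commitValues = commits.RootElement.GetProperty("value");
        var latestCommit = commitValues.GetArrayLength() > 0
            ? commitValues[0].GetProperty("commitId").GetString()
            : null;

        // Queue a new build only if a commit has arrived since the last build.
        if (latestCommit != null && latestCommit != lastBuiltCommit)
        {
            var body = new StringContent("{ \"definition\": { \"id\": 42 } }", Encoding.UTF8, "application/json");
            await client.PostAsync($"{baseUrl}/_apis/build/builds?api-version=2.0", body);
        }
    }
}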

At the places where I have implemented this, I have argued for letting the QA teams decide when (or even if) to deploy. I usually implement two environments:
DEV Integration: Every continuous build gets deployed here. If deployment is successful, it becomes available to the QA Team (e-mail, build quality, etc.)
QA Environment: Where the QA Team performs its testing. The QA team chooses which build to deploy, and when.
You can set up permissions in Octopus Deploy to allow the QA team to "promote" a build from the DEV environment to the QA environment. Not sure if you have the ability to set up a second environment like that, but if you can, it will give you a lot of flexibility.

I would keep building on every checkin so that the developers have a quick feedback cycle.
To solve your problem, you could write a PowerShell script or a small app that uses the Octopus API to query what the latest release is, check whether it has been deployed to QA, and if not, deploy it. You can then schedule that using the Windows Task Scheduler.
Sample code using the Octopus C# client library (not tested; the server URL and API key below are placeholders you would replace with your own):
using System.Linq;
using Octopus.Client;
using Octopus.Client.Model;

var endpoint = new OctopusServerEndpoint("https://your-octopus-server", "API-XXXXXXXX");
var _repository = new OctopusRepository(endpoint);
var project = _repository.Projects.FindByName("MyProject");
var qaEnv = _repository.Environments.FindByName("QA");
var release = _repository.Projects.GetReleases(project).Items.First();
// GetDeployments only returns the first page (30 items), so use paging for busy projects.
var isDeployed = _repository.Releases.GetDeployments(release).Items.Any(i => i.EnvironmentId == qaEnv.Id);
if (!isDeployed)
{
    _repository.Deployments.Create(new DeploymentResource
    {
        ProjectId = project.Id,
        ReleaseId = release.Id,
        EnvironmentId = qaEnv.Id
    });
}


How to organize deployment process of a product from devs to testers?

We're a small team of 4 developers and 2 testers, and I'm the team lead. Developers each do their tasks in a separate branch. Our stack is ASP.NET MVC, ASP.NET Core, Entity Framework 6, MSSQL, IIS, Windows Server. We also use Bitbucket and Jira to store code and manage issues.
For example, there is a task "add an about window". A developer creates a branch named "add-about-window" and puts all the code there. Once the task is done, I do a code review, and if everything looks good, I merge the branch into an accumulating branch, let's call it "main". As a next step, I then manually deploy the updated "main" branch to a test server with IIS and MSSQL installed. Once done, I notify the testers to test the freshly uploaded app to make sure "add about window" is done correctly and works well. If the testers find a bug, I have to revert the task branch merge from "main" and tell the developer to fix the bug in the task's branch. Once the developer has fixed it, I merge the branch into "main" again and ask the testers to check again. In the end the task branch gets deleted.
This is really inconvenient, time-consuming and frustrating. I have heard about git flow (maybe it is kind of what we have now).
Ideally, I would like the process to be like this:
Each developer still does their work in a separate branch.
Once a task is done and all the task's code is in the task branch, I do a code review.
Once the code review is done and all the issues found are fixed, I just click "deploy".
There is a Docker image which contains IIS, MSSQL and Windows. It also contains some base version of the application we work on, fully tested and stable; say, its state as of some date, like the start of the year.
The Docker image is taken and a new container starts.
This Docker container gets fully initialized and then the code from a branch gets installed on the running container.
This container then gets its own domain name, like "proj-100.branches.ourcompany.com" ("proj-100" is the task's ID in Jira), which the testers can go to and test.
This would definitely decrease the time I spend on deployment and also make the process more convenient and comfortable.
Can someone recommend some resources where I can learn about similar deployment models, or share some info on this? Any info will be very much appreciated.
Regardless of your stack, and before talking about solutions: what you describe is the basic use case of any CI/CD process. All the exhausting manual steps you described can be handled by any CI tool.
Now, let's consider what you already have and talk through the steps of your desired solution. You're using Bitbucket, which already gives you at least steps 1 and 2: merging only approved PRs into master/main.
Step 3 is where the CI automation starts: you define a webhook on certain actions in the Bitbucket repo, which triggers a CI job/pipeline (a Jenkins server, GitLab CI, or many other CI solutions). This way you won't even need a "deploy" button, since the merge action can trigger the job, which can automatically run unit tests, integration tests (if you define them), build artifacts/Docker images, and finally deploy.
Step 4 needs some basic understanding of how Docker containers are designed: a Docker image is not a VM. It has its own use cases and relevant scenarios, and more importantly, an advised architectural guideline to follow.
To keep it short, I'll only mention the principle of separation: each service should run in a separate container. That allows scaling up, easier debugging, and much more. In other words, what you need is not one Docker image that contains your entire system, but an orchestration of containers, each containing an independent software unit with a clear responsibility. This is where Kubernetes comes into play.
Back to the CI job: after the PR merge, the job starts, running the predefined unit tests, building the container image, and uploading it to your registry.
Moving on to CD: depending on your process, once the updated and tested Docker images are in your registry (Artifactory, the GitLab registry, a Docker registry...), the CD job can take whichever images it needs and deploy them to your Kubernetes cluster. And that's it, the process is done.
A word of advice: if you don't have a professional DevOps team or a good understanding of Docker, CI/CD, and Kubernetes, and your dev team is small (which unfortunately seems to be the case), you may want to consider hiring a DevOps company to build the DevOps/CI/CD infrastructure for you, preferably as a fully managed solution with a handover at the end. Everything I wrote here is just a guideline and the basic points, to give you the big picture. Good luck!
All the other answers here are great; still, I would like to add my piece of advice.
Recently I was working on a product with a team of three. It was a Node.js project. If you are on AWS, you can use AWS CodePipeline: it detects pushes to a specific GitHub branch and deploys the changes to the server. The pipeline service has a build stage too, and you can also configure Slack notifications.
But you should have at least two environments, production and dev, so you can check that a deployment works properly on dev first.
AWS also has services like AWS CodeCommit and AWS CodeDeploy.
This is just a basic solution. You don't actually need fancy software to set up CI/CD.
This kind of setup is usually supported by a CI/CD tool coupled with Kubernetes.
Either an on-premise one, like Jenkins plus Kubernetes, or its Jenkins Kubernetes plugin, which runs dynamic agents in a Kubernetes cluster.
You can see an example in "How to Setup Jenkins Build Agents on Kubernetes Pods" by Bibin Wilson.
Or a cloud one, like a Bitbucket pipeline deploying a containerized application to Kubernetes.
In both instances, the idea remains the same: create an ephemeral execution environment (a Docker container with the right components in it) for each pushed branch, in order to execute tests.
That way, said tests can take place before any merge between a feature branch and an integration branch like main.
If the tests pass, Jenkins itself could trigger an automatic merge (assuming the feature branch was first rebased on top of the target branch, main in your case).
We have a similar process in our team.
We use GitLab CI.
Since there is some infrastructure outside of Docker (nginx with DNS for the test stands),
we just create dev1, dev2, ... stands (5 stands for a team of 10 developers and more than 6 microservices). For each devX stand and each microservice we have a "deploy to devX" button in our CI/CD. We simply reserve devX in Slack for feature Y for the duration of testing after a deploy. When the tests are done and the bugs are fixed, we merge to the main branch, and another feature branch can be deployed and tested on the devX stand.
To quote your own description:
"As a next step, I then manually deploy the updated 'main' branch to a test server with IIS and MSSQL installed. Once done, I notify the testers to test the freshly uploaded app to make sure 'add about window' is done correctly and works well. If the testers find a bug, I have to revert the task branch merge from 'main' and tell the developer to fix the bug in the task's branch."
If there were multiple environments, the devs could deploy to them themselves. Even a single "dev" environment they can deploy to would help greatly. The devs should be able to deploy and notify the testers without going through you.
That the deploy is "manual" is suspicious. How manual? Ideally it should just be a few button clicks. Sometimes you can even have it so that pushing to a branch does a deploy (through webhooks).
You should be able to deploy from branches besides main. What that means or looks like can vary a lot, but the point is that if you're forcing everything through main and having to revert when it doesn't work, you're creating a lot of unneeded work. Ideally there should be some way to test locally. If there really can't be, then you at least need to allow a way to deploy from any branch (or something like force-pushing to a branch called "dev").
From another angle, unless the application gets horribly broken, you don't necessarily need to roll back changes unless a release is coming soon. You can just have them fixed in another pull request.
All in all, the main problem here sounds like there's only a single environment for testing, the process to deploy to it is far too manual, and the devs have no way to deploy to it themselves. This sort of thing is a massive bottleneck. Having a burdensome process to even begin testing things takes a big toll on everyone's morale, which can be worse than the loss in velocity. You don't necessarily need every dev to be able to spin up as many environments as they want at the push of a button, but devs do need some autonomy to be able to test.
Having the application run in Docker containers can greatly help with running it locally as well as making the deployment process simpler. I've tried to stay away from specific product suggestions because this sounds like more of a process problem.

Rollback to last working prod deployment if post deployment health checks fail

Is there a way to configure Jenkins, or any other CI/CD tool, to automatically roll back to the last working deployment if there are any issues post-deployment?
E.g., say I expect a minimum number of page views every hour. If, after a deployment, the page views drop to zero or a very low number, can the CI/CD tool be notified (or monitor this itself) and automatically roll back to the previous working version (and notify the developers)?
The idea is to catch issues that might not be covered by the current tests (or were missed).
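To illustrate, the post-deployment check itself could be as simple as the sketch below (the metrics endpoint and threshold are made up for the example); the question is how to get the CI/CD tool to run it right after the deployment and roll back when it fails.
// Hypothetical post-deployment health check: fails the pipeline stage when traffic looks wrong,
// so a subsequent rollback job (redeploying the last known-good version) can be triggered.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class PostDeploymentCheck
{
    static async Task<int> Main()
    {
        const int minPageViewsPerHour = 100;   // expected baseline; tune to your real traffic

        using var client = new HttpClient();
        // Made-up monitoring endpoint that returns the page-view count for the last hour as plain text.
        var response = await client.GetStringAsync("https://metrics.example.com/pageviews/last-hour");
        var pageViews = int.Parse(response.Trim());

        if (pageViews < minPageViewsPerHour)
        {
            Console.Error.WriteLine($"Health check failed: only {pageViews} page views in the last hour.");
            return 1;   // non-zero exit code -> the CI/CD tool marks the stage failed and can run a rollback step
        }

        Console.WriteLine($"Health check passed: {pageViews} page views in the last hour.");
        return 0;
    }
}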
We support this via Reliza Hub (a new tool I'm working on). Currently we have working integrations with GitHub Actions and GitLab CI. CircleCI will come next, then Azure DevOps; Jenkins will come too, but later.
Reliza Hub has the concept of an approval matrix, so that a build only goes to a specific environment if you assign it a specific set of approvals. If you remove approvals, it essentially means that a previously approved build is now the latest for that environment. This does not replicate existing CI/CD functionality; instead it provides a routing mechanism: Reliza Hub keeps metadata about which build goes where, serves as a point of truth for CI/CD tools, and notifies those tools when changes occur.
For example, here is how our GitHub Actions / GitLab CI integration may look:
When builds happen, you poll Reliza Hub for version details and then notify it about new builds so the metadata is stored there; sample: https://github.com/relizaio/reliza-hub-integrations/tree/master/GitLab
As discussed above, the approval matrix in Reliza Hub is configured so that only releases that have obtained specific approvals are allowed into production.
You configure the project in Reliza Hub to trigger CI builds on specific actions, or on all release-create, release-update and approval/disapproval events.
During the CD process, Reliza's mechanism for getting the latest release metadata is used, either directly or via the templating functionality. Sample using templating: https://gitlab.com/taleodor/sample-helm-cd; a larger write-up: https://itnext.io/building-kubernetes-cicd-pipeline-with-github-actions-argocd-and-reliza-hub-e7120b9be870 Note: you may have different triggers on different events, or a combination of triggers.
You can configure programmatic approvals or disapprovals via the Reliza Go Client: https://github.com/relizaio/relizaGoClient#8-use-case-programmatic-approvals-of-releases-on-reliza-hub
Essentially, for your use case: on failing production tests you would programmatically send a disapproval event to Reliza Hub, which in turn would trigger the corresponding CI build and provide the previous stable release details to that CI system, resulting in the expected rollback to the last stable state.
If all of the above sounds reasonable and you're OK working with GitHub Actions or GitLab CI (or, a little later, CircleCI, with other CIs to follow), feel free to contact me at https://twitter.com/taleodor and I will help you configure this.

Jenkins Promoted build

I don't understand what a promoted build really is and how it works. Can someone please explain it to me like I'm a 10-year-old kid? If you can provide some sample examples, that would help me a lot.
Thanks
In a typical software development organization with a CI system, there are tens or hundreds of continuous builds daily. Only one of those builds (usually the latest stable one) is selected and "promoted" to be a Release Candidate (RC), which goes to the next quality gate, usually the QA department. Then they select one of those RCs (the others are dropped) and again "promote" it to the next level: a staging environment, validation, etc. Finally, one of these builds is again "promoted" to be an official release.
Why is that important?
Visibility: You would want to distinguish many "regular", continuous builds from few, selected "RC" builds.
Retention: If you commit often (which is the best practice), you will likely get a lot of daily builds, and will want to implement a retention policy (e.g. only keep the last 100 builds, or only builds from the last 7 days). You will then want to make sure promoted builds (RCs) are locked against retention. This is mostly important if you deploy binaries to customers and may need the exact binary to reproduce an escaped bug in the future (even though you still have the source code in the repository, I've seen cases where escaped bugs relate to the build process rather than the source code, due to rapid changes to the build process, or time-of-build-sensitive data like digital signatures).
Permissions: you may want to prevent access to builds with "half baked" features from non-developers.
Binary Repositories: you may want to publish only meaningful builds to an external binary repository.
Builds in Jenkins can be "promoted" either manually or automatically, using plugins like Promoted Builds Plugin. You can also create your entire "promotion" workflow using pipeline scripts. Here's an example:
a "Continuous" job that polls SCM and builds on every change. It has a retention policy to keep only the last 50 builds. Access is restricted only to developers;
a "Release Candidates" job that copies artifacts from a manually selected build (using parameters). Access is allowed to QA testers;
a "Releases" jobs that copies artifacts from a manually selected RC. Access is allowed to the entire organization. Binaries are released to external/public repository.
I hope this answers your question :-)

When should I "Release" my builds?

We just started using Visual Studio Release Management for one of our projects, and we're already having some problems with how we are doing things.
For now, we've created a single release stage, which is responsible for deploying our build artifacts to a dedicated virtual machine for testing. We intend to use this machine to run our integration tests later on.
Right now, we have a gated checkin build process: each checkin runs all the unit tests, and we configured the release trigger to fire on this build as well. At first it seemed reasonable that, after each checkin, the project was deployed and the integration tests were executed. But we noticed that all the released builds were polluting the console in Release Management, that all builds were being marked as "Retain Indefinitely", and that our drop folder location was growing fast (in hindsight it makes sense that the tool does this automatically, since any build could be promoted to another stage and its artifacts need to be persisted).
The question then is: what are we doing wrong? I've been thinking about this and it really does not make any sense to "release" every checkin. We should probably be starting this release process when a sprint ends, a point that can be considered a "release candidate".
If we do that though, how and when would we run our automated integration tests? I mean, a deployment process is required for running those in our case, and if we try to use other means to achieve that (like the LabTemplate build process) we will end up duplicating deployment code.
What is the best approach here?
It's tough to say without being inside your organization and looking at how you do things, but I'll take a stab.
First, I generally avoid gated checkin builds unless there's a frequent problem with broken builds. If broken builds aren't a pain point, don't use gated checkin. Why? Simple: If your build/test process takes 10 minutes to run, that's 10 minutes that I have to wait to know whether I can keep working, or if I'm going to get my changes kicked back out at me. It discourages small, frequent checkins and encourages giant, contextless checkins.
It's also 10 minutes that Developer B has to wait to grab Developer A's latest changes. If Developer B needs that checkin to keep working, that's wasted time. Trust your CI process to catch a broken build and your developers to take responsibility and fix them on the rare occasions when they occur.
It's more appropriate (depending on your branching strategy) to do a gated checkin against your trunk, and then CI builds against your dev/feature branches. Of course, that opens up the whole "how do I build once/deploy many when I have multiple branches?" can of worms. :)
If your integration tests are slow and require a deployment to succeed, they're probably not good candidates to run as part of CI. Have a CI/gated checkin build that just:
Builds
Runs fast unit tests
Runs high-priority, non-deployment-based integration tests
Then, have a second build (either scheduled, or rolling) that actually deploys and runs the whole test suite. You can schedule it according to your tastes -- I usually go with one at noon (or whatever passes for "lunch break" among the team), and one at midnight. That way you get a tested build from the morning's work, and one from the afternoon's work.
Using the Release Default Template, you can target your scheduled builds to just go as far as your "dev" (/test/integration/whatever you call it) stage. When you're ready to actually release a build, you can kick off a new release using that specific build that targets Production and let it go through all your stages normally.
Don't get tripped up on the 'Release' word. In MS Release Management (RM), creating a Release does not necessarily mean you will have this code delivered to your customers / not even that it has the quality to move out of dev. It only means you are putting a version of the code on your Release Path. This version/release can stop right in the first stage and that is ok.
Let's say you have a Release Path consisting of Dev, QA, Prod. In the course of a month, you may end up releasing 100 times in Dev, but only 5 times in QA and once in Prod.
You should drive to get each check-in deployed and integration tested. If the tests take a long time, do only the minimum during the (gated or not) check-in build (for example, unit tests + deployment), and run the rest in the second stage of your Release Path (which should be triggered automatically after the first stage completes). It does not matter if the second stage takes a long time. As a dev, you check in, and once the build (and the first stage) completes successfully, you expect the rest to go smoothly and move on to your next task. (Note that only the result of the first stage impacts your TFS build.)
Most of the time, the deployment and the rest will run fine, so there is no impact on the dev. Every now and then there will be a failure in the first stage; then the dev interrupts their new work and gets a resolution ASAP.
As for the issue that every build is kept indefinitely: for the time being, that is a side effect of RM. Current customers need to do the cleanup manually (or script it). In coming releases, a new retention policy for releases/builds will be put in place to improve this. This has not been worked on yet, but the intention would be to, for example, instruct RM to keep all releases that went to Prod, keep only the last 5 that went to QA, and keep only the last 2 that went to Dev.
This is not a simple question, so the answer needs some articulation.
First of all, you will never keep all of your builds; the older a build, the less interesting it is to anyone; a build that doesn't get deployed to production is overtaken by builds that do reach that stage.
A team must agree on the criteria that make a build worth keeping around and on how long to keep it. Define a policy for builds shipped to production or customers: how long do you support them? Until the next release, until the one after that, for five years? Potentially shippable builds that are not yet in your customers' hands are superseded by newer ones, so you can use a numeric or a temporal criterion (TFS implements only the first, as the second is more error-prone). Often you have more than one shippable build, when you want a safety-net option and the ability to select from a pool which one to deliver (the one with the more manageable bugs).
TFS's "Retain Indefinitely" should be used when you cannot automate the criteria above and switch to a manually implemented policy instead. "Indefinitely" is not forever; it means for an unknown time interval.

TFS Build Dependencies

In TeamCity, you can create build dependencies where one build will not start until another finishes successfully. Is that possible with TFS 2012? Where can I find more information about how to set that up?
The short answer is that TFS doesn't have equivalent functionality, but you can achieve the same goals with a little work.
A common scenario I encounter is a team that wants a build on check-in that does some quick stuff (compile, fast unit tests), then immediately afterwards another build that runs slower stuff (integration tests, test deployments, etc.). I do this often with my teams: I'll set up a Gated Build that runs in, say, 5 minutes, then have a CI build, which may take an hour to run, kicked off as soon as the Gated Build checks in. I like this approach as it gets the developers some feedback quickly, then more detailed feedback shortly thereafter.
Another supported scenario is having a build explicitly kick off its dependencies. If you look at the Lab Build Template, it does exactly this: it first kicks off the application TFS build, the Lab Build sits and waits for it to finish, and then the Lab Build continues. In theory you could have Build A kick off Build B, which kicks off C and D, etc.
Your needs may be more complex than that (e.g. multiple applications, each with its own build; then a Product that includes several applications and needs to be built after any of them changes; then maybe a Product Suite build that needs to kick off whenever a Product changes; this is the scenario I dealt with). I basically implemented a custom build dependency system to handle this: we made an XML file that described the build dependencies, then wrote a simple ISubscriber plug-in that we deployed to TFS; it would listen for build-completed events, consult the dependency config, and kick off the appropriate build(s).
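The queueing step itself is small. For illustration, here is a minimal sketch using the TFS client object model (the collection URL, team project and definition name are placeholders; the ISubscriber event wiring and the XML dependency-config parsing are omitted, and a real server-side plug-in would use the server object model rather than the client API, but the idea is the same):
// Minimal sketch: queue a dependent build via the TFS client object model.
// Collection URL, team project and definition name are placeholders.
using System;
using Microsoft.TeamFoundation.Build.Client;
using Microsoft.TeamFoundation.Client;

class DependentBuildTrigger
{
    static void QueueDependentBuild(string teamProject, string definitionName)
    {
        var collection = new TfsTeamProjectCollection(new Uri("http://tfs:8080/tfs/DefaultCollection"));
        var buildServer = collection.GetService<IBuildServer>();

        var buildDefinition = buildServer.GetBuildDefinition(teamProject, definitionName);
        buildServer.QueueBuild(buildDefinition);   // the queued build runs with the definition's defaults
    }
}
Something like this, driven by the dependency XML and a build-completed notification (whether from the event plug-in or from simple polling), gives roughly the effect of TeamCity's build chains without changing the builds themselves.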
