TFS2010 build queue sometimes has multiple InProgress builds and sometimes not - what is the logic? - tfs

My fairly-large project uses gated builds, with a heavily-customized process template (XAML). For reasons beyond the scope of this question, our process has a single SharedResourceScope, so multiple builds don't run in parallel (I know you're supposed to do that with an Agent scope, but we switch agents in the middle, so wrapped everything with the SharedResourceScope).
Now, if there are several check-ins in queue, all of them go into "In Progress" state, and all but one wait on the SharedResourceScope. This means that:
People can't know which build is actually running
Even if I set a new queued check-in to be high-priority, it can't overtake all those who are in progress and waiting on SharedResourceScope, so the whole priority setting has little meaning.
I've experimented using DefaultProcessTemplate.XAML, and I see that usually only one build is In Progress (though Occasionally I see 2 builds).
Questions:
When exactly does a build start, and therefore goes into "In progress" mode? What prevents all builds from starting immediately, and blocking on AgentScope / SharedResourceScope?
Is there something I can author in my XAML to prevent all builds to go in progress?

A build starts when the build controller has capacity to start a new build. Since builds can use 0 to n agents, the controller doesn't wait to get an agent before the build starts. The controller determines its capacity based on the "Maximum number of concurrent builds" setting on the controller properties (in code: MaxConcurrentBuilds).
Default setting: "Default to number of agents".
No way in XAML to control this behavior.
Also, there's a bug in TFS2010 regarding this, hotfix: KB2567437

Using TFS Build Extensions, you get a number of activities with which you can administer agent control through the process: QueryBuildAgentStatus, IsBuildRunning, SetBuildAgentStatus to name a few that could be useful to you. They're fairly undocumented as of now, so you'll have to experiment to get them running. But there is a help file included in the package that is useful.
As for the "In Progress" mode, I've noticed that this is the status even if a build is waiting for a free agent; check the log of such a build and you'll see. This is quite confusing, and I hope that MS will add a "Queued (but not started)" mode.

Related

How can aborted builds be ignored concerning the Jenkins job status preview?

For internal reasons, one of my jobs is able to run concurrently, but new builds abort themselves if another build is already running (disabling concurrency doesn't help, since I don't want new jobs to be scheduled for execution once the current build is done).
However, this behaviour is detrimental to the job status preview (the colored ball next to the job name when inside the job's parent folder). It often shows the status as "aborted", which is undesirable - I want to view the latest running build as the source of the job status.
I tried deleting aborted builds from within their own execution, but that's unfortunately neither trivial nor stable, and thus not suitable for this situation. I could probably get a workaround running that deletes them from a separate job, but that's not ideal either.
Anyway, I'm now hoping that I can just tell Jenkins to ignore "aborted" builds in the calculation of the job preview. Unfortunately, I wasn't able to find a setting or plugin that allows me to do this.
Is this possible at all? Or do I need to find another way?
Would something like this help?
https://plugins.jenkins.io/build-blocker-plugin/
I haven't used it myself but it supports blocking builds completely if a build is already running.

Is there a standard way to delete successful vnext builds (PR) just after their completion?

The most aggressive build retention policy one can set for pull request builds is described in "Clean up pull request builds"
a policy that keeps a minimum of 0 builds
Still, it means that successful PR builds (with artifacts no one will ever need) will be deleted only after the next automatic retention cleanup - usually the next day, but in reality it results in nearly two days worth of no longer needed builds.
In our particular case it seems to be desirable to find a way to clean successful PR builds ASAP due to their frequency and artifact's sheer size that may periodically strain our not yet fully organized infrastructure dedicated to PR handling (it will be significantly improved, but not as soon as we'd like to, and those successful PR builds would still remain no less of a dead weight).
And as far as I see the only way to do it would be to delete builds manually.
While it is not too difficult to implement, I'd still like to check whether there is a simpler standard way to delete successful PR builds automatically.
P.S.: There is one particularity in our heavily customized build process - we have multiple dependent artifacts. Like create A, use it to build B, create C to test B... So trying not to Publish artifacts on overall successful build with custom condition like it is suggested below is not exactly feasible.
Let's look at the problem from a different perspective: The problem isn't that builds are retained, the problem is that your PR builds are publishing artifacts.
You can make the Publish Artifacts steps conditional so that they don't run during PRs. Something like and(succeeded(), ne(variables['Build.Reason'], 'PullRequest')) will make the task only run if it's not a PR.

Jenkins workspaces and concurrent builds, how do they work?

I am currently learning the ins and outs of Jenkins and Pipeline.
One thing I do not yet understand is the following:
A Jenkins job by default can be executed concurrently (I can check the checkbox "Do not allow concurrent builds" if I don't want that).
What I don't understand is the following:
Let say Jenkins checks out code in /var/lib/jenkins/workspace/my-project-workspace/
Now how would it be possible to run concurrent builds without conflicts?
Let's say that build nr 1 checks out code in that path and starts testing it, and while doing that, build nr 2 is started and checks out code in that same path.
How will that not conflict with build nr 1?
I am probably missing something obvious here... Please help :)
The subdirectory inside the workspace/ folder will not always be your project name, but a (randomly) generated directory name. That's all the magic.
When this option is checked, multiple builds of this project may be executed in parallel.
By default, only a single build of a project is executed at a time — any other requests to start building that project will remain in the build queue until the first build is complete.
This is a safe default, as projects can often require exclusive access to certain resources, such as a database, or a piece of hardware.
But with this option enabled, if there are enough build executors available that can handle this project, then multiple builds of this project will take place in parallel. If there are not enough available executors at any point, any further build requests will be held in the build queue as normal.
Enabling concurrent builds is useful for projects that execute lengthy test suites, as it allows each build to contain a smaller number of changes, while the total turnaround time decreases as subsequent builds do not need to wait for previous test runs to complete.
This feature is also useful for parameterized projects, whose individual build executions — depending on the parameters used — can be completely independent from one another.
Each concurrently executed build occurs in its own build workspace, isolated from any other builds. By default, Jenkins appends "#" to the workspace directory name, e.g. "#2".
The separator "#" can be changed by setting the hudson.slaves.WorkspaceList Java system property when starting Jenkins. For example, "hudson.slaves.WorkspaceList=-" would change the separator to a hyphen.
For more information on setting system properties, see the wiki page.
However, if you enable the Use custom workspace option, all builds will be executed in the same workspace. Therefore caution is required, as multiple builds may end up altering the same directory at the same time. enter image description here

Jenkins Trigger X minutes after last build

We have recently switched from CruiseControl.Net to Jenkins for managing our builds. With CCNET it would trigger a new build X minutes after the last one completed, but with Jenkins it is constantly dropping builds in the queue, not allowing any time in between the two. We'd prefer the CCNET method.
I don't see how this can be done with the Scheduler Trigger, it seems to be all date time based.
I don't see any setting to prevent Queueing another build where the build is currently running.
I don't see a trigger that would allow for timing based off last run.
How could I manage this?
Jenkins lets you set a quiet period between builds, which can be set at the system level and overridden at the job level. Here's the help text from Jenkins:
If set, a newly scheduled build waits for this many seconds before actually being built. This is useful for:
Collapsing multiple CVS change notification e-mails into one (some CVS changelog e-mail generation scripts generate multiple e-mails in quick succession when a commit spans across directories).
If your coding style is such that you commit one logical change in a few cvs/svn operations, then setting a longer quiet period would prevent Jenkins from building it prematurely and reporting a failure.
Throttling builds. If your Jenkins installation is too busy with too many builds, setting a longer quiet period can reduce the number of builds.
If not explicitly set at project-level, the system-wide default value is used.
And here is a more detailed discussion.

When should I "Release" my builds?

We just started using Visual Studio Release Management for one of our projects, and we're already having some problems with how we are doing things.
For now, we've created a single release stage, which is responsible for deploying our build artifacts to a dedicated virtual machine for testing. We intend to use this machine to run our integration tests later on.
Right now, we have a gated checkin build process: each checkin fires all the unit tests and we configured the release trigger to happen on this build also. At first, it seemed plausible that, after each checkin, the project was deployed and the integration tests were executed. We noticed that all released builds were polluting the console on Release Management, and that all builds were being marked as "Retain Indefinitely" and our drop folder location was growing fast (after seeing that, it makes sense that the tool automatically does this, since one could promote any build to another stage and the artifacts need to be persisted).
The question then is: what are we doing wrong? I've been thinking about this and it really does not make any sense to "release" every checkin. We should probably be starting this release process when a sprint ends, a point that can be considered a "release candidate".
If we do that though, how and when would we run our automated integration tests? I mean, a deployment process is required for running those in our case, and if we try to use other means to achieve that (like the LabTemplate build process) we will end up duplicating deployment code.
What is the best approach here?
It's tough to say without being inside your organization and looking at how you do things, but I'll take a stab.
First, I generally avoid gated checkin builds unless there's a frequent problem with broken builds. If broken builds aren't a pain point, don't use gated checkin. Why? Simple: If your build/test process takes 10 minutes to run, that's 10 minutes that I have to wait to know whether I can keep working, or if I'm going to get my changes kicked back out at me. It discourages small, frequent checkins and encourages giant, contextless checkins.
It's also 10 minutes that Developer B has to wait to grab Developer A's latest changes. If Developer B needs that checkin to keep working, that's wasted time. Trust your CI process to catch a broken build and your developers to take responsibility and fix them on the rare occasions when they occur.
It's more appropriate (depending on your branching strategy) to do a gated checkin against your trunk, and then CI builds against your dev/feature branches. Of course, that opens up the whole "how do I build once/deploy many when I have multiple branches?" can of worms. :)
If your integration tests are slow and require a deployment to succeed, they're probably not good candidates to run as part of CI. Have a CI/gated checkin build that just:
Builds
Runs fast unit tests
Runs high-priority, non-deployment-based integration tests
Then, have a second build (either scheduled, or rolling) that actually deploys and runs the whole test suite. You can schedule it according to your tastes -- I usually go with one at noon (or whatever passes for "lunch break" among the team), and one at midnight. That way you get a tested build from the morning's work, and one from the afternoon's work.
Using the Release Default Template, you can target your scheduled builds to just go as far as your "dev" (/test/integration/whatever you call it) stage. When you're ready to actually release a build, you can kick off a new release using that specific build that targets Production and let it go through all your stages normally.
Don't get tripped up on the 'Release' word. In MS Release Management (RM), creating a Release does not necessarily mean you will have this code delivered to your customers / not even that it has the quality to move out of dev. It only means you are putting a version of the code on your Release Path. This version/release can stop right in the first stage and that is ok.
Let's say you have a Release Path consisting of Dev, QA, Prod. In the course of a month, you may end up releasing 100 times in Dev, but only 5 times in QA and once in Prod.
You should drive to get each check-in deployed and integration tested. If tests takes a long time, only do the minimal during (gated or not) check-in (for example, unit tests + deployment), and the rest in your second stage of Release Path (which should be automatically triggered after first stage completes). It does not matter if second stage takes a long time. As a dev, check-in, once build completes successfully (and first stage), expect the rest to go smoothly and continue on your next task. (Note that only result of the first stage impacts your TFS build).
Most of the time, deployment and rest will run fine and so there won't be any impact to dev. Every now and then, you will have a failure in first stage, now the dev will interrupt his new work and get a resolution asap.
As for the issue that every build is kept indefinitely, for the time being, that is a side effect of RM. Current customers need to do the clean up manually (or script it). In the coming releases, a new retention policy for releases/builds will be put in place to improve this. This has not been worked on yet, but the intention would be to, for example, instruct RM to keep all releases that went to Prod, keep only the last 5 that went to QA and keep only the last 2 that went to Dev.
This is not a simple question, so also the answer must be articulated.
First of all, you will never keep all of your builds; the older a build, the less interesting to anyone; a build that doesn't get deployed in production is overtaken by builds that reaches that stage.
A team must agree on the criteria that makes a build interesting to keep around and how long to keep it. Define a policy for builds shipped to production or customers: how long do you support them? Until the next release, until the following one, for five years? Potentially shippable builds, still not in your customers' hands, are superseded by newer, so you can use a numeric or a temporal criteria (TFS implements only the first, as the second is more error-prone). Often you have more than one shippable build, when you want a safety net option and being able select from a pool which deliver (the one with more manageable bugs).
The TFS "Retain Indefinitely" should be used when you cannot automate the previous criteria, so you switch to a manually implemented policy. Indefinitely is not forever, means for an unknown time interval.

Resources