TFS Build takes long time - tfs

In our company we use Gated Checkin to make sure committed code doesn't have any problems, and we also run all of our unit tests there. We have around 450 unit tests.
On my local computer, building the entire solution takes 15-20 seconds and the tests take maybe 3 minutes. On the server the same build takes 10 minutes. Why is that? Is there extra work being triggered that I don't know about?

Be aware that there are additional overheads (clean/get workspace is the main culprit most of the time) in the workflow prior to the actual build and then test cycle. I have seen the same behaviour myself and never really got to a point where the performance was that close to what it would be locally.
Once the build is running, you can view its progress and see where the time is going; the same information is in the logs.

In the build process parameters you can skip some extra steps if you just want to build the checked in code.
Set all these to False: Clean Workspace, Label sources, Clean build, Update work items with build number.
You could also avoid publishing (if you're doing it) or copying binaries to a drop folder (also, if you're doing it).
As others have suggested, take a look at the build log, it'll tell you what's consuming the time.

Does it make sense to have push trigger and nightly build together?

I'm a pretty new DevOps engineer, and I mostly deal with CI processes.
I'm wondering if it makes sense to define both a nightly build and a build on each push.
On the face of it, it doesn't: if the code is built after each push, why build again at midnight when it was already built when you pushed it to the repository?
Am I right?
IMHO you are right - it does not make sense to have a fixed time schedule if at the same time you have a push trigger.
A reason why you still want to have a nightly build (or other fixed schedules) could be if you cannot run a full test with every build.
For example, you could decide to run only a minimal set of tests (smoke tests) with every push-triggered build, and do a full test run once per day (e.g. at night).
As far as I know, the advantage of a midnight build is that tasks with long run or deployment times can be run at midnight.
After these tasks run overnight, you can view the results first thing the next day.
In this case, you can set a condition on a specific task to control whether it runs at midnight. You could use $(Build.Reason) to decide.
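As a rough Python sketch of that idea, assuming the agent exposes Build.Reason to scripts as the BUILD_REASON environment variable. The function name and the tier names "full" and "smoke" are made up for illustration:

```python
import os


def select_test_tier(build_reason):
    """Pick a test tier from the value of Build.Reason."""
    # "Schedule" is the reason reported for scheduled (e.g. nightly) runs.
    if build_reason == "Schedule":
        return "full"
    # Push-triggered (IndividualCI, BatchedCI), manual and other runs stay fast.
    return "smoke"


if __name__ == "__main__":
    # Default to "Manual" when run outside a pipeline.
    print(select_test_tier(os.environ.get("BUILD_REASON", "Manual")))
```

A build script could call this once and pass the result to the test runner, so the push build and the nightly build share one definition.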
On the other hand, we recommend that you also set a schedule trigger for a specific time.
CI triggers cause a pipeline to run whenever you push an update to the specified branches or you push specified tags. The build is only triggered when the code changes.
Changes in the pipeline itself and the operating environment will not trigger the build.
But they can sometimes determine whether the project can run successfully.
In this case, the schedule trigger can run the build at a specific time to ensure that the project is executable.
I will share what we do; maybe it will help you:
We have three build tiers: one that covers a case like the push example you mentioned, another with a set of PowerShell tests, and a scheduled one with the full set of tests, which takes around 5 hours.
As you can imagine, each tier has its own scenario, based on the time available and the number of tests.

TFS 2017 gets stuck when the Visual Studio Test task tries to publish results

We have a TFS 2017 build agent executing a Visual Studio Test task to execute our unit tests. This has worked fine for several years, but all of a sudden - without any code changes - the task gets stuck.
All the tests have finished running, we see summary information, and it will sit at what appears to be the place where it would normally publish the results... but then nothing happens. We've waited 12+ hours for it to finish. This step normally takes about 90 minutes.
I've confirmed that the TRX file is being created. It's about 4MB in size. We're running a bit over 3000 unit tests.
I've also tried disabling code coverage and attachments upload inside the test task, but it doesn't appear to make a difference.
Below is a screen cap of the log output when the step is stuck.
Lastly, we have lots of other projects on this server whose tests run / publish fine, as well as TFS Releases for this same build that also run tests (integration/system tests) which work without issue.
UPDATE: We ran this build on a different build server, and it published tests correctly. So this means there is something wrong with this specific build server...
UPDATE 2: So I'm no longer sure what is happening here. The original build server we were having issues on is now working fine with no changes whatsoever. It just started working again. The other build server was working, and then stopped with the same issue. I broke up the 3000+ tests into two steps, roughly 50/50, and that worked a couple of times, but now does not. So this does not appear to be server specific, nor does it appear to be related to the quantity of tests. Debug logging offers nothing useful, as everything seems fine right up until it just stops doing anything after generating the TRX file.
UPDATE 3: Well, it's happening again. I'm not sure how to proceed. I even tried Fiddler on the build box to see if I could catch funky looking traffic, but most of the traffic I'd expect to see I don't. It's like a good chunk of the work isn't being captured (such as source downloads, reporting progress, or test result publishing) by Fiddler. Is it not over HTTP/HTTPS?
This was difficult to figure out due to the quantity of tests we're running, but I was able to narrow it down to a test that launched ping.exe:
    [ExpectedException(typeof(TimeoutException))]
    [TestMethod]
    public void ProcessWillTimeout()
    {
        // "ping /t" never exits on its own, so the 500 ms timeout must fire.
        const string command = "cmd";
        const string args = "/C ping 127.0.0.1 /t";
        var externalProcessService = new ExternalProcessService();
        externalProcessService.Execute(command, args, TimeSpan.FromMilliseconds(500));
    }
For whatever reason, this test was leaving both conhost.exe and ping.exe "orphaned". For reasons we never identified, these lingering processes prevented the test task from publishing its results back to TFS. There is probably something somewhere that waits for a process to finish, and that was never happening.
Indeed, we would see a bunch of conhost.exe and ping.exe processes in both Task Manager and Process Explorer:
You'll notice the tool tip there... "[Error opening process]". I couldn't even use Process Explorer to kill these processes - although Task Manager could. Sure enough, when I killed them, the TFS build task would immediately resume and finish publishing results.
So there is clearly some kind of bug in that ExternalProcessService code we were testing (despite carefully having a finally block that terminated the process), but we are at least able to have our build tests run again without issue.
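One plausible cause, and this is an assumption because the ExternalProcessService source isn't shown, is that terminating cmd.exe alone leaves the ping.exe child it started still running. Below is a minimal Python sketch of a timeout wrapper that kills the whole process tree instead of just the direct child. run_with_timeout is a hypothetical name, not part of the code under test:

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(args, timeout_seconds):
    """Run a command; on timeout, kill it and every process it spawned."""
    if sys.platform == "win32":
        proc = subprocess.Popen(args)
    else:
        # A new session gives the child its own process group,
        # so the whole group can be signalled later.
        proc = subprocess.Popen(args, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        if sys.platform == "win32":
            # /T terminates the child tree, /F forces termination.
            subprocess.run(
                ["taskkill", "/T", "/F", "/PID", str(proc.pid)],
                capture_output=True,
                check=False,
            )
        else:
            try:
                os.killpg(proc.pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the group exited on its own in the meantime
        proc.wait()  # reap the direct child so nothing is left behind
        raise TimeoutError(f"{args!r} exceeded {timeout_seconds}s")
```

The point of the design is that the kill targets the tree (or process group), not the single process handle, so a grandchild such as ping.exe cannot outlive the test.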
I suggest you abandon this build and trigger it again, to find out whether the issue can be reproduced reliably.
According to your description, all other builds work properly, and this one worked fine for several years. All tests pass and the test report is written, but the task hangs. Please double check whether some other processes are not shutting down properly.
Also, use another build agent to test again, and try creating a new build definition with the same settings and triggering that definition; this may do the trick.
Moreover, you could enable verbose logging for troubleshooting. To do this, simply add a build variable named system.debug and set its value to 'true'; the log will then contain more detailed information.

Check file creation time in jenkins pipeline

I'm wondering if anybody knows a generic way in Jenkins pipeline to find out the creation time of a file? There are operations to touch a file but seemingly not for reading that time.
What I actually want to do is to clean out workspaces every few days - for most of our builds, we want incremental behaviour and thus don't clean out. However, there are times things build up and we'd like to be able to automatically clean out on occasion.
I need the code to work on both Linux flavours and Windows. My only big idea to date is to actually write the time into a file and read that back. However, it somehow seems wrong!
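For what it's worth, the marker-file idea is portable precisely because it avoids file creation time, which is not reported consistently across Linux and Windows. A minimal Python sketch of it follows; the marker name .last_cleaned and the function clean_if_stale are hypothetical, and how the script gets invoked from the pipeline is left open:

```python
import shutil
import time
from pathlib import Path

MARKER_NAME = ".last_cleaned"  # hypothetical marker file name


def clean_if_stale(workspace, max_age_days, now=None):
    """Wipe the workspace when its marker is older than max_age_days.

    Returns True if the workspace was cleaned.
    """
    workspace = Path(workspace)
    marker = workspace / MARKER_NAME
    now = time.time() if now is None else now

    if not marker.exists():
        # First run: record the time and keep the incremental state.
        workspace.mkdir(parents=True, exist_ok=True)
        marker.write_text(str(now))
        return False

    age_days = (now - float(marker.read_text())) / 86400
    if age_days < max_age_days:
        return False

    # Remove everything (including the old marker), then start a fresh one.
    for entry in list(workspace.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    marker.write_text(str(now))
    return True
```

The timestamp is stored as text inside the file rather than read from file metadata, so copying or touching the workspace does not reset the clock.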

Jenkins workspaces and concurrent builds, how do they work?

I am currently learning the ins and outs of Jenkins and Pipeline.
One thing I do not yet understand is the following:
A Jenkins job by default can be executed concurrently (I can check the checkbox "Do not allow concurrent builds" if I don't want that).
What I don't understand is the following:
Let's say Jenkins checks out code in /var/lib/jenkins/workspace/my-project-workspace/
Now how would it be possible to run concurrent builds without conflicts?
Let's say that build nr 1 checks out code in that path and starts testing it, and while doing that, build nr 2 is started and checks out code in that same path.
How will that not conflict with build nr 1?
I am probably missing something obvious here... Please help :)
The subdirectory inside the workspace/ folder will not always be your project name, but a (randomly) generated directory name. That's all the magic.
When this option is checked, multiple builds of this project may be executed in parallel.
By default, only a single build of a project is executed at a time — any other requests to start building that project will remain in the build queue until the first build is complete.
This is a safe default, as projects can often require exclusive access to certain resources, such as a database, or a piece of hardware.
But with this option enabled, if there are enough build executors available that can handle this project, then multiple builds of this project will take place in parallel. If there are not enough available executors at any point, any further build requests will be held in the build queue as normal.
Enabling concurrent builds is useful for projects that execute lengthy test suites, as it allows each build to contain a smaller number of changes, while the total turnaround time decreases as subsequent builds do not need to wait for previous test runs to complete.
This feature is also useful for parameterized projects, whose individual build executions — depending on the parameters used — can be completely independent from one another.
Each concurrently executed build occurs in its own build workspace, isolated from any other builds. By default, Jenkins appends "@<num>" to the workspace directory name, e.g. "@2".
The separator "@" can be changed by setting the hudson.slaves.WorkspaceList Java system property when starting Jenkins. For example, "hudson.slaves.WorkspaceList=-" would change the separator to a hyphen.
For more information on setting system properties, see the wiki page.
However, if you enable the Use custom workspace option, all builds will be executed in the same workspace. Therefore caution is required, as multiple builds may end up altering the same directory at the same time.
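To make the naming scheme concrete, here is a simplified Python model of how a free workspace could be picked for each concurrent build. It is an illustration only, not Jenkins' actual code; allocate_workspace and the in_use set are hypothetical, and the default separator is the "@" that Jenkins uses out of the box:

```python
def allocate_workspace(base, in_use, separator="@"):
    """Return the first free workspace path and mark it as in use."""
    n = 1
    while True:
        # The first build gets the plain path; later ones get a numbered suffix.
        candidate = base if n == 1 else f"{base}{separator}{n}"
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate
        n += 1
```

With an empty in_use set, three overlapping builds of my-project would get my-project, my-project@2 and my-project@3, which is why they do not trample each other's checkouts.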

When should I "Release" my builds?

We just started using Visual Studio Release Management for one of our projects, and we're already having some problems with how we are doing things.
For now, we've created a single release stage, which is responsible for deploying our build artifacts to a dedicated virtual machine for testing. We intend to use this machine to run our integration tests later on.
Right now, we have a gated checkin build process: each checkin fires all the unit tests and we configured the release trigger to happen on this build also. At first, it seemed plausible that, after each checkin, the project was deployed and the integration tests were executed. We noticed that all released builds were polluting the console on Release Management, and that all builds were being marked as "Retain Indefinitely" and our drop folder location was growing fast (after seeing that, it makes sense that the tool automatically does this, since one could promote any build to another stage and the artifacts need to be persisted).
The question then is: what are we doing wrong? I've been thinking about this and it really does not make any sense to "release" every checkin. We should probably be starting this release process when a sprint ends, a point that can be considered a "release candidate".
If we do that though, how and when would we run our automated integration tests? I mean, a deployment process is required for running those in our case, and if we try to use other means to achieve that (like the LabTemplate build process) we will end up duplicating deployment code.
What is the best approach here?
It's tough to say without being inside your organization and looking at how you do things, but I'll take a stab.
First, I generally avoid gated checkin builds unless there's a frequent problem with broken builds. If broken builds aren't a pain point, don't use gated checkin. Why? Simple: If your build/test process takes 10 minutes to run, that's 10 minutes that I have to wait to know whether I can keep working, or if I'm going to get my changes kicked back out at me. It discourages small, frequent checkins and encourages giant, contextless checkins.
It's also 10 minutes that Developer B has to wait to grab Developer A's latest changes. If Developer B needs that checkin to keep working, that's wasted time. Trust your CI process to catch a broken build and your developers to take responsibility and fix them on the rare occasions when they occur.
It's more appropriate (depending on your branching strategy) to do a gated checkin against your trunk, and then CI builds against your dev/feature branches. Of course, that opens up the whole "how do I build once/deploy many when I have multiple branches?" can of worms. :)
If your integration tests are slow and require a deployment to succeed, they're probably not good candidates to run as part of CI. Have a CI/gated checkin build that just:
Builds
Runs fast unit tests
Runs high-priority, non-deployment-based integration tests
Then, have a second build (either scheduled, or rolling) that actually deploys and runs the whole test suite. You can schedule it according to your tastes -- I usually go with one at noon (or whatever passes for "lunch break" among the team), and one at midnight. That way you get a tested build from the morning's work, and one from the afternoon's work.
Using the Release Default Template, you can target your scheduled builds to just go as far as your "dev" (/test/integration/whatever you call it) stage. When you're ready to actually release a build, you can kick off a new release using that specific build that targets Production and let it go through all your stages normally.
Don't get tripped up on the word 'Release'. In MS Release Management (RM), creating a Release does not necessarily mean this code will be delivered to your customers, or even that it has the quality to move out of dev. It only means you are putting a version of the code on your Release Path. This version/release can stop right in the first stage and that is ok.
Let's say you have a Release Path consisting of Dev, QA, Prod. In the course of a month, you may end up releasing 100 times in Dev, but only 5 times in QA and once in Prod.
You should aim to get each check-in deployed and integration tested. If the tests take a long time, do only the minimum during the (gated or not) check-in build (for example, unit tests + deployment), and the rest in the second stage of your Release Path (which should be triggered automatically after the first stage completes). It does not matter if the second stage takes a long time. As a dev, you check in and, once the build (and first stage) completes successfully, you expect the rest to go smoothly and continue with your next task. (Note that only the result of the first stage impacts your TFS build.)
Most of the time, the deployment and the rest will run fine, so there won't be any impact on the dev. Every now and then there will be a failure in the first stage, and the dev will have to interrupt the new work and get it resolved asap.
As for the issue that every build is kept indefinitely, for the time being, that is a side effect of RM. Current customers need to do the clean up manually (or script it). In the coming releases, a new retention policy for releases/builds will be put in place to improve this. This has not been worked on yet, but the intention would be to, for example, instruct RM to keep all releases that went to Prod, keep only the last 5 that went to QA and keep only the last 2 that went to Dev.
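Until such a policy exists, the manual or scripted cleanup could follow the same rule. Here is a minimal Python sketch of the selection logic only, assuming each release is known by an id and the furthest stage it reached; releases_to_keep and the data shape are hypothetical and do not use any RM API:

```python
def releases_to_keep(releases, keep_last):
    """Return the ids of releases to retain.

    releases: (release_id, furthest_stage) pairs, oldest first.
    keep_last: stage -> how many of the newest releases to keep;
               a stage missing from the mapping keeps everything.
    """
    by_stage = {}
    for release_id, stage in releases:
        by_stage.setdefault(stage, []).append(release_id)

    keep = set()
    for stage, ids in by_stage.items():
        limit = keep_last.get(stage)
        if limit is None:
            keep.update(ids)  # e.g. Prod: keep all
        elif limit > 0:
            keep.update(ids[-limit:])  # newest `limit` releases only
    return keep
```

For example, with releases 1, 3 and 5 stopping at Dev, 2 at QA and 4 at Prod, and limits of 5 for QA and 2 for Dev, it keeps 2, 3, 4 and 5; release 1 is the only cleanup candidate.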
This is not a simple question, so the answer needs some detail too.
First of all, you will never keep all of your builds. The older a build is, the less interesting it is to anyone, and a build that doesn't get deployed to production is overtaken by builds that do reach that stage.
A team must agree on the criteria that make a build worth keeping and on how long to keep it. Define a policy for builds shipped to production or to customers: how long do you support them? Until the next release, until the one after that, for five years? Potentially shippable builds that are not yet in your customers' hands are superseded by newer ones, so you can use a numeric or a temporal criterion (TFS implements only the first, as the second is more error-prone). Often you keep more than one shippable build, as a safety net, so that you can choose from a pool which one to deliver (the one with the more manageable bugs).
The TFS "Retain Indefinitely" option should be used when you cannot automate the previous criteria, so you switch to a manually implemented policy. Indefinitely does not mean forever; it means for an unknown length of time.
