TFS Releases keep getting stuck "In Progress"

I am using TFS 2015 on-premise for all our builds and releases. Recently some of our releases have been taking far longer than normal to complete. A release that normally took 20 minutes has been taking several hours.
We release to both on-premise web servers and Azure, and we get the problem with both. It doesn't happen with every release, which makes diagnosing the problem very difficult. We've restarted the TFS server, but it still keeps happening.
The on-premise releases are just a Windows Machine File Copy task, and the Azure releases are an Azure Web App Deployment task. All pretty straightforward. The problem only started recently (a week or so ago). There have been no updates or changes to the TFS server, so I can't fathom why this has started happening.

Related

Upgrade from TFS 2017 to Azure DevOps - build/release pipeline

My company is considering upgrading our on prem TFS 2017 update 3 to the latest Azure DevOps Server (notably, the on prem variety).
During discussions about that possibility, one key stakeholder claimed that if you upgrade, all of your build and release pipelines would have to be rebuilt from scratch. We have a healthy number of build and release definitions in TFS 2017.
I have looked for the answer in the Microsoft documentation about what exactly gets upgraded, but unfortunately I can't get the level of granularity which would prove or disprove the above claim. On the surface it would seem like a horrible upgrade story if it were true. But I also understand that designs and architectures change and upgrades aren't always possible.
Could somebody let me know whether the build and release pipelines can survive the upgrade more or less unscathed? Knowing this would be a valuable data point as we work toward a decision.
Thanks in advance!
I would expect the vNext build definitions and release pipelines to be pretty much lift and shift. Depending on the tasks you have defined, some might no longer be supported or might have newer versions; the UI will let you know when new versions are available.
A lot of the new focus is on building out features for YAML build definitions. If you want to leverage those, you would have to do a lot more rework converting your vNext tasks into YAML, but converting is not a hard requirement.
You mentioned that you aren't using XAML build definitions, but if you happened to be, I would imagine that is where most of the rework would come in. Having done that in the past, I can say it is a pain if you have to do it.
Regarding the claim that "all of your build and release pipelines would have to be rebuilt from scratch": I've tested this, and no data is lost after upgrading. You should still use scheduled backups to ensure you always have backups in place in case something goes wrong.
If you have new hardware available, you can use it to do a dry run first, then wipe everything clean and use it again for the production upgrade.
For our dry run, the upgrade steps will be:
1. Copy recent database backups to our new SQL instance.
2. Install TFS 2015 on our new application tier.
3. Use the scheduled backups wizard to restore the database backups.
4. Run through the upgrade wizard, being sure to use a service account which does not have any permissions in our production environment. See the "Protecting production" section of the pre-production dry run documentation for more information.
5. Optionally configure new features which require changes to our existing projects.
The production upgrade will be quite similar. The steps will be:
1. Take the production server offline using TFSServiceControl's quiesce command. The goal here is to ensure that the backups we use to move to our new hardware are complete and that we don't lose any user data.
2. Take new backups of each database.
3. Copy the backups to our new SQL instance.
4. Install TFS 2015 on our new application tier.
5. Use the scheduled backups wizard to restore the database backups.
6. Run through the upgrade wizard, using our desired production service account.
7. Optionally configure new features which require changes to our existing projects.
You can refer to this doc for more details.

TFS 2018.2 upgrade

I went to do an upgrade of TFS2015 to TFS2018.2.
In past tests and experience, these upgrades take some time if the collections are on the larger side. I first ran a POC going from TFS 2015 to TFS 2017.3, and that upgrade took 24 hours in total and completed successfully. While I was doing that POC, TFS 2018.2 hit RTM, so I ran the same POC again, this time going from TFS 2015 to TFS 2018.2. That upgrade only took about an hour, which seems really odd.
It looks like it completed without error, but the amount of time it took to upgrade the collections seemed really off compared to any other upgrade I have done in the past. How long is the upgrade process expected to take?
How long the collection upgrade takes depends on the amount of changes made to the database, which tables those changes touch, and the size of the team project collection databases.
I recently performed an upgrade against a 500 GB collection from 2015 to 2018 and the entire upgrade process took 20 minutes. It's fast.
Upgrades from earlier versions to TFS 2015 were slow because TFS 2015 introduced significant database schema changes. Nothing since then has required such a major schema change.

TFS 2015 (and Octopus): Continuous integration, but wait a bit

When a pull request is completed, our (TFS 2015/Octopus-based) build system is set to do a build and deploy. The problem is that we typically have a bunch of pull requests queued up, and approving each of them triggers a build and deployment, with unnecessary packages being created/saved and emails going out to QA that a deployment is ready. Not a critical problem perhaps, but an annoyance to be sure.
We are using vNext build definitions. I have "Batch changes" enabled, but it isn't enough (the builds take less than a minute, while reviewing and approving a pull request can take anywhere from 1 to 30 minutes). What I would like is continuous integration that waits, say, 15 minutes after the first merge to see if any other changes are coming.
Alternatively, a scheduled build every hour, but ONLY if something has changed, would suffice.
Alternatively, building every time but having Octopus wait a bit before deploying would work too.
Aside from writing my own windows service that uses the TFS REST API to trigger builds every x minutes only if something has changed, I'm not seeing a good solution. Or I've thought about saving the build packages off somewhere and writing a service to send them to Octopus only if no new packages have arrived in x minutes.
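A rough sketch of what that scheduled check might look like (I haven't run this; the collection URL, repository name and build definition id 25 are all placeholders), using the TFS 2015 REST API with Json.NET:

// Queue a build only when the branch tip has moved past the last build.
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class BuildIfChanged
{
    static async Task Main()
    {
        var baseUrl = "http://tfsserver:8080/tfs/DefaultCollection/MyProject";
        var handler = new HttpClientHandler { UseDefaultCredentials = true }; // on-prem Windows auth
        using (var client = new HttpClient(handler))
        {
            // Current tip of master in the Git repository.
            var refs = JObject.Parse(await client.GetStringAsync(
                baseUrl + "/_apis/git/repositories/MyRepo/refs/heads/master?api-version=1.0"));
            var tip = (string)refs["value"][0]["objectId"];

            // Source version of the most recent build of the definition (newest first).
            var builds = JObject.Parse(await client.GetStringAsync(
                baseUrl + "/_apis/build/builds?definitions=25&$top=1&api-version=2.0"));
            var lastBuilt = (string)builds["value"].FirstOrDefault()?["sourceVersion"];

            // Queue a new build only if something has been merged since the last one.
            if (tip != lastBuilt)
            {
                var body = new StringContent("{\"definition\": {\"id\": 25}}", Encoding.UTF8, "application/json");
                await client.PostAsync(baseUrl + "/_apis/build/builds?api-version=2.0", body);
            }
        }
    }
}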
Does anyone have something like this working?
At the places where I have implemented this, I have argued for letting the QA teams decide when (or even if) to deploy. I usually implement two environments:
DEV Integration: Every continuous build gets deployed here. If deployment is successful, the build is made available to the QA team (e-mail notification, build quality, etc.).
QA Environment: Where the QA Team performs its testing. The QA team chooses which build to deploy, and when.
You can set up permissions in Octopus Deploy to allow the QA team to "promote" a build from the DEV environment to the QA environment. Not sure if you have the ability to set up a second environment like that, but if you can, it will give you a lot of flexibility.
I would keep building on every checkin so that the developers have a quick feedback cycle.
To solve your problem, you could write a PowerShell script or small app that uses the Octopus API to query what the latest release is, check whether it has been deployed to QA, and if not, deploy it. You can then schedule that using the Windows Task Scheduler.
Sample code using the Octopus C# client library (not tested; the server URL and API key below are placeholders):
// Requires the Octopus.Client NuGet package.
using System.Linq;
using Octopus.Client;
using Octopus.Client.Model;

// Placeholder server URL and API key.
var _repository = new OctopusRepository(new OctopusServerEndpoint("https://your-octopus-server/", "API-XXXXXXXXXXXXXXXX"));
var project = _repository.Projects.FindByName("MyProject");
var qaEnv = _repository.Environments.FindByName("QA");
// Releases come back newest first, so the first item is the latest release.
var release = _repository.Projects.GetReleases(project).Items.First();
// GetDeployments only returns the first 30 items; use paging if you have more.
var isDeployed = _repository.Releases.GetDeployments(release).Items.Any(i => i.EnvironmentId == qaEnv.Id);
if (!isDeployed)
{
    _repository.Deployments.Create(new DeploymentResource()
    {
        ProjectId = project.Id,
        ReleaseId = release.Id,
        EnvironmentId = qaEnv.Id
    });
}
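If you compile that into a small console app, running it from the Windows Task Scheduler every 10-15 minutes gives roughly the "wait a bit before deploying" behaviour from the question, without touching the build definitions themselves.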

Disable or change Reporting in TFS 2013 when configured server missing/offline

Like many others out there, I had the fun task of upgrading our TFS 2008 server to a brand-new TFS 2013 install.
The good news -> this has been done and documented. The bad news -> you have to migrate to TFS 2012 and then migrate from 2012 to 2013.
All things said, it mostly went fairly smoothly and I cannot really complain. There is one hitch, however. Our plan was to use an intermediate server (SQLTFS01) for the TFS and SQL Server 2012 install, and then move everything onto our destination server for 2013 (SQL008). Then we were to take SQLTFS01 offline and repurpose that machine.
In the end there was a missed step. It seems that our final install of TFS 2013 is still pointed at SQLTFS01 for the Reporting Services components.
Attempts to disable the reporting and analysis services portion of the server all fail, because even disabling the feature requires connecting to the configured (now offline) server.
Question: How can we disable this feature or redirect it? Can we do it through settings files that I am not aware of?
Thanks,
Tom
I would recommend that you "unconfigure" your application tier by running "tfsconfig setup /uninstall:all". This will not touch any of your data, but it will reset your app tier to the state it was in before you ran the configuration.
You can then follow the steps in the "move to new hardware" documentation so that you don't miss any of the steps:
http://msdn.microsoft.com/en-us/library/ms404869.aspx
If you start from after the "restore databases" step, you should be good.

Speed Up Visual Studio Publish Web Process

I am using Visual Studio 2012 Professional to deploy my ASP.Net MVC website.
The problem is that when I use the one click publish feature, my web application comes to a screeching halt and it takes about 5 minutes for the website to respond normally again.
What are some things I can do to speed up this process to reduce or eliminate the amount of downtime for my users?
I publish to a separate copy of the site on the live server and I have a script which I then run on the server to compare the live site with the copy and only update new/changed files/directories and delete removed ones. This cuts down the downtime quite a bit, especially if there have only been minor changes.
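The original script isn't shown, but a rough C# sketch of the same idea (mirror new/changed files from the freshly published copy to the live folder and delete anything that was removed; both paths are placeholders) could look like this:

using System.IO;

class SiteSync
{
    static void Sync(string source, string target)
    {
        Directory.CreateDirectory(target);

        // Copy files that are new or have a newer timestamp than the live copy.
        foreach (var srcFile in Directory.GetFiles(source))
        {
            var destFile = Path.Combine(target, Path.GetFileName(srcFile));
            if (!File.Exists(destFile) ||
                File.GetLastWriteTimeUtc(srcFile) > File.GetLastWriteTimeUtc(destFile))
            {
                File.Copy(srcFile, destFile, true);
            }
        }

        // Delete live files that no longer exist in the published copy.
        foreach (var destFile in Directory.GetFiles(target))
        {
            if (!File.Exists(Path.Combine(source, Path.GetFileName(destFile))))
                File.Delete(destFile);
        }

        // Recurse into subdirectories, then remove directories deleted from the source.
        foreach (var srcDir in Directory.GetDirectories(source))
            Sync(srcDir, Path.Combine(target, Path.GetFileName(srcDir)));

        foreach (var destDir in Directory.GetDirectories(target))
        {
            if (!Directory.Exists(Path.Combine(source, Path.GetFileName(destDir))))
                Directory.Delete(destDir, true);
        }
    }

    static void Main()
    {
        Sync(@"D:\Sites\MySite-published", @"D:\Sites\MySite-live");
    }
}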
