TFS 2018 to 2019 upgrade failed on missing Release Id

Using a few-months-old database copy of our production instance running TFS 2018.3, I tried to upgrade it to 2019.1.
During the upgrade of the collections, one of the collections failed on step 729.
Before I go back and load a fresh database, I would like to understand the error message and make sure we prevent it in the future.
Has anyone seen this error before and knows how to fix it in my upgrade, and how to make sure it does not happen in a future upgrade?
[15:02:03.047] +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[15:02:03.047] ++ Executing - Operation: DistributedTaskOrchestrationToDev17M141Collection, Group: DistributedTaskOrchestrationToDev17M141Collection
[15:02:03.047] +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[15:02:03.047] Executing step: Start Queued Plans in PlanQueue
[15:02:03.047] Executing step: 'Start Queued Plans in PlanQueue' DistributedTaskOrchestration.StartThrottledPlans (729 of 858)
[15:02:03.313] [Error] VS402939: Release with ID 625 does not exist. Specify a valid ID and try again.
[15:02:03.327] Microsoft.VisualStudio.Services.ReleaseManagement.WebApi.Exceptions.ReleaseNotFoundException: VS402939: Release with ID 625 does not exist. Specify a valid ID and try again.

According to the error info
Executing step: 'Start Queued Plans in PlanQueue'
......
Microsoft.VisualStudio.Services.ReleaseManagement.WebApi.Exceptions.ReleaseNotFoundException:
VS402939: Release with ID xxx does not exist. Specify a valid ID and try again.......
It may be caused by a release stuck processing in the pipeline queues.
There are a few scenarios:
Requests are busy in the pipeline queue. Pipelines do not have
a 1:1 relationship with the agents. It looks like there are more
releases with environments in progress than there are available licenses.
Clearing out requests that are in the queue: to clear out invalid
requests, the recommended way is to cancel any in-progress
deployments. Refer to
https://www.visualstudio.com/en-us/docs/build/actions/view-manage-releases#release-summary
Simply refreshing the release with the refresh button and trying again
may also do the trick.
Before you do the fresh database upgrade, check whether any releases are hanging in the queue for your collections: look under Pipelines > Agent Pools > Running Jobs. This may help avoid the same issue in a future upgrade.
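As a rough illustration of that pre-upgrade check, here is a minimal Python sketch that flags deployments still sitting in the queue past a 30-day window. The record layout (`release_id`, `status`, `queued_on`), the status strings and the sample dates are all assumptions made up for this example. They are not taken from the TFS API, so the records would have to be filled in from whatever listing you export from your own server.

```python
from datetime import datetime, timedelta, timezone

# Statuses treated as "still in the queue". These strings are assumptions
# for this sketch, not values confirmed anywhere in this thread.
ACTIVE_STATUSES = {"queued", "inProgress"}


def find_stale_deployments(deployments, now, max_age_days=30):
    """Return deployments that are still active and older than max_age_days.

    Each deployment is a dict with hypothetical keys:
    'release_id', 'status' and 'queued_on' (a timezone-aware datetime).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        d for d in deployments
        if d["status"] in ACTIVE_STATUSES and d["queued_on"] < cutoff
    ]


if __name__ == "__main__":
    # Illustrative sample data only; the dates are invented.
    now = datetime(2019, 9, 1, tzinfo=timezone.utc)
    sample = [
        {"release_id": 625, "status": "inProgress",
         "queued_on": datetime(2019, 5, 10, tzinfo=timezone.utc)},
        {"release_id": 700, "status": "succeeded",
         "queued_on": datetime(2019, 5, 12, tzinfo=timezone.utc)},
        {"release_id": 701, "status": "queued",
         "queued_on": datetime(2019, 8, 25, tzinfo=timezone.utc)},
    ]
    for d in find_stale_deployments(sample, now):
        print(f"Cancel before upgrading: release {d['release_id']}")
```

Anything the function returns is a candidate to cancel by hand in the web UI before starting the upgrade.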

We reported this same issue on the Microsoft Developer Community.
https://developercommunity.visualstudio.com/content/problem/720729/upgrade-to-tfs-20191-failed-with-error-vs402939.html
In our scenario, we created a production copy around May 2019 using TFS 2018.3, left it running until now and then tried to upgrade this production copy to TFS 2019.1. This failed.
It seems this issue was caused by our "outdated" production copy we left running which we later tried to upgrade to TFS 2019.
The upgrade failed because releases had been in the queue for longer than 30 days, which caused parts of them to be deleted by the retention policy.
For a production environment, it's advisable to cancel any stuck
deployments before the upgrade.
I will keep this in mind, but it would be good for Microsoft to make sure the upgrade process does not fail in case there is a stuck deployment somewhere.

Related

Job stuck after manual jobs deletion

I performed a manual cleanup by deleting job folders directly in the filesystem, and now I have a stuck running job that I cannot abort.
I've tried the answers here to force it to be stopped, but it doesn't work as it is not able to find the existing job in the system.
Additionally, when I click over the running job I get a 404 error:
"Problem accessing <route_to_job_that_doesnt_exist_anymore>"
Reason: Not found
Is there something I can do to abort this running job without restarting the server?
A way to stop a build (Like actually aborting it) is by adding a /stop at the end of the job url, behind the Build Number.
Example: http://myjenkins/project/123/stop
If this doesn't work, there is also the "Hard Kill". Instead of adding /stop you add /kill. I guess you need Admin Access for that POST action.
Don't know though if it works for jobs that don't exist on the Jenkins Host anymore due to missing filesystems
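To make the two URL forms concrete, here is a small Python sketch that only builds the action URLs described above. The helper name `build_action_url` is hypothetical, and the base URL is the example from this answer. Actually sending the POST (with whatever authentication your Jenkins requires) is left out on purpose.

```python
def build_action_url(build_url, action):
    """Append 'stop' or 'kill' to a Jenkins build URL.

    build_url is the URL of a single build, ending in the build number.
    """
    if action not in ("stop", "kill"):
        raise ValueError(f"unsupported action: {action}")
    # Drop any trailing slash so the result never contains '//'.
    return build_url.rstrip("/") + "/" + action


if __name__ == "__main__":
    base = "http://myjenkins/project/123"
    print(build_action_url(base, "stop"))
    print(build_action_url(base, "kill"))
```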

Waiting for an available agent / Waiting for an agent to be requested

(26.07.2016) I am using TFS 2015 Update 3 in a VM.
When I try to queue a build through the web interface or from Team Explorer, I get the following.
Then I restart all services related to TFS in services.msc and then after some time it starts working again.
So this happens too often.
I have a custom pool running:
Is there a way to debug this behaviour?
Examining the Log files
Link to Worker log file
Link to Agent log file
Exception occurs in this order here:
Checking if artifacts directory exists C:\workspaces\agent\_work\2\a
Deleting artifacts directory
System.ComponentModel.Win32Exception (0x80004005): The directory is not empty
at Microsoft.TeamFoundation.Common.FileSpec.DeleteDirectoryLongPath(String path, Boolean recursive, Boolean followJunctionPoints)
The weird thing is, queueing new build works most of the time, this happens only sporadically
It could be, that I have opened a file from that folder in notepad with many tabs open. Will observe if this issue persists and report.
If this is happening sporadically, there might be a long path under the artifacts directory:
C:\workspaces\agent\_work\2\a
Or there was a cancelled build that left the artifact directory half cleaned, which exposed a bug in the cleanup.
The 2.x agent isn't subject to long-path limits (.NET Core), but it only works with TFS 2017+:
https://github.com/Microsoft/vsts-agent
We can troubleshoot but it would be good to get to 2017+ (2018 QU3 is out) with a 2.x agent.
If that's not an option, message me and we can dig into what I think is a cancel / state bug.
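One way to test the long-path theory is to scan the artifacts directory for entries at or above the classic Windows MAX_PATH limit of 260 characters. The Python sketch below is only an assumption about how you might check. It is not part of the agent or of TFS, and the helper name `find_long_paths` is made up for this example.

```python
import os

# Classic Windows MAX_PATH limit, which includes the terminating null,
# so a full path of 260 or more characters is already over it.
MAX_PATH = 260


def find_long_paths(root, limit=MAX_PATH):
    """Return every file or directory under root whose full path length
    is at or above limit, sorted alphabetically."""
    long_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                long_paths.append(full)
    return sorted(long_paths)


if __name__ == "__main__":
    # Artifacts directory taken from the log excerpt in the question.
    for path in find_long_paths(r"C:\workspaces\agent\_work\2\a"):
        print(len(path), path)
```

If the scan comes back empty, the half-cleaned directory from a cancelled build becomes the more likely explanation.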

Grails Controller / Integration Test succeeds but hangs forever

Absolutely stumped on this.
I have two controller integration tests that pass successfully. However, when running in Intellij or via gradle check, the JVM never exits. If I comment out the entire integration tests, the JVM exits cleanly.
When debugging any of the integration tests, I can hit pause and see that there are several threads in different states: WAITING, RUNNING, SLEEPING.
The database used in application.yml is purely an in-memory one:
url: jdbc:h2:mem:testDb;MVCC=TRUE;LOCK_TIMEOUT=10000;DB_CLOSE_ON_EXIT=FALSE
Changing this to file based does not fix the problem. Changing DB_CLOSE_ON_EXIT=TRUE does not help either.
I've tried removing @Rollback and even using @Transactional with a timeout, but that doesn't fix it.
Creating an integration test on a fresh project works with no deadlock/hanging/waiting.
I have moved back through revisions to find the changeset where this behaviour started, but the changes were purely in GSPs, Controllers and an additional assertion & test method in one of the integration tests.
The last lines in the logs are:
INFO org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext - Closing org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext@73386d72: startup date [Mon May 30 18:48:25 BST 2016]; root of context hierarchy
INFO org.springframework.context.support.DefaultLifecycleProcessor - Stopping beans in phase -2147483648
INFO org.grails.plugins.datasource.TomcatJDBCPoolMBeanExporter - Unregistering JMX-exposed beans on shutdown
INFO org.grails.plugins.datasource.TomcatJDBCPoolMBeanExporter - Unregistering JMX-exposed beans
INFO org.hibernate.tool.hbm2ddl.SchemaExport - HHH000227: Running hbm2ddl schema export
INFO org.hibernate.tool.hbm2ddl.SchemaExport - HHH000230: Schema export complete
I've tried cutting the integration test methods down to one method and the issue still occurs.
The versions I'm using are:
$ ~/apps/grails-3.1.5/bin/grails --version
|Grails Version: 3.1.5
|Groovy Version: 2.4.6
|JVM Version: 1.8.0_92
Windows 10 64bit.
Here's a thread dump.
I have no idea how to debug this further. Any ideas?
I would turn on debug level logging. Also, if you can, I would upgrade grails to something post 3.1.9. (3.1.11 is current as I write this.)
Right around grails v3.1.5 there were configuration inconsistencies between Grails and Hibernate. The grails team was quickly upgrading several interfaces, and they got through it quickly.
The result was that you didn't end up running the configuration that you thought you were. It also affected cache and transaction management.
At the time, I had to create redundant configs to make sure Grails was getting configs at one level, hibernate at another. You don't have to do this anymore, but at the time, I had to use a config like the one listed here.
To find the problem, I turned on debug logging for grails and hibernate and all of my database drivers and waded all of the way through it.
This plugin also helps with the detailed monitoring info:

sqlpackage.exe - how do I exclude synonyms?

I'm running sqlpackage.exe as part of an automated deployment script creation process. However, we have synonyms in the database which differ depending on the environment (Dev/Test/Live). The problem is that the database project has the synonyms as they are in the Dev environment, so when I run sqlpackage to compare against Test or Live, the synonyms are different and get scripted to be dropped and re-added to point to Dev.
I've seen on http://blogs.msdn.com/b/ssdt/archive/2015/02/23/new-advanced-publish-options-to-specify-object-types-to-exclude-or-not-drop.aspx that apparently there's a new parameter "ExcludeObjectType", but when I try running it using that parameter it gives me an error 'ExcludeObjectType' is not a valid argument for the 'Script' action (and I have the latest version of sqlpackage.exe).
Any ideas on what I can do here?
After downloading the latest SSDT for Visual Studio I still had the same issue. Next I downloaded Data-Tier Application Framework (May 2015) and used the new SqlPackage installed at C:\Program Files\Microsoft SQL Server\120\DAC\bin\sqlpackage.exe and the error went away and worked as expected.
Thank you sir! When I created the deployment script in VS, no change detected but when I tried to deploy using sqlpackage I got the error:
* The object [x] already exists in database with a different definition and will not be altered.
After adding the ExcludeObjectTypes switch I got the following error:
* 'ExcludeObjectTypes' is not a valid argument for the 'Publish' action.
But after downloading and installing latest Data-Tier App framework all works as expected with no errors.
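For reference, here is a hedged Python sketch of how the command line might be assembled once a SqlPackage version that accepts the property is installed. The property name `ExcludeObjectTypes` comes from the error messages above. The value `Synonyms`, the helper name and all file, server and database names are assumptions for illustration, so check them against the documentation for your installed version.

```python
def build_script_args(dacpac, server, database, output, excluded_types):
    """Build an argument list for a sqlpackage.exe Script action that
    skips the given object types (joined with ';')."""
    return [
        "sqlpackage.exe",
        "/Action:Script",
        f"/SourceFile:{dacpac}",
        f"/TargetServerName:{server}",
        f"/TargetDatabaseName:{database}",
        f"/OutputPath:{output}",
        "/p:ExcludeObjectTypes=" + ";".join(excluded_types),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    args = build_script_args(
        dacpac="MyDatabase.dacpac",
        server="TESTSQL01",
        database="MyDatabase",
        output="deploy_test.sql",
        excluded_types=["Synonyms"],
    )
    print(" ".join(args))
```

The list form can be passed straight to `subprocess.run(args, check=True)` on a machine where sqlpackage.exe is on the PATH.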

Error Message: TF255235 While manually importing TFS2008 db into TFS2012

While I was upgrading from TFS2008 to TFS2012 I received an error stating:
Upgrade Failure: "The installation and configuration of Team
Foundation Server succeeded, however upgrading the data was
unsuccessful"
I then found that you can not rerun the upgrade wizard. How can you rerun the update?
First and foremost BACK UP YOUR DATABASES! REALLY!
At this point I turned to the tfsconfig import command. (http://msdn.microsoft.com/en-us/library/vstudio/ff407080.aspx)
I specifically ran:
TFSConfig Import /SQLInstance:TFS01 (my server name)
/CollectionName:(Anything you want) /confirmed
But I then got this error message:
Errors:1 Error Message: TF255235: Database TfsVersionControl on TFS01
does not exist but the current operation requires an existing
database.
So not only did the original upgrade not work, it also killed one of my DBs. That's fine because I have a backup. So open up SSMS and kill whatever database is pointing to the "TfsVersionControl.mdf" file. Then kill the actual mdf and ldf files.
Next, restore TfsVersionControl again from the backup. At this point, we are reset back to pre-upgrade...
Now for the workaround. It is an easy but ugly one. In SSMS, make the user that is running TFS (in my case tfsService) a sysadmin. That's it.
Go back into the command window and rerun the import. About 20 minutes later, voilà, it worked perfectly.
Make sure that you remove the sysadmin permission from the user after everything is working perfectly.
I hope this helps someone.
