TFS 2010 Build: Sporadic failure in the process - tfs

We have a situation where our builds have stopped executing in a stable manner.
At a rate of about one every three we receive either TF215096 or TF215097 errors & the Build fails.
If we then restart the Build controller, it works again - until next time.
The errors we get are:
TF215096: An error occurred while connecting to controller vstfs:///Build/Controller/1: There was no endpoint listening at ht*p://XXXX that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.
TF215096: An error occurred while connecting to controller XXX - Controller: Could not connect to ht*p://XXX. TCP error code 10061: No connection could be made because the target machine actively refused it 192.168.XXX.XXX:XXX.
TF215097: An error occurred while initializing a build for build definition \XXX: Team Foundation services are not available from server ht*p://XXX. Technical information (for administrator): The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.
TF215097: An error occurred while initializing a build for build definition \YYY: An error occurred while receiving the HTTP response to ht*p://XXX. This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details.
Server logs provide with little info, at least we 've found nothing that helps us resolve the situation. Various searches in the Net were also not productive.
Does anybody had these/similar issues? Any ideas on how/where to look for a resolution?
Thank you very much in advance for any input!

Yeah it does sound like you have some connectivity issues. You can try enabling SOAP tracing on both the build machine and the server (if possible) to see if there is any error. If it still does not give you any new information, contact Microsoft by filing a Connect Bug to get help.

I am not sure if it will help you but I have ran into similar issues with build agents and ended up just deleting and re-creating the agent. You may try deleting your controller/agent and adding it back in. A brute-force solution but a good starting point. If that doesn't resolve the issue at least you can eliminate the controller/agent as the issue and take a look at network/server related issues.

Today is a happy day, since we managed to get to the bottom of the matter. Sorry #Duat that I'm taking away the 'answer' checkmark - but it turned out that the problem was quite different from what you (and anybody else) has predicted.
In my last update I was about to forward the matter to MS, when we realized that our Firewall was misbehaving in the name resolution. So we assumed this was the culprit & awaited for this to resolve. After this was resolved, we STILL had the same issues and we went again re-examining the situation.
We isolated the problem within our Build Process, more specific with a custom code activity included in our build solution.
I had implemented a code activity that would kick in at the final steps of every build. This activity was about gathering BuildDetails about the running build & add them as a new line in a 'BuildLog.xls'. Implementation made use of Microsoft.Office.Interop.Excel.This excel sheet resides in another server (NOT on the Servers where the controller/agents reside).
During development of this activity I was faced with issues like this, but after I was done no instances of EXCEL were left hanging. So I thought this was done & dealt with.
With try & error, we observed that when this activity wouldn't ran, no problems would occur.
With this activity running, the very first build after a build-controller reset would succeed, any next build had a certain chance to fail. Once any build failed, no other would succeed until another build-controller reset.
I have only a general understanding of what the problem was (Excel-call is DCOM, TFS services are WCF : How on earth would they interfere?! Why would this sometimes succeed and sometimes fail?! ).
The provided diagnostics were no help either, in fact they mislead us into a loop that continued for months.
If I ever find the time, I 'd like to cleanly reproduce the error & make a Server Fault question out of it...
After removal of this activity it works! I now searched in SO & found this, where J.Saunders comments: "In general, you should never use Office Interop from a server environment". It's ironic that once you get to the bottom of any difficult issue, the whole universe seems to have known about it except you...

Related

Umbraco: Restoring content. Could not get parent with Id

I'm working on an Umbraco cloud project. I pulled the website from the git repositories and built it. First thing to do there when you run the site is to restore the content that's in the development environment to the local project so we can create new features. Yet Umbraco fails to do so with the following error:
The source environment has thrown a Umbraco.Deploy.Exceptions.ProcessArtifactException
with message: Process pass #3 failed for artifact
umb://document/xxthexguidxofxsomexpagexxxxxxxxx. It might have been
caused by an inner Umbraco.Deploy.Exceptions.EnvironmentException with
message: Could not get parent with id xxthexxx-guid-xofx-xthe-xxhomepagexx.
The following artifacts might be involved:
umb://document/xxthexxxguidxofxxthexxhomepagexx
The technical details may contain more information.
I've noticed that I some strange errors occur if not everything is deployed in the development site in the cloud. So I made sure everything is published.. Still errors though... I'm kinda lost here.
Has anyone come across simular issues? And how did you fix it?
Thanks in advance?
This can happen for a number of reasons, so it's a bit hard to say what exactly the problem is in your case.
Most of the time this happens due to either a circular reference of some sort causing a state that can't really be restored. For example that could be a datatype having a dependency on a node - but the node doesn't exist in a blank new environment. The content restore then refuses to start until the structural data (datatypes, contenttypes and such) is completely in sync, but the datatypes will never be able to be in sync until the content node exists. It's a sort of catch22 situation that might need to be resolved manually.
I would suggest you contact support through the Cloud portal and they will assist you in getting your problem resolved.

SonarQube Service Starts, Runs and then Stops?

I have a Windows 2012 R2 server and I managed to install the SonarQube 5.4 server as a Windows Service. I also set up a user so the service can actually start without the infamous "It started then stopped" error a lot of people seem to get. Before installing the server as a windows service, I checked that it worked using StartSonar.bat and it did work just fine, so I was confident when I made it into a service.
But when I try to access http://localhost:9000 there is nothing there, and it appears that shortly after starting the service it stops without any message at all. I can't tell if this is because I try to access the site (which gives me ERR_EMPTY_RESPONSEin Google Chrome) or if it just closes down after a short while.
Anyone got any insight?
I'm a beginner. I came across the same issue and fixed it.
Ensure that the database is running.
My log file (located at sonarqube_home_dir/logs/sonar.log) included the following statement.
Caused by: org.h2.jdbc.JdbcSQLException: Wrong user name or password [28000-176]
Since I'm using the default database, I commented below lines
#sonar.jdbc.username=***
#sonar.jdbc.password=***
at sonarqube_home_dir/conf/sonar.properties.
This must happen due to many reasons like connection problems, permission problems so First, you have to see the logs. /sonarqube-7.6/logs$ tailf sonar.log. then you can find the reason. Once I had the same problem so I did like that. my error is something Directory does not exist: lib/jdbc/mysql
org.sonar.process.MessageException: Directory does not exist: lib/jdbc/mysql reason is I uninstall MySQL and remove all folders name contains "MySQL".
just check whether port 9001 already in used, stop it if already in used.

TFS - Build Service Starts and then throws a HTTP code 500: System.ServiceModel.ServiceActivationException

I have been working on restoring a build server (tfs 2012) from a backup and all manner of things got messed up (the tfsservice account password had been altered and I had to go to every service and app pool on the box and update it). Once sql was backup I was able to update the password via the TFS admin console app. Then I was able to re-register the build service and add a controller and a build agent. It starts briefly and shows green for a few seconds before stopping and a "details ..." button appears next to the Build Service. If I click the details button I see the following
"Please contact your administrator. There was an error contacting the server. Technical information (for administrator): HTTP code 500: System.ServiceModel.ServiceActivationException"
I have checked the http bindings in iis for the tfs site and there is only the one "*:8080"
I tried hardcoding it to the ip on the box and I still get the same error. If I go to one of the client machines and try and queue a build it shows the build server as being offline.
I have also checked for multiple host headers and the memory utilization which are the most common responses to this particular issue. Neither of them seem to be the cause or the solution.
Any ideas or suggestions are welcome I have run out of ideas to try here. Thanks in advance for any help you have to offer.
EDIT -- also found this in the log: Build machine MyMachine lost connectivity to message queue tfsmq://buildservicehost-25/.

ILOG - version 8.0.1

Sometimes when the rules are deployed from the decision center to RES, although the recent changes are visible in the new archive, on RES, but the execution results don't reflect them. It is as if the changes are not recognized at execution time. A second deployment without any changes to the rules, will fix the situation. Can somebody explain why this is happening?
You can try couple of things -
The XU MBean ruleset archive changed/modified notification might be failing. Check if you have the necessary access for this notification. You can try logging into the RES and Diagnostics->Run Diagnostics; and see if there are any errors. Also, you can see the ODM server logs for any errors, when you deploy the ruleset.
Check if there is any caching issue

Starting a windows service fails with error 1053

I have a windows service that is failing to start, giving an error "Error 1053: The service did not respond to the start or control request in a timely fashion".
Running the service in my debugger works fine, and if I double click on the the service .exe on the remote machine a console window pops up and continues to run without problem - I can even see log messages showing me that the program is processing everything the way it should be.
The service had been running fine previously, though this is my first time, personally, trying to deploy it with the most recent changes made to the program. I've evaluated those changes and cant figure out how they might cause this problem, particuarly since everything runs fine when not started as a service.
The StartRoutine() method of the service impelmentation is empty, so should be returning in a "timely fashion".
I've checked the event logs on the computer, and it doesn't give any additional information other than it didn't hear back from the service in the 30 second requisite time frame.
Since it works on my machine, and as a double-clicked executable, how would I go about figuring out why it fails as a service?
Oh, and it's .NET 2.0, so it shouldn't be affected by the 1.1 framework bug that exhibited this symptom (http://support.microsoft.com/kb/839174)
The box is a windows server 2003 R2 machine running SP2.
This is a misleading error. It's probably an unhandled exception.
Empty your OnStart() handler then try this in your constructor...
public MainService()
{
InitializeComponent();
try
{
// All your initialization code goes here.
// For instance, my exception was caused by the lack of registry permissions
;
}
catch (Exception ex)
{
EventLog.WriteEntry("Application", ex.ToString(), EventLogEntryType.Error);
}
}
Now check the EventLog on your system for your Application Error.
Could be a number of things and it might help to get a stack trace on the machine exhibiting the problem. There are a number of ways to do this but the point is that you have to see where this is failing in the code.
You can do this with remote debugging, but a simple thing might be to just log to the event logger, or file log if you have that. Literally, putting "WriteLine("At class::function()") throughout portions of the code to see if you've made it there.
This will at least get you looking in the right direction (which ultimately is the code).
Update:
See Microsoft's How to Debug Windows Services article for details in troubleshooting startup problems using WinDbg.
This related question details nice ways to debug services that are written in .NET.
I agree with Scott, the easiest way to find out what's happening is to put some traces in the start-up code (maybe it doesn't even come to your start-up code).
If this doesn't help, you can post your code here so others can take a look.
perhaps lacking some dependence, try this :
- deregister your service
- register again
If fail at register means that lack an module.
If the StartRoutine is empty, you are probably starting it somewhere else.
IIRC you need to fire off a worker thread, and then return from StartRoutine.
One of the problems which may lead to this error is if windows service which needs to be deployed consists of some error i.e it may be simple authorization error or anything as in my case I have referenced some folders and files for logging which were not existing, but when provided the right path of those file and folders it solved my problem.
I ran through every post on this particular subject and none of the responses solved the problem, so I'm adding this response in case this helps someone else. Admittedly this only applies to a new service, not this specific case.
I was writing a File listening service. As a console app, it worked perfectly. When I ran it as a service, I got the same error as above. What I didn't know (and many of the MSDN articles about services conveniently leave out) is that you need to have your class executed from within ServiceBase.Run( YourClassName());. Otherwise, your app executes and immediately terminates and because it terminated, you get the error above even if no error or exception occurred. Here is a link to an article about this. It actually discusses setting up your app for dual use - Console app and service: Create a combo command line / Windows service app
I had that issue and the source of my problem was config file. I edited it in notepad and notepad added one special character which cause service not to run properly because config file was ruined. I saw that special character in notepadd++ and after delete it, service started to run successfully as previous did.
In my case, the correct .NET framework was not installed on the server that I was installing the Windows service on.
One other reason is If you copy the DLL in 'debug' mode to installation folder this issue will come.What you need to do is Run the project in 'Release' mode copy the DLL or directly form Release folder rather than Debug folder,,and copy that DLL in to installation folder,it will work.You can see the reduction in size of DLL ,it will not contain any debug symbols and like that

Resources