TFS 2015 Intermittent connection issues

For the last few weeks our office has experienced intermittent and fairly crippling connection issues with our on-premises TFS 2015 (with Update 3). When it occurs, Visual Studio becomes essentially useless, and the TFS web pages either don't load at all, load only the header toolbars, or take several minutes before eventually opening. Queries take minutes to run, and so on. Then suddenly all is well and everything works perfectly again.
There are no errors shown when this happens, either to the user or in the TFS application event logs. The system is not overloaded on any resource. I have tried various things like rebooting (obviously!), iisreset, clearing the cache on the app tier, and who knows what else at this point.
Are there any other logs I could be looking at or things I could try to diagnose?
Worth noting: users have all recently migrated to a new domain, but the TFS servers are still on the old domain. However, my office migrated long before these issues occurred; other offices that connect in have only recently moved to the new domain.
System setup is VMware 6.0, TFS app tier with separate SQL data tier and analysis database.

According to your description: "There are no errors shown when this happens, either to the user or in the TFS application event logs."
I agree with Daniel in the comments that this kind of issue is unlikely to be on the TFS server side. If the server were always slow, it would be a performance issue; however, you said that suddenly all is well and it works perfectly again, so the most likely cause is a network-related issue. I suggest your team use a network analyzer tool to troubleshoot it (the small polling sketch at the end of this answer can also help you pin down exactly when the slow windows occur).
For example, double-check the DNS side of things. Your TFS server is on one domain and some users are on another. First make sure the domains trust each other, then check whether the slow performance is being caused by an authentication issue.
One fix is to have users log in using the full domain name. For example, if they currently log in with DEV\MyUserAccount, they should instead log in with DEV.COM\MyUserAccount.
It has something to do with how the TFS server looks up accounts when a short domain name is used: it prepends the name to each of the DNS suffixes, which creates bad names and causes delays because no valid domain is ever found.
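To check whether that is happening from an affected client, you could time name resolution for the short name versus the fully qualified name. A minimal sketch in Python (the two host names below are placeholders for your TFS server's short and fully qualified names):

# Rough DNS timing check; the host names are placeholders.
import socket
import time

for host in ["tfsserver", "tfsserver.olddomain.local"]:
    start = time.perf_counter()
    try:
        addr = socket.gethostbyname(host)
        print(f"{host}: resolved to {addr} in {(time.perf_counter() - start) * 1000:.1f} ms")
    except socket.gaierror as exc:
        print(f"{host}: failed after {(time.perf_counter() - start) * 1000:.1f} ms ({exc})")

A long delay or a failure on the short name, but not on the fully qualified name, points at the suffix search list described above.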
On the performance side, you could also take a look at this great answer from jessehouwing.
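And to pin down exactly when the slow windows happen (so you can line them up with DNS, firewall, or domain controller events), a minimal polling sketch such as the following could be left running on a client machine. The URL is a placeholder for your collection URL, and authentication is deliberately ignored: an HTTP error such as 401 still counts as a prompt answer from the server.

# Polls the TFS web endpoint and logs how long each request takes.
# TFS_URL is a placeholder; replace it with your collection URL.
import time
import urllib.error
import urllib.request

TFS_URL = "http://tfsserver:8080/tfs"  # placeholder

while True:
    start = time.perf_counter()
    status = "ok"
    try:
        urllib.request.urlopen(TFS_URL, timeout=120)
    except urllib.error.HTTPError:
        pass  # e.g. 401 Unauthorized: the server still answered promptly
    except Exception as exc:
        status = f"error: {exc}"
    print(time.strftime("%H:%M:%S"), f"{time.perf_counter() - start:.2f}s", status)
    time.sleep(30)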

Related

Production website becomes unresponsive on certain pages

I have a weird issue that just started popping up for our customers. The portal they've been using for years has started freezing on some of the pages the user navigates to. I tried restarting the IIS server, the site within it, and the application pool under which the site is running. No difference.
In Chrome DevTools I can see that it is always one of the same three calls that takes a long time to complete.
When it happens, one of those three calls reports that the request has not finished.
When the call eventually completes, I can see that the Content Download took 3.8 minutes. Not sure whether it is relevant, but it is always 3.8 minutes.
Did anyone else encounter a similar situation? Is there a suggestion on how to figure out what is happening all of a sudden that triggers these type of behaviours?
TIA,
Ed
Edit: The resource that fails to load after 3.8 minutes always generates a net::ERR_CONNECTION_RESET error.
Edit 2: Thanks to all of you trying to help. A little update: I was able to isolate the problem to an issue with the server not serving some of the files, either *.css or *.js. The setup is two identical servers behind a load balancer. Apparently, the load balancer software was recently updated, and right after that we started having these issues. I am working closely with our client's IT department, trying to figure out how the newer version triggered all this drama.

How to debug an ASP.NET web application without disturbing the release build?

I'm working on an ASP.NET web application. I have deployed it on a server, and it serves responses on port 80 to outside clients.
I want to fix bugs, so I need to run this application in debug mode and attach to the worker process, but this degrades performance and disturbs the QA team.
Can I have two applications, one running in release mode so that QA activity is not disturbed, while in parallel I debug the other build to fix bugs or do further development?
I face the same problem during development: if multiple developers are working in parallel, only one can debug the application while the others have to wait.
Please suggest how I can get out of this situation.
I have only one server on which I can test this application.
This is far too long a discussion to settle here, but I will try to offer you a few ideas:
1. Each developer should develop on his own machine (sources and database should be local).
2. In order to sync your work you should use:
a. a source control solution like TFS or SVN (the latter is free) for your sources;
b. for database changes, update scripts generated with SQL Schema Compare directly from Visual Studio (you will need SQL Server Data Tools for this), Redgate SQL Compare, or another application that can compare the database structure (there are many available online, some of them free). A rough do-it-yourself sketch follows this list.
3. You should have a separate server (DB and app) for testing.
4. You should have a separate server (DB and app) for production.
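If you do not have a compare tool at hand, a rough do-it-yourself sketch is to diff the column lists of two databases via INFORMATION_SCHEMA. This assumes pyodbc is installed; the connection strings are placeholders, and it only spots missing tables and columns, not constraints or indexes:

# Rough schema drift check between two SQL Server databases; connection strings are placeholders.
import pyodbc

DEV = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=devbox;DATABASE=MyApp;Trusted_Connection=yes"
TEST = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=testserver;DATABASE=MyApp;Trusted_Connection=yes"

def columns(conn_str):
    # Collect (table, column, data type) for every column in the database.
    conn = pyodbc.connect(conn_str)
    rows = conn.cursor().execute(
        "SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS"
    ).fetchall()
    conn.close()
    return {tuple(row) for row in rows}

dev, test = columns(DEV), columns(TEST)
print("Only in dev: ", sorted(dev - test))
print("Only in test:", sorted(test - dev))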
You say you have one server to test the application. But I suppose each developer has his own computer, right? In this case you need to skip #3 and use the same server for testing and production, but with different databases and applications.
I suggest you check this website for similar answers (see Best practice for test and production environments for example) to find the best solution that applies in your case.

Heroku “psql: FATAL: remaining connection slots are reserved for non-replication superuser connections”

I got the above error message running Heroku Postgres Basic (as per this question) and have been trying to diagnose the problem.
One of the suggestions is to use connection pooling but it seems Rails has this built in. Another suggestion is that the app is configured improperly and opens too many connections.
My app manages all its connections through ActiveRecord, and I had one direct connection to the database from Navicat (or at least I thought I had).
How would I debug this?
RESOLUTION
Turns out it was a Heroku issue. From Heroku support:
We've detected an issue on the server running your Basic database.
While we pinpoint this and address it, we would recommend you
provision a new Basic database and migrate over with PGBackups as
detailed here:
https://devcenter.heroku.com/articles/upgrade-heroku-postgres-with-pgbackups. That should put your database on a new server. I apologize for this
disruption – we're working to fix this issue and prevent it from
occurring in the future.
This has happened a few times on my app: somehow there is a connection leak, and then all of a sudden the database is getting 10 times as many connections as it should. If you are being swamped by a leak like that rather than by traffic, try running this:
heroku pg:killall
That will terminate all connections to the database, so be careful if it would be dangerous in your situation to cut off queries mid-flight. I just have a Rails app, and if it goes down, losing a couple of queries is not a big deal, because the browser requests will have long since timed out anyway.
You might be able to find out why you have so many connections by inspecting the pg_stat_activity view:
SELECT * FROM pg_stat_activity
Most likely, you have some stray loop that opens new connections without closing them.
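If you would rather watch this from a script, here is a minimal sketch that counts connections per application and state, which makes a leak stand out quickly. It assumes psycopg2 is installed, DATABASE_URL points at the Heroku database, and the server is PostgreSQL 9.2 or later (where pg_stat_activity has a state column):

# Counts open connections per application and state; DATABASE_URL is assumed to be set.
import os
import psycopg2

conn = psycopg2.connect(os.environ["DATABASE_URL"], sslmode="require")  # Heroku requires SSL
cur = conn.cursor()
cur.execute("""
    SELECT application_name, state, count(*)
    FROM pg_stat_activity
    GROUP BY application_name, state
    ORDER BY count(*) DESC
""")
for app, state, count in cur.fetchall():
    print(app or "<no name>", state, count)
conn.close()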
To save you the support call, here's the response I got from Heroku Support for a similar issue:
Hello,
One of the limitations of the hobby tier databases is unannounced maintenance. Many hobby databases run on a single shared server, and we will occasionally need to restart that server for hardware maintenance purposes, or migrate databases to another server for load balancing. When that happens, you'll see an error in your logs or have problems connecting. If the server is restarting, it might take 15 minutes or more for the database to come back online.
Most apps that maintain a connection pool (like ActiveRecord in Rails) can just open a new connection to the database. However, in some cases an app won't be able to reconnect. If that happens, you can heroku restart your app to bring it back online.
This is one of the reasons we recommend against running hobby databases for critical production applications. Standard and Premium databases include notifications for downtime events, and are much more performant and stable in general. You can use pg:copy to migrate to a standard or premium plan.
If this continues, you can try provisioning a new database (on a different server) with heroku addons:add, then use pg:copy to move the data. Keep in mind that hobby tier rules apply to the $9 basic plan as well as the free database.
Thanks,
Bradley

Ruby/RoR development: locally or server

Our company has started developing its own systems in-house. We already have a couple of developers who will be responsible for writing code in Ruby/RoR.
We are currently discussing infrastructure and I would like to ask: should we develop everything on local machines, then push it to a test server and later to production, or develop everything on a development/test server, then publish it for testing and later to production?
Just an update to the description above: by "local machines" I meant developers' desktops, and the test/development server is a machine in our office.
It's a valid question, and as such there's a trade-off to consider.
Generally: work locally. Web app development has a natural flow that leads developers to save and refresh browsers many times an hour. All the time you save on network latency will add up and be less frustrating for the developers.
There are downsides to working locally, however: you'll need to make sure that your setup is EXACTLY as it will be on the testing/production servers. That means everything down to your kernel version, Apache version, and Ruby/Rails version. DNS is easy, but it too must mimic the live situation perfectly in order for AJAX calls etc. to work seamlessly.
Even if you ensure all of the above, you will likely have to make a few minor changes when you move the app to a live server, there just always seems to be something in my experience.
Also, running on a live server isn't SO painful for a developer. Saving a source file from a text editor/IDE via FTP should take less than a second even over the internet, and refreshing a remote browser session will give your UI designers a better feel for the real user experience and flow. If you use SVN rather than FTP much the same applies.
Security isn't much of a concern: lock down FTP and SSH to the office IP, but have a backdoor available so that if a developer needs to edit a source file from somewhere else, the firewall can temporarily be opened to their own IP.
I have developed PHP and Rails apps on a remote test server, on an in-office server and on a local machine. After many years doing each, I can say that as a developer, I don't mind any so much.
As a developer, my suggestion is to do all development work on your local machine first. After testing, send it to the client to make it live.
I'm working as a Ruby on Rails web developer at andolasoft.com, and we follow the same procedure. Hope you got the idea.
Thanks

ASP.Net MVC Poor Production Performance

We have an MVC 3 application that has been deployed onto a newly built Windows 2008 R2 Web Edition server and is performing badly.
This application has been through development, quality assurance and user acceptance testing cycles on the same operating system (different boxes) with no performance issues.
The only difference we can see with the server is that it sits in the DMZ and as such has two network adapters configured, one for the internet, and one to punch through the firewall.
We have put all sorts of logging into the application and confirmed that up until the 'return ActionResult' everything is working correctly (i.e. ~500 ms). It then takes 15 seconds to render the page.
We have tried setting debug=false in the config file. I'm not sure what else to look for here; it seems like an environment issue.
Any suggestions, please? I am about to investigate whether the thread pool size could be causing problems.
Also, if it helps, the page uses multiple partial views; I have read about others having problems with them.
Thanks,
Matt
Since the application performs OK in other environments, I would suggest you investigate the following:
Database: are you running against a different database? How long do the queries take to execute? If you have a non-optimized database with a million records in production and only a few records in test, you won't find the performance problems soon enough.
Network: what is the latency between the web box and the database? If you lose 100 ms on each database query just because of the network, then a page that triggers 50 queries has lost 5 seconds (see the sketch below for a quick way to measure this). I've seen poorly configured routers / load balancers do just that.
Try profiling each component of your system (db, network, web box) in order to find out where you're wasting all that time. Try http://code.google.com/p/mvc-mini-profiler/.
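To put a number on the network cost per query (independently of the application), something like the following could be run from the web box. It assumes pyodbc is installed, the connection string is a placeholder, and it times a trivial query, so the result is essentially round-trip latency:

# Measures database round-trip latency from the web box; the connection string is a placeholder.
import time
import pyodbc

CONN_STR = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dbserver;DATABASE=MyApp;Trusted_Connection=yes"

conn = pyodbc.connect(CONN_STR)
cur = conn.cursor()

samples = []
for _ in range(50):
    start = time.perf_counter()
    cur.execute("SELECT 1").fetchall()  # trivial query: measures the round trip, not query cost
    samples.append((time.perf_counter() - start) * 1000)
conn.close()

samples.sort()
print(f"min {samples[0]:.1f} ms, median {samples[25]:.1f} ms, max {samples[-1]:.1f} ms")

If the median is tens of milliseconds rather than one or two, multiply it by the number of queries a slow page issues and you will likely have found your 15 seconds.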
P.S. You MUST have debug="false" (in the <compilation> element of web.config) in your production environment.
