TFS check-in timeout of a changeset containing "larger" binary files

I'm performing a TFS Integration migration from tfs.visualstudio to an on-premises TFS 2012 server. I'm running into a problem with a particular changeset that contains multiple binary files in excess of 1 MB, a few of which are 15-16 MB. (I'm working remotely over a WAN against the on-premises TFS server.)
From the TFSI logs, I'm seeing:
Microsoft.TeamFoundation.VersionControl.Client.VersionControlException: C:\TfsIPData\42\******\Foo.msi: The request was aborted: The request was canceled.
---> System.Net.WebException: The request was aborted: The request was canceled.
---> System.IO.IOException: Cannot close stream until all bytes are written.
Doing some Googling, I've encountered others running into similar issues, not necessarily involving TFS Integration. I'm confident the same issue would arise if I were checking in a comparable changeset the normal way. As I understand it, when uploading files (checking in), the default chunk size is 16 MB and the timeout is 5 minutes.
My internet upload speed is only 1 Mbit/s at this site. (While sufficient upload bandwidth would mitigate the problem, it wouldn't actually solve it.)
Using TCPView, I watched the connections from my client to the TFS server while the upload was in progress. What I see is 9 simultaneous connections, so my bandwidth is getting shared among 9 file uploads. Sure enough, after about 5 minutes the connections crap out before the uploads can finish.
My question is, how can I configure my TFS client to utilize fewer concurrent connections, and/or smaller chunk sizes, and/or increased timeouts? Can this be done globally somewhere to cover VS, TF.EXE, and TFS Integration?

After spending some time with IL DASM poking around in the FileUploader class of Microsoft.TeamFoundation.VersionControl.Client.dll, I discovered the string VersionControl.UploadChunkSize in the constructor. It looks like it is used to override the default chunk size (DefaultUploadChunkSize = 0x01000000, i.e. 16 MB).
So, I added this to TfsMigrationShell.exe.config:
<appSettings>
<add key="VersionControl.UploadChunkSize" value="2097152" />
</appSettings>
and ran the VC migration again -- this time it got past the problem changeset!
Basically, the TFS client DLL will try to upload multiple files simultaneously (9 in my case). Your upload bandwidth is split among the files, and if any individual file transfer cannot complete a 16 MB chunk within 5 minutes, the operation fails. To put numbers on it: 1 Mbit/s split nine ways is roughly 0.11 Mbit/s per connection, which moves only about 4 MB in 5 minutes (well short of a 16 MB chunk, but comfortably more than a 2 MB one). So you can see that with modest upload bandwidth, changesets containing multiple binary files can time out. The only thing you can control is the byte count of each 5-minute timeout chunk. The default is 16 MB, but you can reduce it; I reduced mine to 2 MB.
I imagine the same setting could be added to devenv.exe.config to deal with the same problem when performing normal developer check-ins.
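I haven't tried it, but presumably the entry would look the same (whether Visual Studio honors this key for regular check-ins is an assumption on my part):
<appSettings>
<add key="VersionControl.UploadChunkSize" value="2097152" />
</appSettings>
Hopefully this information will help somebody else out and save them some time.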

Related

How do I store files greater than 5GB in size on S3 using ActiveStorage?

I have an application hosted on Heroku. Part of its job is to store and serve up files of varying sizes, zipped up in bundles. (We're planning to phase out the bundling process at a later date, but that's going to be a major revamp of the consuming software.)
The 5GB limit on file uploads to S3 (anything larger requires S3's multipart upload, which ActiveStorage doesn't currently use) is becoming increasingly untenable for our use case. In fact, it's become an outright pain and is simply unacceptable for the business model.
Rails 6.1 is supposed to fix this, but we can't wait for it to come out, especially since there isn't an ETA on it yet. I tried using the alpha version from master and got hit with an error about not being able to load CoffeeScript (which is weird, since I don't use CoffeeScript).
I'm now trying to find other viable alternatives that will allow our application to store files of 5GB or larger. I'm experimenting with compressing the files, but that isn't a long-term solution either.
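One alternative I'm considering is skipping ActiveStorage for these bundles and calling the S3 SDK directly, since the aws-sdk-s3 gem can do multipart uploads on its own. A rough, untested sketch (region, bucket, and key names are made up):
require "aws-sdk-s3"

# upload_file transparently switches to a multipart upload for large files,
# which is how objects bigger than the 5GB single-PUT limit get stored
s3 = Aws::S3::Resource.new(region: "us-east-1")                    # made-up region
object = s3.bucket("my-app-bundles").object("bundles/archive.zip") # made-up bucket/key
object.upload_file("/tmp/archive.zip")
The downside is that we'd lose ActiveStorage's bookkeeping for those records, so it wouldn't be a drop-in fix.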

Very slow POST request with Ruby on Rails

There are two of us working on a Ruby on Rails website that receives GPS coordinates sent by a tracking system we developed. The tracking system sends 10 coordinates every 10 seconds.
We have 2 servers for testing our website, and we noticed that one server processes the 10 coordinates very quickly (less than 0.5 s) whereas the other takes at least 5 seconds (and up to 20 seconds). We are supposed to use the "slow" server for production, which is why we are trying to solve this issue.
Here is an image showing the response time of the slow server (8593 ms at the bottom):
[Image: Slow Server]
The second image shows the response time of the "quick" server:
[Image: Fast Server]
The version of the website is the same on both servers; we deploy it via GitHub.
We can easily reproduce the problem by sending fake coordinates with Postman, and the time difference between the two servers stays the same, so in my opinion the problem does not come from our tracking system.
I'm here to find out what could cause such a difference. I guess it could be a problem with the server itself, or with some settings that are not deployed via GitHub.
We use SQLite3 for our database.
However, I do not even know where to look to find the possible differences...
If you need further information (such as the lscpu output => I am limited to 2 links...) in order to help me, please do not hesitate to ask. I will reply very quickly, as I work on this all day long.
Thank you in advance.
EDIT: here is the output of the lscpu command on each server.
Fast Server:
Slow Server:
Maybe one big difference is the L2 cache...
My guess is that the answer is here, but how can I find out my current value of PRAGMA synchronous, and how can I change it?
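From what I've read, it looks like it can at least be inspected and changed from the sqlite3 shell (I haven't confirmed how Rails itself sets it, and I'm assuming the default Rails database path):
sqlite3 db/production.sqlite3 "PRAGMA synchronous;"        # prints the current value (2 = FULL)
sqlite3 db/production.sqlite3 "PRAGMA synchronous = OFF;"  # per-connection only, so the Rails process would not pick this up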
The size of the .sqlite3 file I use is under 1 MB for the tests. Both databases should be identical according to my schema.rb file.
The provider of the "slow" server solved the problem; however, I do not know the details. Something was consuming memory and slowing everything down.
By "virtual server" they mean that several servers run on the same physical machine, with each one allocated a share of that machine.
Thanks a lot for your help.

Icinga - check_yum - Socket Timeout?

I'm using the check_yum plugin in my Icinga monitoring environment to check whether security-critical updates are available. This works quite well, but sometimes I get a "CHECK_NRPE: Socket timeout after xx seconds." error while executing the check. Currently my NRPE timeout is 30 seconds.
If I re-schedule the check a few times, or execute it directly from my Icinga server with a higher NRPE timeout value, everything works fine, at least after a few executions of the check. All other checks via NRPE don't throw any errors, so I don't think there is a general problem with my NRPE config or the plugins I'm using. Is there some explanation for this strange behaviour of the check_yum plugin? Maybe some caching issue on the monitored servers?
First, be sure you are using the 1.0 version of this check from: https://code.google.com/p/check-yum/downloads/detail?name=check_yum_1.0.0&can=2&q=
The changes I've seen in that version could fix this issue, depending on its root cause.
Second, if your server(s) are not configured to use all 'local' cache repos, then this check will likely time out before the 30-second deadline, because: 1) the amount of data from the refresh/update is pretty large and may take a long time to download from remote servers (including Red Hat's own), and 2) most of the 'official' update servers tend to go offline a LOT.
The best solution I've found is to have a cron job perform the update check at a set interval (I use weekly) and write a log file listing the security patches the system(s) require. Then use a Nagios check, via a simple shell script, to see whether that file has any new items in it; a rough sketch follows.
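For example (the paths, schedule, exact yum invocation, and grep pattern are illustrative only; adjust them for your distro and the output format of your yum version):
# /etc/cron.d/yum-security -- weekly job that records pending security updates
0 3 * * 1 root /usr/bin/yum -q updateinfo list security > /var/log/yum-security.log 2>&1

#!/bin/sh
# check_yum_security_log.sh -- hypothetical NRPE/Nagios check against that log
LOG=/var/log/yum-security.log
if [ ! -f "$LOG" ]; then
    echo "UNKNOWN: $LOG not found (has the cron job run yet?)"
    exit 3
fi
# 'Sec.' appears in updateinfo output lines; adjust the pattern as needed
COUNT=$(grep -c 'Sec\.' "$LOG")
if [ "$COUNT" -gt 0 ]; then
    echo "WARNING: $COUNT pending security update(s) listed in $LOG"
    exit 1
fi
echo "OK: no pending security updates"
exit 0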

IIS7 "Not enough storage is available to process this command" intermittent error in HTTPHandler

So, I have a website that serves videos via an HTTP handler, which is our security layer. Some clients have reported that the videos intermittently don't work. I was finally able to reproduce the issue, and our logging code reports success in validating the user; then the line:
Response.WriteFile(filename); // Where this is the path to a video of about 32 MB
throws the above exception. I found the actual error by viewing the request and response with Fiddler. But the server has 2 GB of memory free, and the videos started working again an hour or so later (which probably equates to fewer people using the server, though nothing was changed on it). We run two websites on this machine, and the other never has issues like this, but it also doesn't use a layer like this where .NET code is responsible for writing the file. I don't see any settings that allow me to change the available memory, nor has Google turned up anything useful. Any suggestions appreciated.
I should add that I stopped and started, and then restarted, my site; in the past I've had issues that were solved in the short term by doing this. It did not help this time.
I just ran into this problem trying to Response.WriteFile a ~170 MB pdf.
In my case, using Response.TransmitFile instead worked, (maybe) because it "writes the specified file directly to an HTTP response output stream, without buffering it in memory."
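For reference, the change inside the handler is a one-liner; a minimal sketch (the class name, file path, and content type are made up) might look like:
using System.Web;

// Hypothetical secured media handler: validate the caller, then stream the file.
public class SecureVideoHandler : IHttpHandler
{
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        // ... the security/validation layer would run here ...

        string filename = context.Server.MapPath("~/media/sample.mp4"); // made-up path

        context.Response.ContentType = "video/mp4";
        // TransmitFile streams the file to the client without buffering the
        // whole thing in memory, unlike Response.WriteFile.
        context.Response.TransmitFile(filename);
    }
}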

TFS - Cannot Check-In "big" files

I've been working with TFS for months without a problem, but now suddenly I can't check in "big" files (2 KB files are OK, but 50 KB files or multiple files are not). TFS is hosted on a server in the same network.
When I try to check in, it gives me an error like: "Check In: Operation not performed : The underlying connection was closed: An unexpected error occurred on a receive. Please refer to the Output window for more information". The "more information" in the Output window is just the same error.
The Event Viewer on the server shows nothing, and I've been searching Google for the past couple of hours and have turned up nothing yet.
The error message The underlying connection was closed is an indicator that something between the client and the server is dropping the connection unexpectedly.
Some things to investigate/try:
Is the application pool on the server restarting? Look at the Application event log on the application-tier (AT) server. Look for ASP.NET and W3SVC warnings/errors that indicate the app pool has restarted.
How is the client connecting to the server? Is there an HTTP proxy in the middle? Is the server behind a load balancer or firewall device? What is the idle timeout set to? Is it honoring HTTP keep-alive settings?
Is it failing for all clients? Can you check in the same file on the TFS server itself?
If none of those seem to lead you in the right direction, you'll need to set up a Fiddler or NetMon trace on your client and/or your server.
We had a similar problem in TFS 2008, where the lock table was getting big and causing problems. Take a look at tbl_lock in your TfsVersionControl database; it should contain a very small number of rows.
In our experience, when the row count approached or exceeded 1,000, we started seeing significant issues with check-ins.
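A quick way to check is to count the rows (run against the TfsVersionControl database; the table name is as mentioned above):
SELECT COUNT(*) FROM tbl_lock;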
Our resolution to this was to turn on merging for the binary files that we were storing.
