Why is GZIP Compression of a Request Body during a POST method uncommon?

I was playing around with GZIP compression recently, and the way I understand it, the flow is the following:
The client requests some files or data from a web server, and also sends a header saying "Accept-Encoding: gzip".
The web server retrieves the files or data, compresses them, and sends them back GZIP-compressed to the client. The web server also sends a "Content-Encoding: gzip" header to tell the client that the data is compressed.
The client then decompresses the data/files and loads them for the user.
I understand that this is common practice, and it makes a ton of sense when you need to load a page that requires a lot of HTML, CSS, and JavaScript, which can be relatively large and add to the page's load time.
However, when I looked further into this, I couldn't find why it is not common to GZIP-compress the request body of a POST call. Is it because request bodies are usually small, so the time it takes to decompress them on the web server outweighs the time saved by sending a smaller request? Is there some sort of document or reference about this?
Thanks!

It's uncommon because in a client-server relationship, the server sends the bulk of the data to the client, and as you mentioned, the data coming from the client tends to be small, so compression rarely brings any performance gains.
In a REST API, I would have said that big request payloads are common, but apparently the Spring Framework, known for its REST tools, disagrees: it explicitly says in its docs here that you can set the servlet container to do response compression, with no mention of request compression. As the Spring Framework's mode of operation is to provide functionality that it thinks lots of people will use, its developers obviously didn't feel it worthwhile to provide a servlet Filter implementation that we users could employ to read compressed request bodies.
It would be interesting to trawl the user mailing lists of Tomcat, Struts, Jackson, Gson, etc. for similar discussions.
If you want to write your own decompression filter, try reading this: How to decode Gzip compressed request body in Spring MVC
Alternatively, put your servlet container behind a web server that offers more functionality. People obviously do need request compression enough that web servers such as Apache offer it - this SO answer summarises it well already: HTTP request compression - you'll find the reference to the HTTP spec there too.
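If you do decide to write your own decompression filter (as the linked question above does for Spring MVC), a minimal sketch in plain servlet terms might look like the following. This assumes the Servlet 3.1 API; the class name is my own, and registration with the container (web.xml or @WebFilter) is left out.

import java.io.IOException;
import java.util.zip.GZIPInputStream;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

// Hypothetical filter (not part of Spring): if the client declared
// "Content-Encoding: gzip", wrap the request so downstream code reads
// a transparently decompressed body.
public class GzipRequestFilter implements Filter {

    @Override
    public void init(FilterConfig cfg) { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpReq = (HttpServletRequest) req;
        String encoding = httpReq.getHeader("Content-Encoding");
        if (encoding == null || !encoding.toLowerCase().contains("gzip")) {
            chain.doFilter(req, res);  // uncompressed request: pass through untouched
            return;
        }
        chain.doFilter(new HttpServletRequestWrapper(httpReq) {
            @Override
            public ServletInputStream getInputStream() throws IOException {
                final GZIPInputStream gzip = new GZIPInputStream(httpReq.getInputStream());
                return new ServletInputStream() {
                    @Override public int read() throws IOException { return gzip.read(); }
                    // Simplified for the sketch; a real implementation should track EOF properly.
                    @Override public boolean isFinished() { return false; }
                    @Override public boolean isReady() { return true; }
                    @Override public void setReadListener(ReadListener listener) { }
                };
            }
        }, res);
    }

    @Override
    public void destroy() { }
}

You would also have to agree with your clients that they only send gzip-compressed bodies to endpoints sitting behind such a filter, since, as the next answer points out, a client has no standard way to discover that the server supports it.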

Very old question, but I decided to resurrect it because it was my first Google result and I feel the only existing answer is incomplete.
HTTP request compression is uncommon because the client can't be sure the server supports it.
When the server sends a response, it can use the Accept-Encoding header from the client's request to see if the client would understand a gzipped response.
When the client sends a request, it can be the first HTTP communication so there is nothing to tell the client that the server would understand a gzipped request. The client can still do so, but it's a gamble.
Although very few modern HTTP servers would fail to understand gzip, configuring them to apply it to request bodies is still very uncommon. On nginx, at least, it looks like custom Lua scripting is required to get it working.
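To make the "gamble" concrete, here is a minimal client-side sketch (in Java; the URL and payload are placeholders) that compresses its POST body and declares it with Content-Encoding: gzip. It only works if the server has been explicitly configured to decompress request bodies, for example with a filter like the one sketched in the previous answer.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical client: sends a gzip-compressed JSON body and declares it via
// "Content-Encoding: gzip". If the server was not set up for this, it will see
// compressed bytes where it expects plain JSON.
public class GzipPostClient {
    public static void main(String[] args) throws Exception {
        byte[] payload = "{\"example\":\"payload\"}".getBytes(StandardCharsets.UTF_8);

        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8080/api/things").openConnection();  // placeholder URL
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Content-Encoding", "gzip");

        try (OutputStream out = conn.getOutputStream();
             GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(payload);  // the body on the wire is the compressed bytes
        }

        System.out.println("Server responded with HTTP " + conn.getResponseCode());
    }
}

If the server has not been prepared for this, it will typically try to parse the compressed bytes as if they were the plain payload and fail, which is exactly why clients rarely take the risk.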

Don't do it, if for no other reason than security: firewalls have a hard (or impossible) time inspecting compressed input data.

Related

POST with TIdHTTP hangs on retrieving the JSON response

This question is maybe more of a tip for people searching for a solution to the same problem (as I eventually found the solution).
I had an application that makes some HTTP requests to a local server (a mix of GET/POST with JSON content in the request/response bodies). The server is a third-party application, and after I upgraded it to a recent version, my Delphi app no longer worked.
It turned out that it was now hanging on the statement:
IdHTTP.Post(URL, Payload, BytesStreamResult);
Since a manual Postman request was still working, the problem had to be on the Delphi client side.
Further isolating the issue showed that the HTTP POST request did get an HTTP 200 response with valid HTTP response headers, but then was getting stuck reading the response body. It was hanging on:
IOHandler.ReadLn
When I compared the headers with the POSTMAN response, I noticed that 'Transfer-Encoding: chunked' was missing in the Delphi response.
Finally, I noticed the code related to TIdHTTP's hoKeepOrigProtocol option, which is not set by default.
So, my POST request was "downgraded" to an HTTP 1.0 request, and I guess this made the (updated) server respond differently (I'm not an RFC expert, but I guess 'chunked' is maybe an HTTP 1.1-only option).
After setting this option, everything worked like before (and indeed, the response was now read as "chunked" in Delphi).
Summary:
Shouldn't hoKeepOrigProtocol be the default option? (why punish good citizens for those that are not...)
Can we intercept this? My POST now assumes a streamed response up front and thus hangs because the server doesn't write anything to the buffer.
What would that high-level code look like? It seems to be a mix of interpreting the response headers and then deciding whether more response reading is required.
(I didn't set anything specific regarding timeouts, either. I have the impression it hangs forever, or at least for more than 10 minutes...)
TIdHTTP supports non-chunked responses just fine (and yes, chunked transfer encoding is an HTTP 1.1-only feature), so the hanging would have to be caused by the server sending a malformed response (a bug that should be reported to the server author).
When reading a non-chunked and non-MIME response, TIdHTTP does not use IOHandler.ReadLn to read the response's body, as you claim; it uses ReadLn only when reading the response's headers.
But, since you did not show what the response actually looks like, nobody can explain for sure exactly why the hang occurs.
Shouldn't hoKeepOrigProtocol be the default option?
At the time the option was first introduced, no. There were enough buggy HTTP 1.1 servers around that downgrading to HTTP 1.0 was warranted.
However, that was many years ago. Nowadays, HTTP 1.1 is much more mature, and such buggy servers are rare. So, feel free to submit a change/pull request to Indy's GitHub repo if you feel the default behavior should be changed.
Can we intercept this?
No. The behavior you describe is most likely caused by a bug in the HTTP server. Either it is not sending all of the data it should be, or else the response is likely malformed in a way that makes TIdHTTP expect more data than is actually being sent. Either way, all you can do is assign a non-infinite timeout to TIdHTTP.
I didn't set anything specific regarding timeouts, either. I have the impression it hangs forever, or at least for more than 10 minutes.
Indy is designed to use infinite timeouts by default. You can assign custom timeouts to TIdHTTP's ConnectTimeout and ReadTimeout properties.
Setting this prevents the HTTP protocol downgrade:
IdHTTP.HTTPOptions := IdHTTP.HTTPOptions + [hoKeepOrigProtocol];
This is, of course, dependent upon how the server handles the protocol version, and whether or not that causes issues.

Real use of same origin policy

I just got to know about the same-origin policy in Web API. Enabling CORS helps to call a web service that lives in a different domain.
My understanding is that NOT enabling CORS only ensures that the web service cannot be called from a browser. But if I cannot call it from a browser, I can still call it in other ways, e.g. with Fiddler.
So I was wondering what the use of this functionality is. Can you please shed some light on it? Apologies if it's a trivial or stupid question.
Thanks and Regards,
Abhijit
It's not at all a stupid question; it's a very important aspect when you're dealing with web services from a different origin.
To get an idea of what CORS (Cross-Origin Resource Sharing) is, we have to start with the so-called Same-Origin Policy, which is a security concept for the web. It sounds sophisticated, but it only makes sure that a web browser permits scripts contained in one web page to access data on another web page if both pages have the same origin. In other words, requests for data must come from the same scheme, hostname, and port. If http://player.example tries to request data from http://content.example, the request will usually fail.
After taking a second look it becomes clear that this prevents the unauthorized leakage of data to a third-party server. Without this policy, a script could read, use and forward data hosted on any web page. Such cross-domain activity might be used to exploit cookies and authentication data. Therefore, this security mechanism is definitely needed.
If you want to store content on a different origin than the one the page requests it from, there is a solution: CORS. In the context of XMLHttpRequests, it defines a set of headers that allow the browser and server to communicate which requests are (and are not) permitted. It is a recommended standard of the W3C. In practice, for a CORS request, the server only needs to add the following header to its response:
Access-Control-Allow-Origin: *
For more information on settings (e.g. GET/POST, custom headers, authentication, etc.) and examples, refer to http://enable-cors.org.
For a more detailed read, see https://developer.mozilla.org/en/docs/Web/HTTP/Access_control_CORS
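To make the server side of this concrete, here is a minimal sketch of a filter that adds the header shown above to every response. It is written as a Java servlet filter purely for illustration (the question mentions Web API, but the idea is the same in any stack); the class name and the extra headers are my own.

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletResponse;

// Hypothetical filter: adds the CORS header described above to every response.
// "*" allows any origin to read the response.
public class CorsFilter implements Filter {

    @Override
    public void init(FilterConfig cfg) { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse httpRes = (HttpServletResponse) res;
        httpRes.setHeader("Access-Control-Allow-Origin", "*");
        // Optional extras that matter for preflight (OPTIONS) requests:
        httpRes.setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS");
        httpRes.setHeader("Access-Control-Allow-Headers", "Content-Type");
        chain.doFilter(req, res);
    }

    @Override
    public void destroy() { }
}

With "*" any origin may read the response; for anything involving credentials you would echo back a specific whitelisted origin instead.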

Controlling IIS BITS uploads

I'm running an IIS web site (built using ASP.NET/MVC) that, among other things, collects files from multiple agents that anonymously upload them via BITS.
I need to make sure that only files uploaded from known sources and matching a certain predefined file name pattern will be accepted by IIS. All other BITS upload attempts must be cancelled.
As I understand it, BITS uses an ad hoc protocol over HTTP 1.1 with a "BITS_POST" verb. So, ideally, I'd like to hook into IIS, analyze the BITS_POST request info, and if it does not satisfy my preconditions, drop the request.
I've tried to create and register a filter implementing IActionFilter.OnActionExecuting, but it seems that my filter does not receive BITS_POST requests.
I'd be glad to hear if somebody has implemented similar BITS-related solutions and how it was done. Anyway, other ideas are welcome too.
Regards,
Natan
I have never worked with BITS; frankly, I don't know what it is.
What I usually do in such situations is implement an HTTP module. In its BeginRequest event, you can inspect the incoming HTTP request data and decide to stop processing the request if the data does not comply with your requirements. You have full access to the HttpContext.Current.Request object from HTTP module code.
With HTTP modules, you can run .NET code very early in the request pipeline, before the request ever reaches your MVC actions.
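BITS and IIS aside, the same "inspect early and reject" idea can be sketched in other stacks too. Purely as an illustration (this is a Java servlet filter, not an ASP.NET HTTP module, and the class name, verb check, and file name pattern below are hypothetical):

import java.io.IOException;
import java.util.regex.Pattern;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical early-rejection filter: refuses upload requests whose target
// path does not match a predefined pattern, before any handler runs.
public class UploadGateFilter implements Filter {

    private static final Pattern ALLOWED_NAME = Pattern.compile(".*\\.(zip|log|dat)$");

    @Override
    public void init(FilterConfig cfg) { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpReq = (HttpServletRequest) req;
        HttpServletResponse httpRes = (HttpServletResponse) res;

        // Treat custom upload verbs and plain POSTs as uploads for this illustration.
        boolean isUpload = "BITS_POST".equalsIgnoreCase(httpReq.getMethod())
                || "POST".equalsIgnoreCase(httpReq.getMethod());
        if (isUpload && !ALLOWED_NAME.matcher(httpReq.getRequestURI()).matches()) {
            httpRes.sendError(HttpServletResponse.SC_FORBIDDEN);  // drop the request early
            return;
        }
        chain.doFilter(req, res);
    }

    @Override
    public void destroy() { }
}

The point is the same as the HTTP module approach above: examine the request as early as possible and reject it before any application code runs.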

Best way for uploading or downloading images in iOS: FTP vs HTTP

What is the best way to upload or download images in iOS?
In iOS I can upload and download images from a server via FTP. I have also seen many people use HTTP POST methods to upload or download images as NSData.
So which method is faster and more secure?
HTTP is the better choice because port 80 is almost always open while port 21 is often closed in business settings.
Neither is inherently faster or more secure for your iOS app. In general, FTP is not the most secure technology to be running on your server (SFTP is better), so many people prefer not to run FTP servers and therefore have to use HTTP for uploads (as Zaph says, FTP is often not even allowed through firewalls by default for this reason).
But using HTTP for uploads requires code on your server to handle the HTTP POST and put the files in the correct location. The fact that you are writing this code potentially makes it safer: you can validate the incoming data, make sure it is the right size and file type, and take account of any per-user bandwidth or storage limits.
You don't use HTTP POST to download images, but HTTP GET. That doesn't require anything special on the server; any HTTP server can serve them.
Unless you have a good reason not to, I'd suggest using HTTP. A good reason might be that you're integrating your app with an existing FTP service.

Best way to redirect image requests to a different webserver?

I am trying to reduce the load on my web servers by adding an "image server" (a dedicated server for handling image requests) and redirecting all requests for .gif, .jpg, .png, etc. to it.
My question is, what is the best way to handle the redirection?
At the firewall level? (can I do this using iptables?)
At the load balancer level? (can ldirectord handle this?)
At the apache level - using rewrite rules?
Thanks for any suggestions on the best way to do this.
--Update--
One thing I would add is that these are domains that are hosted for 3rd parties, so I can't expect all the developers to modify their code and point their images to another server.
The further up the chain you can do it, the better.
Ideally, do it at the DNS level by using a different domain for your images (e.g. imgs.example.com).
If you can afford it, get someone else to do it by using a CDN (Content delivery network).
-Update-
There are also two features of Apache's mod_rewrite that you might want to look at. They are both described well at http://httpd.apache.org/docs/1.3/misc/rewriteguide.html.
The first is under the heading "Dynamic Mirror" in the above document; it uses the mod_rewrite proxy flag [P]. This lets your server silently fetch files from another domain and return them.
The second is to just redirect the request to the new domain. This option puts less strain on your server, but requests still have to come in, and it slows down the final rendering of the page, as each image requires an essentially redundant round trip to your server before being fetched from the new domain.
I agree with rikh. If you want images to be served from a different web server, then serve them from a different web server. For example:
<IMG src="images/Brett.jpg">
becomes
<IMG src="http://brettnesbitt.akamia-technologies.com/images/Brett.jpg">
Any kind of load balancer will still feed the image through the web server's pipe, which is what you're trying to avoid.
I, of course, know what you really want. What you really want is for any request like:
GET images/Brett.jpg HTTP/1.1
to automatically get converted into:
HTTP/1.1 307 Temporary Redirect
Location: http://brettnesbitt.akamia-technologies.com/images/Brett.jpg
This way you don't have to do any work, except copy the images to the other web server.
That, I really don't know how to do.
By using the phrase "NAT", you imply that the firewall/router receives HTTP requests, and you want to forward the request to a different internal server if the HTTP request is for image files.
This then raises the question of what you're actually trying to save. No matter which internal web server services the HTTP request, the data still has to flow through the firewall/router's pipe.
The reason I bring it up is that the common scenario for wanting to serve images from a different server is to split high-bandwidth, mostly static, low-CPU-cost content off from the actual application logic.
Using NAT only to rewrite the packet and send it to a different server does nothing to address that common issue.
The other reason might be that images are not static content on your system, and a request to
GET images/Brett.jpg HTTP/1.1
actually builds an image on the fly, at a high CPU cost, or using data (e.g. a SQL Server database) only available to ServerB.
If this is the case, then I would still use a different server name in the image request:
GET http://www.brettsoft.com/default.aspx HTTP/1.1
GET http://imageserver.brettsoft.com/images/Brett.jpg HTTP/1.1
I understand what you're hoping for: network packet inspection that overrides the NAT rule and sends the request to another server. I've never seen anything that can do that.
It sounds more "proxy-ish", something a web proxy would do (pfSense and m0n0wall, for instance, can't do it).
Which then leads to a kind of solution we used once: a custom web server that analyzes the request, makes the appropriate request to some internal server, and writes the binary response back to the client.
That pain-in-the-ass solution was insisted upon by a "security consultant", who apparently believes in security through obscurity.
I know IIS cannot do such things for you itself; I don't know about other web server products.
I just asked around, and apparently if you wanted to write a custom kernel module for your Linux-based router, you could have it inspect packets and take the appropriate action. Such a module might already exist; there are, apparently, plenty of other open-source modules to use as a starting point.
But I'd rather shoot myself in the head.
