POST with TIdHTTP hangs on retrieving the JSON response

This question is perhaps more of a tip for people searching for a solution to the same problem (I eventually found the solution myself).
I had an application that makes HTTP requests to a local server (a mix of GET/POST with JSON content in the request/response bodies). The server is a third-party application, and after I upgraded it to a recent version, my Delphi app was no longer working.
It turned out that it was now hanging on the statement:
IdHTTP.Post(URL, Payload, BytesStreamResult);
As a manual Postman request was still working, the problem had to be on the Delphi client side.
Further isolating the issue showed that the HTTP POST request did get an HTTP 200 response with valid HTTP response headers, but then was getting stuck reading the response body. It was hanging on:
IOHandler.ReadLn
When I compared the headers with the Postman response, I noticed that 'Transfer-Encoding: chunked' was missing in the Delphi response.
Finally, I noticed the code related to TIdHTTP's hoKeepOrigProtocol option, which is not set by default.
So, my POST request was "downgraded" to an HTTP 1.0 request, and I guess this made the (updated) server respond differently (I'm not an RFC expert, but I believe chunked transfer encoding is an HTTP 1.1-only feature).
After setting this option, everything worked like before (and indeed, the response was now read as "chunked" in Delphi).
Summary:
Shouldn't hoKeepOrigProtocol be the default option? (why punish good citizens for those that are not...)
Can we intercept this? Right now my POST assumes a streamed response upfront, and thus it hangs because the server doesn't write anything to the buffer.
What would that high-level code look like? It seems to be a mix of interpreting the response headers and then deciding whether more response reading is required.
(I didn't do anything specific regarding time-outs, either. I have the impression it hangs forever, or at least > 10 minutes...)

TIdHTTP supports chunked responses (which, yes, are an HTTP 1.1 feature), so the hanging would have to be caused by the server sending a malformed response (a bug that should be reported to the server author).
When reading a non-chunked, non-MIME response, TIdHTTP does not use IOHandler.ReadLn to read the response's body, as you claim; it uses ReadLn only when reading the response's headers.
But since you did not show what the response actually looks like, nobody can explain for sure exactly why the hang occurs.
Shouldn't hoKeepOrigProtocol be the default option?
At the time the option was first introduced, no. There were enough buggy HTTP 1.1 servers around that downgrading to HTTP 1.0 was warranted.
However, that was many years ago. Nowadays, HTTP 1.1 is much more mature, and such buggy servers are rare. So, feel free to submit a change/pull request to Indy's GitHub repo if you feel the default behavior should be changed.
Can we intercept this?
No. The behavior you describe is most likely caused by a bug in the HTTP server. Either it is not sending all of the data it should be, or else the response is likely malformed in a way that makes TIdHTTP expect more data than is actually being sent. Either way, all you can do is assign a non-infinite timeout to TIdHTTP.
I didn't do anything specific regarding time-outs, either. I have the impression it hangs forever, or at least > 10 minutes.
Indy is designed to use infinite timeouts by default. You can assign custom timeouts to TIdHTTP's ConnectTimeout and ReadTimeout properties.
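For example, a minimal sketch (the millisecond values are arbitrary; tune them to your environment):
IdHTTP.ConnectTimeout := 5000;  // give up connecting after 5 seconds
IdHTTP.ReadTimeout := 30000;    // give up after 30 seconds of silence from the server
With a ReadTimeout set, a hung read raises an exception (EIdReadTimeout) that you can catch, instead of blocking forever.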

Setting this prevents the HTTP protocol downgrade:
IdHTTP.HTTPOptions := IdHTTP.HTTPOptions + [hoKeepOrigProtocol];
This is, of course, dependent upon how the server handles the protocol version specified, and whether or not that causes issues.

Related

Why is GZIP Compression of a Request Body during a POST method uncommon?

I was playing around with GZIP compression recently, and the way I understand it is the following:
The client requests some files or data from a web server. The client also sends a header that says "Accept-Encoding: gzip".
The web server retrieves the files or data, compresses them, and sends them back GZIP-compressed to the client. The web server also sends a "Content-Encoding: gzip" header to tell the client that the data is compressed.
The client then decompresses the data/files and loads them for the user.
I understand that this is common practice, and it makes a ton of sense when you need to load a page that requires a ton of HTML, CSS, and JavaScript, which can be relatively large, and add to your browser's loading time.
However, I was trying to look further into this: why is it not common to GZIP-compress a request body when doing a POST call? Is it because request bodies are usually small, so the time it takes to decompress them on the web server is longer than the time it takes simply to send the request? Is there some sort of document or reference I can have about this?
Thanks!
It's uncommon because in a client-server relationship, the server sends all the data to the client, and as you mentioned, the data coming from the client tends to be small, so compression rarely brings any performance gains.
In a REST API, I would say that big request payloads are common, but apparently Spring Framework, known for its REST tools, disagrees: they explicitly say in their docs here that you can set the servlet container to do response compression, with no mention of request compression. As Spring Framework's mode of operation is to provide functionality they think lots of people will use, they obviously didn't feel it worthwhile to provide a ServletFilter implementation that we users could employ to read compressed request bodies.
It would be interesting to trawl the user mailing lists of Tomcat, Struts, Jackson, Gson, etc. for similar discussions.
If you want to write your own decompression filter, try reading this: How to decode Gzip compressed request body in Spring MVC
Alternatively, put your servlet container behind a web server that offers more functionality. People obviously do need request compression enough that web servers such as Apache offer it - this SO answer summarises it well already: HTTP request compression - you'll find the reference to the HTTP spec there too.
Very old question, but I decided to resurrect it because it was my first Google result and I feel the only existing answer is incomplete.
HTTP request compression is uncommon because the client can't be sure the server supports it.
When the server sends a response, it can use the Accept-Encoding header from the client's request to see if the client would understand a gzipped response.
When the client sends a request, it can be the first HTTP communication so there is nothing to tell the client that the server would understand a gzipped request. The client can still do so, but it's a gamble.
Although very few modern HTTP servers would not know gzip, the configuration to apply it to request bodies is still very uncommon. At least on nginx, it looks like custom Lua scripting is required to get it working.
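To make the "gamble" concrete, here is a rough Delphi/Indy sketch of gzip-compressing a POST body before sending it. Assumptions to verify: the TZCompressionStream overload that takes window bits exists in your Delphi version (15 + 16 selects gzip framing, per zlib convention), and the server is explicitly configured to accept Content-Encoding on requests:
uses
  System.Classes, System.SysUtils, System.ZLib, IdHTTP;

procedure PostGzipped(AHttp: TIdHTTP; const AUrl, AJsonBody: string);
var
  Raw, Zipped: TMemoryStream;
  Compressor: TZCompressionStream;
  Bytes: TBytes;
begin
  Raw := TMemoryStream.Create;
  Zipped := TMemoryStream.Create;
  try
    // Put the UTF-8 payload into a stream.
    Bytes := TEncoding.UTF8.GetBytes(AJsonBody);
    if Length(Bytes) > 0 then
      Raw.WriteBuffer(Bytes[0], Length(Bytes));
    Raw.Position := 0;
    // 15 + 16 window bits asks zlib for a gzip wrapper (assumption noted above).
    Compressor := TZCompressionStream.Create(Zipped, zcDefault, 15 + 16);
    try
      Compressor.CopyFrom(Raw, 0); // Count = 0 copies the whole stream
    finally
      Compressor.Free; // freeing flushes the gzip trailer
    end;
    Zipped.Position := 0;
    // Label the body so the server knows it must decompress it first.
    AHttp.Request.ContentEncoding := 'gzip';
    AHttp.Post(AUrl, Zipped);
  finally
    Zipped.Free;
    Raw.Free;
  end;
end;
If the server is not configured for it, expect an error response rather than a transparent fallback, which is exactly why clients rarely take the gamble.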
Don't do it, for no other reason than security. Firewalls have a hard (or impossible) time inspecting compressed input data.

Delphi - Connecting and logging in to a webpage

EDIT
There has been quite a development. The current problem is this:
I compared requests sent from a browser and sent from my app. There have been some differences and I managed to correct most of them. Some are still unfixed, since I haven't figured out how yet. I am using Indy.
How can I send (or add) cookies in the request?
I tried this: IdHTTP.CookieManager.AddCookie('bakatheme=BrectanTheme', IdHTTP1.URL) but it doesn't work. Also, the Indy help says it is supposed to be AddCookie(String, String), but my Delphi only accepts (String, TIdURI) - I am not sure if it is the right URI I am passing.
In the headers I have this code: AcceptEncoding:='gzip,deflate,sdch'; yet when I parse the outgoing request, it states this: Accept-Encoding: gzip,deflate,sdch,identity - but I am certain I don't have "identity" anywhere in the code.
Those are the two things in which my request differs from the browser's. Now I am getting a 500 Internal Server Error in return; can it be caused by the lack of cookies, or by the second thing?
Thank you very much.
I haven't exactly tried it myself, but here's an example I found about website login using Indy:
http://www.ciuly.com/delphi/indy/persistent-login-example-for-geocacheing-no-ssl/
OK, let's comment:
How can i send (or add) cookies into the request?
You should not do that. Indy handles this for you (but if you really want to, there is a TIdCookieManager). However, it seems to me that you don't know how cookies work. A cookie is not something you just add to a request: it comes from the server, and it identifies you.
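In practice, that means wiring up Indy's automatic handling once, along the lines of this sketch (component names and the login URL are illustrative, and LoginParams stands for a TStrings of form fields):
// Attach a cookie manager; Indy then stores cookies from responses
// and sends them back automatically on subsequent requests.
IdCookieManager1 := TIdCookieManager.Create(IdHTTP1);
IdHTTP1.CookieManager := IdCookieManager1;
IdHTTP1.AllowCookies := True;
// The login response sets the session cookie; later requests to the
// same site reuse it without any manual AddCookie calls.
IdHTTP1.Post('http://example.com/login.php', LoginParams);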
In the Headers I have this code: AcceptEncoding:='gzip,deflate,sdch';
Accept-Encoding tells the server that it may compress the response using those algorithms. Indy supports gzip, deflate, sdch and identity, and Indy updates the request header to add the one you left out.
You should take a look at these links to learn how HTTP works:
W3
Wikipedia

What does Indy's HandleRedirect do?

I'm having some trouble reading files with Indy from a site that has WordPress installed.
It appears that the site is configured to redirect all hits to sitename/com/wordpress.
Can I use HandleRedirects to turn that off so I can read files from the root folder?
What is the normal setting for this property? Any downsides to using it for this purpose?
(Edit: it appears that my problem may be caused by Windows caching a file I've accessed before through Indy. I'm using fIDHTTP.Request.CacheControl := 'no-cache'; - is that adequate?)
When the server sends a 3xx result for a request, the HandleRedirects property controls whether Indy will immediately turn around and issue a new request using the new location. The alternative is that Indy will return the response code to your program. You're welcome to handle it yourself with the OnRedirect event, but if the server bothers to send anything in addition to the response code, it's unlikely to be of much use to your program. It's not as though there are hidden files that the redirection is preventing you from downloading. Set the property to true and let Indy take care of the redirection for you.
It's probably not the case that Windows is caching anything for your program. Indy doesn't use the OS cache. The Cache-Control header is an instruction to a proxy or the so-called origin server that it should not satisfy your request with a cached response without validating it with the origin server. Maybe WordPress has a cache of its own that you're bypassing.
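Put together, the advice from both paragraphs looks something like this sketch (the URL and the Content string variable are illustrative):
IdHTTP1.HandleRedirects := True;            // follow 3xx redirects automatically
IdHTTP1.RedirectMaximum := 5;               // guard against redirect loops
IdHTTP1.Request.CacheControl := 'no-cache'; // ask caches/proxies to revalidate
Content := IdHTTP1.Get('http://example.com/somefile.txt');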

HttpClient automatically retries before the response is received from the server

I've come upon a weird problem with the Java HttpClient library. Specifically, the library automatically retries my request (POST requests) even before the response is received from the server. Moreover, the weirder part is that this only happens on specific hosts (machines).
The end result is that if a POST request succeeds, an identical POST request may then arrive at the server, which the server can't handle. Now, I do want the retry behavior, but it should behave intuitively.
Has anyone faced this kind of problem before, or is there a way to configure HttpClient to wait for a specific time before retrying? I'm not sure what's going wrong here.
Do you have a method retry handler set for your HttpClient? As in:
// HttpClient 3.x: retry up to 10 times; "true" also retries requests that were already fully sent
DefaultHttpMethodRetryHandler retryhandler = new DefaultHttpMethodRetryHandler(10, true);
client.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryhandler);
That is where retries would originate and you could debug and see what response headers it's receiving if any, etc.
Have you tried using a Firefox HTTP monitor, or Ethereal or similar, to look through your HTTP requests and responses and ensure that what you believe is happening is actually happening?

Sending custom HTTP error information to Flash, JavaScript, etc

I'm developing a REST API at the moment, and one of its core features is that it uses a variety of HTTP status codes to return status/error information, some of which may be extended information (e.g. if an item is not found, some other similar items) that will be in the response body.
This is fine until you get to 'crippled' clients like Flash and JavaScript which can't access the response body or headers unless the HTTP status code is 200 OK (even a 201 Created success code can cause Flash to fail thinking it's an error).
So my question is, is there a standard way for allowing this type of client to request that all status codes are HTTP 200, and to indicate the real status code in another way?
One solution I was thinking of is, in the pattern of the HTTP Accept-* family of headers, using an X-Accept-Status extension header to specify which status codes can be handled, e.g. Flash would send...
X-Accept-Status: 200
...and then any status code not in this list would be mapped to one that is, and the error returned in the response body, possibly with another extension header indicating the real status code, e.g.
X-HTTP-Status-Code: 404 Not Found
This all seems a bit horrible, and working against the protocol, but if you have clients that cannot use the protocol properly then that's unavoidable. I'm just looking for something a bit like X-HTTP-Method-Override (which is a 'standard' way of working around the protocol for clients that cannot send PUT/DELETE requests) but for clients that cannot understand status codes.
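For illustration, a hypothetical exchange under this scheme might look like the following (both X- headers are the proposed extensions from above, not existing standards):
Request:
GET /items/42 HTTP/1.1
X-Accept-Status: 200

Response:
HTTP/1.1 200 OK
X-HTTP-Status-Code: 404 Not Found
Content-Type: application/json

{"error": "item not found", "similar": [41, 43]}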
Well, actually the problem with HTTP and REST is that REST is a really good idea and HTTP describes a really good implementation of it... but really, many clients and servers only implement part of HTTP...
I don't think HTTP is a must... still, REST is a good idea, and RESTfulness of a system is a powerful property... so why not use HTTP as a dumb transport layer for a RESTful system?
This is what you are doing, although in my opinion you are holding on a bit too much to HTTP and all its theoretically built-in features... do you really need to transport the information in a status code?
Don't depend so much on your transport protocol/layer... have a clear idea in mind of how your service should work... separate the protocol semantics from its implementation... on both client and server... abstract your RESTfulness and status codes too (make them more than just integers... make them enums, or objects... exceptions, why not?)...
And then plug in protocols/transport layers at will...
Make a standard HTTP implementation.
Make a hacky one, using the solution you described (which to me seems perfectly valid... if people are using technologies unable to use the standards, why should you bother too much with finding the most standards-conformant solution).
Make whatever you have the time to do and your server is able to do: binary, JSON, XML... whatever seems adequate...
Two technical notes, though:
Flash Player does its HTTP traffic through the browser... and it simply does not get the status codes from the browser... well, it depends on the browser, in fact... the specs say it does not work for: "Netscape, Mozilla, Safari, Opera, and Internet Explorer for the Macintosh."... so IE for Windows should be working? Chrome? I don't know... but I think it doesn't matter, since obviously you cannot rely on it... oh, and to state the most obvious: JavaScript also does its HTTP through the browser, of course... so same problem here...
For both, this implies that if you did succeed in finding something like X-HTTP-Method-Override for responses that was built into the protocol, a good browser would understand it and remap things accordingly before deciding which information to give to JavaScript or third-party plugins... so you'd end up with nothing again... I guess...
You should simply choose your response method based on the client... and maybe the client should send some extra info if it is unable to use the HTTP standard... otherwise, throw at it whatever follows the standard... I'd first make an implementation using standard HTTP, yet hiding the HTTP itself away, and once everything works, write one using...
greetz
back2dos
Am I wrong for thinking that one shouldn't let a crippled out-of-the-box client of the API dictate the features of the API implementation? I guess practical considerations win the day, but in general my vote is in favor of building API implementations "properly" and requiring custom client-side programming as needed.
A bit late for a response, but...
When I implemented a Flash client API with an early version of OpenRasta, I had an X-ResponseLine header that contained the response code and text, on each outgoing response.
As custom headers are by default only generic headers, they have no involvement in caching, so there is no reason to have an Accept / Vary on this.
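In other words, every response kept a client-friendly status line while carrying the real status in a custom header, roughly like this (the exact value format is a guess from the description above):
HTTP/1.1 200 OK
X-ResponseLine: 404 Not Found
Content-Type: application/json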
