iOS/Foundation URLSession upload infinite-loops

I'm going to ask this first without full listings and logs, because it feels like the sort of thing people might recognize generically from their own work.
iOS 16 simulator, Xcode 14.2.
Aim
I want to upload a file to a REST server. I'm using URLSessionUploadTask. HTTP Basic authentication appears to go through (once I provide Basic credentials, URLSession stops asking).
I can assume the bytes are getting there: my task delegate's urlSession(_:task:didSendBodyData:... is called with the right number of bytes, equal to the expected number. I assume that's not a count of what was cast into the net, but the product of some client/server acknowledgment.
The minor oddity is that my logs show first the server-trust auth challenge, then the did-send, and only then HTTP Basic.
∞ Loop
The major odd thing is:
didReceive challenge: NSURLAuthenticationMethodServerTrust
didSend: 296 / 296
didReceive challenge: NSURLAuthenticationMethodHTTPBasic
didReceive challenge: NSURLAuthenticationMethodServerTrust
didSend: 592 / 592
didReceive challenge: NSURLAuthenticationMethodHTTPBasic
... and so on, ad infinitum, accumulating in multiples of the total data. I admit I haven't checked the arithmetic on the payload size; I suspect the count is not necessarily common-sensical. If you have experience in which the exact counts are critical, I'm glad to hear from you.
The delegate methods for end-of-transfer, success or failure, are never called. The closure argument for URLSession.shared.dataTask... is never called.
The server's listing page does not show the file present.
Supplement: Multipart
Content-Type: multipart/form-data; boundary=593FBDC3-7A99-415D-B6B4-3F553CB6C9C2
--Boundary-593FBDC3-7A99-415D-B6B4-3F553CB6C9C2
Content-Disposition: form-data; name="file"; filename="InputSample.zip"
Content-Type: application/zip
0123456
--Boundary-593FBDC3-7A99-415D-B6B4-3F553CB6C9C2--
The line breaks I intend are \r\n, per the general standard. "0123456" is part of this package as a Data containing that string. I wonder if promising .zip content without actual .zip-formatted data is a problem. I hadn't thought J. Random Apache would be that "helpful."
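For concreteness, here is a minimal sketch of how such a body gets assembled (boundary shortened, URL illustrative; note the same boundary string must appear in the Content-Type header and, prefixed with "--", in the body):

```swift
import Foundation

// Minimal sketch of assembling the multipart body; every line break
// is an explicit \r\n, per RFC 2046.
let boundary = "Boundary-593FBDC3"
var body = Data()

func append(_ string: String, to data: inout Data) {
    data.append(string.data(using: .utf8)!)
}

append("--\(boundary)\r\n", to: &body)
append("Content-Disposition: form-data; name=\"file\"; filename=\"InputSample.zip\"\r\n", to: &body)
append("Content-Type: application/zip\r\n\r\n", to: &body)
append("0123456", to: &body)                 // placeholder standing in for real zip bytes
append("\r\n--\(boundary)--\r\n", to: &body)

var request = URLRequest(url: URL(string: "https://example.com/upload")!)
request.httpMethod = "POST"
request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
```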
Oh, and:
My upload task calls .resume() once and only once. Instruments shows no hotspot or deep stack in my code, which I'd expect in a coded infinite loop.

In terms of why you are seeing a request retried: that is a product of the authentication challenge system. When URLSession receives a challenge and we provide authentication credentials, it automatically retries the request with the supplied credentials. It will do this for every authentication challenge unless we explicitly tell it to cancel.
So, that having been said, I have two observations:
Given the ping-ponging between different NSURLAuthenticationMethod values, I suspect there is a disconnect between the authentication scheme the client challenge handler prepared and the one the server requested.
The infinite loop is likely the result of the client challenge handler not disambiguating between an initial challenge (in which case you should supply credentials) and the server rejecting the previously supplied credentials (in which case you should cancel and report the error in the UI).
To answer this definitively, we would need to see:
the client authentication challenge handler code; and
details about which type of authentication has actually been set up on your server.
But one could easily end up with the pattern you describe if the client ignores the specific authentication type being requested (presumably not “basic”), but proceeds to supply “basic” authentication credentials nonetheless.
Make sure the client’s challenge handler is looking at the protectionSpace, identifying what sort of challenge it is receiving, and that it prepares an authentication response of the appropriate authentication scheme.
Also, you will want to differentiate between a challenge that is requesting credentials for authentication (the first time the delegate method is called) and an authentication that failed (a second, subsequent call to the delegate method). We should provide credentials in response to the former, but cancel in the case of the latter.
See Handling an Authentication Challenge.
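A sketch of a delegate that follows both points, assuming Basic really is what the server wants (the username and password are placeholders):

```swift
import Foundation

// Sketch of a challenge handler that (1) matches the credential to the
// challenge's authentication method and (2) cancels instead of looping
// when previously supplied credentials were rejected.
final class UploadDelegate: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    didReceive challenge: URLAuthenticationChallenge,
                    completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
        // A repeat challenge for the same protection space means the server
        // rejected what we sent last time: stop instead of retrying forever.
        guard challenge.previousFailureCount == 0 else {
            completionHandler(.cancelAuthenticationChallenge, nil)
            return
        }
        switch challenge.protectionSpace.authenticationMethod {
        case NSURLAuthenticationMethodServerTrust:
            // Accept the system's default TLS evaluation.
            completionHandler(.performDefaultHandling, nil)
        case NSURLAuthenticationMethodHTTPBasic:
            let credential = URLCredential(user: "user",        // placeholder
                                           password: "password", // placeholder
                                           persistence: .forSession)
            completionHandler(.useCredential, credential)
        default:
            // The server asked for a scheme we didn't prepare for
            // (e.g. Digest or NTLM); don't answer it with Basic.
            completionHandler(.performDefaultHandling, nil)
        }
    }
}
```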

Related

Golang http automatically retrying on StatusRequestTimeout (408) response

Setup
I have two different applications, both written in Go. The first is the server, and second is a smaller app that makes calls to the server. They use the http package for making calls and the router package for setting up endpoints.
The Problem
When the device makes a specific call to the server, a 408 (StatusRequestTimeout) response is returned. This response is not due to our server actually timing out; it is just used to describe the error (more on this below). The first time the device makes this call it receives the 408 and proceeds normally. However, if the same call is made again, a new 'third' call is sent to the server immediately after the second call has finished. This third call is identical to the first two. There is no HTTP retry logic enabled for this call.
The bug
Why is this third call being issued? When the code is updated to return a 400 status instead of a 408, this third call is no longer made. Additionally, changing other calls to return a 408 instead of a 400 makes them exhibit the same behavior of sending a duplicate third call. I have been unable to find documentation that explains this behavior, or other articles that describe it.
Hunch
I have found many articles like this one indicating that browsers will sometimes retry requests. Some other Stack Overflow posts like this one indicate that an http request is not retried unless you set up your own retry logic. Again, we have set this up, but it is not enabled for this call, and debugging shows that we never enter our custom retry logic.
I believe this is a Chromium feature. I've tried to replicate it with Firefox without success; Edge, however, exhibits the same behavior. Chrome's dev tools (and Edge's), though, show only two network calls: the first and the third. It could also be the http library, but it is very strange that the behavior differs between browsers.
Bug Fix
Given the nature of what a 408 response is supposed to entail I have decided to move away from using them for custom error responses. At this point, I'm just more curious about why the behavior is as it is, if my hunch is correct, or if there is something else at play.
Let's start with the method is408Message(), which is here. It checks whether the buffer carries a 408 Request Timeout status code. This method is used by another method to inspect the response from the server; in case of a 408 Request Timeout, the persistConn is closed with an errServerClosedIdle error. The error is assigned to the persistConn.closed field.
In the main loop of the http Transport, there is a call to persistConn.roundTrip here, which returns as its error the value stored in the persistConn.closed field. A few lines below you can find a method called pconn.shouldRetryRequest, which takes the error returned by persistConn.roundTrip as an argument and returns true when the error is errServerClosedIdle. Since the whole operation is wrapped in a for loop, the request will be sent again.
It could be valuable for you to analyze the shouldRetryRequest method, because multiple conditions must be met for the request to be retried. For example, the request will not be repeated when the connection was being used for the first time.
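A small harness (illustrative, not the app's real code) makes the mechanism observable: count every request the server actually receives and compare it with the calls the client thinks it made. Whether a duplicate appears depends on the connection-reuse race described above, so the two counts may or may not diverge on a given run; the client-visible statuses stay 408 either way.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// demo408 issues two GETs against a server that always answers 408
// as an application-level error. The server-side counter records every
// request that actually arrives; a transparent Transport retry would
// make it exceed the number of client-visible calls.
func demo408() (statuses []int, hits int64) {
	var n int64
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&n, 1)
		w.WriteHeader(http.StatusRequestTimeout) // 408 used as a custom error response
	}))
	defer srv.Close()

	client := srv.Client()
	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			continue
		}
		resp.Body.Close()
		statuses = append(statuses, resp.StatusCode)
	}
	return statuses, atomic.LoadInt64(&n)
}

func main() {
	statuses, hits := demo408()
	fmt.Println("client saw:", statuses)
	fmt.Println("server saw:", hits, "requests")
}
```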

Can one differentiate cases when returning an HTTP 422 in a REST API?

I am developing a REST API in Rails.
The API returns an HTTP 422 unprocessable entity with error messages when model validations fail.
However, a model can have several validations, and I want to delegate translation of the error messages to the API consumer; that is why the consumer needs to differentiate the specific cause of the server's 422.
I was thinking about using subcodes, just like Facebook does in its API. Is there a way to do this keeping the REST practices?
Also, what does one do when an error 422 occurs for multiple causes at the same time?
RFC 7231
Client Error 4xx
Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
Normally, you should encode information that is specific to your domain in the message-body of the response. The status line and response headers are there for generic components (browsers, caches, proxies) to have a coarse understanding of what is going on.
The Problem Details specification lays out the concern rather well.
consider a response that indicates that the client's account doesn't have enough credit. The 403 Forbidden status code might be deemed most appropriate to use, as it will inform HTTP generic software (such as client libraries, caches, and proxies) of the general semantics of the response.
However, that doesn't give the API client enough information about why the request was forbidden, the applicable account balance, or how to correct the problem. If these details are included in the response body in a machine-readable format, the client can treat it appropriately; for example, triggering a transfer of more credit into the account.
I don't promise that Problem Details is well suited for your purposes; but as prior art it should help you to recognize that the information you want to communicate belongs in the body of the response, with a suitable Content-Type header to inform the consumers which processing logic they need to use.
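As a sketch of what that could look like (the "type" URI and error codes here are illustrative, not a fixed vocabulary), one RFC 7807-style body can carry an entry per failed validation, which also answers the multiple-causes question:

```ruby
require "json"

# Hypothetical builder: one Problem Details body listing every failed
# validation, so the consumer can translate each cause itself.
def problem_details(failures)
  {
    "type"   => "https://api.example.com/probs/validation-error", # illustrative URI
    "title"  => "Validation failed",
    "status" => 422,
    "errors" => failures.map { |field, code| { "field" => field, "code" => code } }
  }
end

body = problem_details([["email", "blank"], ["name", "too_short"]])
puts JSON.pretty_generate(body)
```

Served with status 422 and Content-Type: application/problem+json, each errors[].code acts as a stable machine-readable key the consumer maps to its own translations.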

How can I send response headers before response body in Rails

Is it possible to generate response headers and send them back to the client without the body?
send_headers
do stuff
render body
No, you cannot respond with headers to the client, perform an operation, and then respond with a body. (I'm not positive this is what you are asking)
If you want to respond to the client and then perform some operation, you could use a background processor, like Sidekiq, to perform the logic after responding to the user but you won't be able to respond again with a body.
To answer the question directly, the headers are a part of the response, so unless you were sending a HEAD request which would only return the headers, you're stuck waiting for the entire response to come back.
To answer the question of long timeouts, there is a common pattern used for dealing with long requests which involves connection polling and the 202 Accepted response code.
You should engineer an endpoint solution that sends a 202 Accepted response straight away and sets the processing chain in motion. With that, you can create a resource which can give a useful estimate of how long the request will take, and where the result will be, and send that in the body of the response.
Your ultimate goal should be to figure out why the request takes so long, but if it is ultimately designed to be a long and arduous response, either due to I/O or CPU time needed, or if it's a business requirement, then using 202 Accepted and setting up a form of connection polling would be your best option.
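The pattern above can be sketched as a bare Rack-style handler (names and routes are illustrative; in Rails the enqueue would be a Sidekiq or ActiveJob call): enqueue the slow work, then answer immediately with a pointer to a status resource the client can poll.

```ruby
require "securerandom"
require "json"

# Hypothetical 202 Accepted handler: hand the work off, respond at once
# with a Location the client can poll for the eventual result.
def accept_long_request(enqueue)
  job_id = SecureRandom.uuid
  enqueue.call(job_id) # background processor takes over from here
  [
    202,
    { "Content-Type" => "application/json", "Location" => "/jobs/#{job_id}" },
    [JSON.generate("job" => job_id, "status" => "queued", "check_after_ms" => 500)]
  ]
end

queued = []
status, headers, _body = accept_long_request(->(id) { queued << id })
puts status              # 202
puts headers["Location"]
```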

YAWS webserver - how to know if successful download?

I let people download files using HTTP-GET from Yaws. I have implemented as it is done in yaws_appmod_dav.erl, and it works fine.
case file:read(Fd, PPS) of
    {ok, Data} when size(Data) < PPS ->
        ?DEBUG("only chunk~n"),
        status(200, H, {content, Mimetype, Data});
    {ok, Data} ->
        ?DEBUG("first chunk~n"),
        spawn(fun() -> deliver_rest(Pid, Fd) end),
        status(200, H, {streamcontent, Mimetype, Data});
    eof ->
        status(200, {content, "application/octet-stream", <<>>});
    {error, Reason} ->
        Response = [{'D:error', [{'xmlns:D', "DAV:"}], [Reason]}],
        status(500, {xml, Response})
end;
I would like to mark a successful download on the server, i.e. when the client has accepted the last package.
How do I do that?
A minor question: in the WebDAV appmod for Yaws, yaws_api:stream_chunk_deliver is used instead of yaws_api:stream_chunk_deliver_blocking when getting a file. (See line 449 in https://github.com/klacke/yaws/blob/master/src/yaws_appmod_dav.erl)
Why isn't this a problem? According to http://yaws.hyber.org/stream.yaws, "Whenever the producer of the stream is faster than the consumer, that is the WWW client, we must use a synchronous version of the code." I notice that both versions work fine; is it just the amount of memory on the server that is affected?
The HTTP protocol doesn't specify a way for the client to notify the server that a download has been successful. The client either gets the requested data confirmed by the result code 200 (or 206) or it doesn't, and in that case it gets one of the error codes. Then the client is free to re-request that data. So, there isn't a reliable way of achieving what you want.
You could record the fact that the last chunk of data has been sent to the client and assume that it has been successful, unless the client re-requests that data, in which case you can invalidate the previous assumption.
Also, please note that the HTTP specification allows the client to request any part of the data by sending a GET request with the Range header. See an example in this fusefs-httpfs implementation and some more info in this SO post. How can you determine whether the download has been successful when you don't know which Range-based GET request is the last one? (The client may, for example, download the whole file in chunks in backward order.)
This may also answer your minor question. The client controls the flow by requesting a specified range of bytes from the given file. I don't know the implementation of the WebDAV protocol, but it's possible that it doesn't request the whole file at once, so the server can deliver data in chunks and never overflow the client.
The HTTP Range header is separate from the TCP window size, which operates at the TCP protocol level (HTTP is an application-level protocol implemented on top of TCP). Say the client requested the whole file and the server sends it like that. That doesn't mean the whole file has been sent through the network yet: the data to be sent is buffered in the kernel and sent in chunks according to the TCP window size. Had the client requested only part of the data with the Range header, only that part would be buffered in the kernel.
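A self-contained demo of the Range mechanics (in Ruby for brevity, stdlib only; the file is a 26-byte stand-in): the client asks for an arbitrary slice from the middle, so "we sent the final bytes" never proves the client has the whole file.

```ruby
require "net/http"
require "socket"

DATA = ("a".."z").to_a.join # 26-byte stand-in for the file being served

# One-shot server: parse the Range header, answer 206 with just that slice.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

serving = Thread.new do
  conn = server.accept
  request = +""
  while (line = conn.gets) && line != "\r\n"
    request << line
  end
  m = request.match(/Range: bytes=(\d+)-(\d+)/i)
  slice = DATA[m[1].to_i..m[2].to_i]
  conn.write("HTTP/1.1 206 Partial Content\r\n" \
             "Content-Range: bytes #{m[1]}-#{m[2]}/#{DATA.size}\r\n" \
             "Content-Length: #{slice.bytesize}\r\n" \
             "Connection: close\r\n\r\n" + slice)
  conn.close
end

response = Net::HTTP.start("127.0.0.1", port) do |http|
  req = Net::HTTP::Get.new("/file")
  req["Range"] = "bytes=3-9" # a slice from the middle, out of order
  http.request(req)
end
serving.join

puts response.code # "206"
puts response.body # "defghij"
```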

HTTP disconnect/timeout between request and response handling

Assume the following scenario:
- Client sends an HTTP POST to the server.
- The request is valid and has been processed by the server; data has been inserted into the database.
- The web application is responding to the client.
- The client hits a timeout and never sees the HTTP response.
In this case we meet a situation where:
- the client does not know whether its data was valid and was inserted properly
- the web server (a Rails 3.2 application) does not raise any exception, whether or not it is behind an Apache proxy
I can't find how to handle such a scenario in the HTTP documentation. My questions are:
a) Should the client expect that its data MAY already have been processed? (And then try, for example, a GET request to check whether the data was submitted.)
b) If not (a), should the server detect it? Is there a possibility to do this in Rails? In that case the changes could be reversed. I would expect some kind of exception from the Rails application in this situation, but there is none...
HTTP is a stateless protocol, which means that by definition the client cannot know whether the POST succeeded or not.
There are some techniques that web applications use to overcome this HTTP 'feature'. They include:
server side sessions
cookies
hidden variables within the form
However, none of these are really going to help with your issue. When I have run into these types of issues in the past they are almost always the result of the server taking too long to process the web request.
There is a really great quote that I whisper to myself on sleepless nights:
“The web request is a scary place, you want to get in and out as quick
as you can” - Rick Branson
You want to get into and out of your web request in 100-500 ms. Meet those numbers and you will have a web application that behaves well and plays well with web servers.
To that end, I would suggest that you investigate how long your POSTs are taking and figure out how to shorten those requests. If you are doing serious processing on the server side before your database inserts, you should consider handing that work off to some sort of tasking/queuing system.
An example of 'serious processing' could be some sort of image upload, possibly with some image processing after the upload.
An example of a tasking and queuing solution would be: RabbitMQ and Celery
An example solution to your problem could be:
insert a portion of your data into the DBMS (or, even faster, some NoSQL solution)
hand off the expensive processing to a background task
return to the user/web-client, even though the task is still running in the background
listen for the final response via polling, streaming, or websockets; this step is not a trivial undertaking, but the end result is well worth the effort
Tighten up those web requests and it will be a rare day that your client does not receive a response.
On that rare day when the client does not receive the data: how do you prevent multiple POSTs? I don't know anything about your data, but there are some schema-related things you can do to uniquely identify each POST, i.e. figure out on the server side whether the data is an update or a create.
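One common shape for that schema-level fix is an idempotency key (sketch below; the class and key names are hypothetical): the client sends a unique key with each POST, and a replay of the same key returns the original record instead of inserting a duplicate.

```ruby
# Hypothetical in-memory store illustrating idempotency-key handling;
# in a real app the key would be a unique-indexed database column.
class IdempotentStore
  def initialize
    @seen = {}
  end

  # Returns [created, record]; a replayed key yields created == false
  # and the record from the first attempt.
  def create(key, payload)
    return [false, @seen[key]] if @seen.key?(key)
    record = { "id" => @seen.size + 1, "payload" => payload }
    @seen[key] = record
    [true, record]
  end
end

store = IdempotentStore.new
first,  rec1 = store.create("key-123", "order data")
replay, rec2 = store.create("key-123", "order data") # timed-out client retries
puts first        # true
puts replay       # false
puts rec1 == rec2 # true
```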
This answer covers some of the polling / streaming / websockets techniques you can use.
You can handle this with Ajax and jQuery, as the documentation of the complete callback explains below:
Complete
Type: Function( jqXHR jqXHR, String textStatus )
A function to be called when the request finishes (after success and error callbacks are executed). The function gets passed two arguments: The jqXHR (in jQuery 1.4.x, XMLHTTPRequest) object and a string categorizing the status of the request ("success", "notmodified", "error", "timeout", "abort", or "parsererror").
jQuery ajax API
As for your second question (is there a way to handle this through Rails), the answer is no: the timeout happens on the client side, not the server side. However, to revert the changes, I suggest using one of the following to detect whether the user is still online:
http://socket.io/
websocket-rails
