YAWS webserver - how to know if successful download? - erlang

I let people download files from Yaws over HTTP GET. I have implemented it the way it is done in yaws_appmod_dav.erl, and it works fine.
case file:read(Fd, PPS) of
    {ok, Data} when size(Data) < PPS ->
        ?DEBUG("only chunk~n"),
        status(200, H, {content, Mimetype, Data});
    {ok, Data} ->
        ?DEBUG("first chunk~n"),
        spawn(fun() -> deliver_rest(Pid, Fd) end),
        status(200, H, {streamcontent, Mimetype, Data});
    eof ->
        status(200, {content, "application/octet-stream", <<>>});
    {error, Reason} ->
        Response = [{'D:error', [{'xmlns:D', "DAV:"}], [Reason]}],
        status(500, {xml, Response})
end;
I would like to mark a download as successful on the server, i.e. when the client has accepted the last packet.
How do I do that?
A minor question: in the WebDAV appmod for Yaws, yaws_api:stream_chunk_deliver is used instead of yaws_api:stream_chunk_deliver_blocking when getting a file. (See line 449 in https://github.com/klacke/yaws/blob/master/src/yaws_appmod_dav.erl)
Why isn't this a problem? According to http://yaws.hyber.org/stream.yaws, "Whenever the producer of the stream is faster than the consumer, that is the WWW client, we must use a synchronous version of the code." I notice that both versions work fine; is it just the amount of memory on the server that is affected?

The HTTP protocol doesn't specify a way for the client to notify the server that a download has been successful. The client either gets the requested data confirmed by the result code 200 (or 206) or it doesn't, and in that case it gets one of the error codes. Then the client is free to re-request that data. So, there isn't a reliable way of achieving what you want.
You could record the fact that the last chunk of data has been sent to the client and assume that it has been successful, unless the client re-requests that data, in which case you can invalidate the previous assumption.
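If you want to act on that assumption, one natural place is the streaming loop itself, right after the last chunk has been handed to Yaws. Below is a minimal sketch of such a loop; deliver_rest/3 adds a DownloadId argument compared to the deliver_rest/2 in your code, and mark_presumed_complete/1 is a hypothetical function you would implement yourself (write to ETS, a database, a log, ...). Note that "handed to Yaws" still only means the data left your process, not that the client received it, which is exactly the caveat above.

deliver_rest(Pid, Fd, DownloadId) ->
    case file:read(Fd, 10240) of                 %% chunk size; your code uses PPS
        {ok, Data} ->
            yaws_api:stream_chunk_deliver_blocking(Pid, Data),
            deliver_rest(Pid, Fd, DownloadId);
        eof ->
            %% all chunks handed over to Yaws: record the presumed success
            yaws_api:stream_chunk_end(Pid),
            file:close(Fd),
            mark_presumed_complete(DownloadId);  %% hypothetical bookkeeping function
        {error, _Reason} ->
            yaws_api:stream_chunk_end(Pid),
            file:close(Fd)
    end.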
Also, please note that the HTTP specification allows the client to request any part of the data by sending a GET request with a Range header. See an example in this fusefs-httpfs implementation and some more info in this SO post. How can you determine whether the download has been successful if you don't know which Range request is the last one (e.g. the client may download the whole file in chunks in reverse order)?
This may also answer your minor question. The client controls the flow by requesting a specified range of bytes from the given file. I don't know the implementation of the WebDAV protocol, but it's possible that it doesn't request the whole file at once, so the server can deliver data in chunks and never overwhelm the client.
The HTTP Range header is separate from the TCP window size, which operates at the TCP level (HTTP is an application-level protocol implemented on top of TCP). Say the client requested the whole file and the server sends it like that. That doesn't mean the whole file has already been sent through the network: the data to be sent is buffered in the kernel and sent in chunks according to the TCP window size. Had the client requested only part of the data with the Range header, only that part would be buffered in the kernel.

Related

How do I ignore timeouts in erlang?

I have a server with a number of clients, and each client is able to ask the server for information about the other clients. If it does so, the server has to get the information from each client and then return it to the asking client.
If two clients make this request at the same time, a deadlock might appear. The thing is that this request is made so often that the client doesn't really care if it sometimes fails. How do I just ignore the timeout message that terminates everything when this problem appears?
Strict answer to your question
If you're using gen_server, then call/3 allows you to specify a timeout (and call/2 defaults to 5 seconds).
This code will either give you the gen_server's reply or the atom timeout if it failed.
Result = try gen_server:call(Target, Message, Timeout) of
             Reply ->
                 Reply
         catch
             exit:{timeout, _} ->
                 timeout
         end.
Better answer
evnu and rvirding recommended using asynchronous calls, which is a superior technique. Here are two possible ways to do this:
1. Server stores the data
Have clients periodically gen_server:cast/2 their information to the server. The server stores the latest information about each client. When a client wants to learn about its siblings, it makes a gen_server:call/2 to the server.
The server call is synchronous because it doesn't need to contact any client; it's just returning the cached values.
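A minimal sketch of that approach, assuming the clients push their data with a report/2 cast and read the cache with a get_all/0 call (all names here are made up for illustration):

%% The server only stores what clients push to it, so answering
%% get_all never requires contacting a client.
-module(info_cache).
-behaviour(gen_server).
-export([start_link/0, report/2, get_all/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

report(ClientId, Info) ->                     %% client -> server, asynchronous
    gen_server:cast(?MODULE, {report, ClientId, Info}).

get_all() ->                                  %% client <- server, synchronous but cheap
    gen_server:call(?MODULE, get_all).

init([]) ->
    {ok, #{}}.

handle_cast({report, ClientId, Info}, State) ->
    {noreply, State#{ClientId => Info}}.

handle_call(get_all, _From, State) ->
    {reply, maps:to_list(State), State}.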
2. Async return
The clients call gen_server:cast/2 to request data from the server. The server calls gen_server:call/2 to fetch data from each client on demand. Once the server has collected all data, it calls gen_server:cast/2 to pass the collected data back to the client that requested it.
Here, the clients are always waiting to handle requests from the server. The server calls the client synchronously, but can't deadlock because there is only one server.
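A sketch of the message flow, assuming the server keeps a list of client pids in its state and that all function names below are illustrative:

%% In the server: the request arrives as a cast, so the requesting
%% client is never blocked waiting on the server.
handle_cast({get_siblings, From}, State = #{clients := Clients}) ->
    Data = [{C, gen_server:call(C, get_info)} || C <- Clients, C =/= From],
    gen_server:cast(From, {siblings, Data}),  %% push the result back asynchronously
    {noreply, State}.

%% In each client: answer the server's synchronous call, and handle
%% the asynchronously pushed result.
handle_call(get_info, _From, State) ->
    {reply, State, State}.

handle_cast({siblings, Data}, State) ->
    do_something_with(Data),                  %% hypothetical handler
    {noreply, State}.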
3. More gen_servers
This one's hard to describe without knowing more about your code, but you could break the clients into more pieces: one piece to handle data requests and another to generate the requests.
Based on your description that the clients make this data request "so often", I think you should try the first method. If your clients are requesting data frequently enough, having the server collect and cache the client information will actually result in fresher data for the clients.

How do I retrieve a complete HTTP response from a web server including response body using an Indy TIdTCPClient instance?

I have a Delphi 6 application that uses an Indy TIdTCPClient instance to communicate with a web server. The reason I am not using an HTTP client directly is that the server is an image streaming server that uses the same socket connection for receiving the command to start streaming as it does to start "pushing" images back to you. In other words, after you send it a typical HTTP POST request, it replies with an HTTP response, and immediately after that it starts sending out a stream of JPEG images.
I already know how to craft a proper POST request and send it using the TIdTCPClient WriteBuffer() method, and then use the ReadBuffer() method to receive reply data. What I'd like to do instead is send a POST request and then ask Indy to wait for a typical HTTP response, including retrieving all the bytes in the response body if there is a Content-Length header. Of course, I want it to leave intact any JPEG frames that may have piled up in the receive queue after the HTTP response until I start requesting them (that is, I don't want it to include any of the JPEG frames in the HTTP response to my streaming request until I ask for them with a subsequent read call).
Is there a method that I can call on a TIdTCPClient that will retrieve completely a typical HTTP response with body content, and nothing else? I thought about using SendCmd() and checking the LastCmdResult property (type: TIdRFCReply) for the response, but I can't tell from the Indy documentation if it retrieves the response body content too if there is a Content-Length header variable as part of the response it returns, nor can I tell if it leaves the rest of the receive queue after the response intact.
What is the best way to accomplish this mixed mode interaction with an HTTP web server that pushes out a stream of JPEG frames right after you make the HTTP request to start streaming?
Also, if there is a clever way to have Indy split the frames using the JPEG frame WINBONDBOUDARY delimiting string, rather than accumulating blocks of data and parsing them out myself, please share that technique.
The correct way to read an HTTP response is to first read the CRLF-delimited response headers line-by-line until a blank line is encountered, aka a CRLF+CRLF sequence, then you can use those headers to decide how to read the remaining response data. The headers will tell you not only what kind of stream is being sent (via the Content-Type header), but also how the data is being framed (Content-Length, Transfer-Encoding: chunked, something specific to the particular Content-Type, etc).
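As a purely protocol-level illustration of that procedure (written in Erlang, the language used elsewhere on this page; it is not Indy code and ignores chunked transfer encoding), the idea is: read lines until the blank line, then read exactly Content-Length bytes, and everything after that, such as the JPEG stream, stays in the socket for later reads:

%% Sketch: read one HTTP response from a passive-mode socket.
read_response(Sock) ->
    ok = inet:setopts(Sock, [binary, {packet, line}, {active, false}]),
    {ok, _StatusLine} = gen_tcp:recv(Sock, 0),        %% e.g. <<"HTTP/1.1 200 OK\r\n">>
    Headers = read_headers(Sock, []),
    Len = binary_to_integer(
            proplists:get_value(<<"content-length">>, Headers, <<"0">>)),
    ok = inet:setopts(Sock, [{packet, raw}]),
    Body = case Len of
               0 -> <<>>;
               _ -> {ok, B} = gen_tcp:recv(Sock, Len), B
           end,
    {Headers, Body}.                                  %% JPEG frames remain unread

read_headers(Sock, Acc) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, <<"\r\n">>} ->                           %% blank line: end of headers
            lists:reverse(Acc);
        {ok, Line} ->
            [Name, Value] = binary:split(Line, <<":">>),
            Header = {string:lowercase(Name), string:trim(Value)},
            read_headers(Sock, [Header | Acc]);
        {error, Reason} ->
            exit({recv_failed, Reason})
    end.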
To receive the headers, you can use the connection's Capture() method, setting its ADelim parameter to a blank string.
How you read the remaining data afterwards depends on the actual formatting/framing of the stream. Without knowing exactly what kind of stream you are receiving, there is no way to advise you how best to read it, as there are several different types of streaming protocols used by HTTP servers, and most of them are not standardized. Provide that information, then I/we can show you how to implement it with Indy.
You cannot use SendCmd() as the HTTP protocol does not format its responses in a way that is compatible with that method.

Implementing Acknowledge-Extension for CometD in Jetty/ASP.NET

We're using CometD 2 to achieve the connection between a central data provider and several backends consuming the data. Up to now, when one of the backends fails briefly, all messages posted in the meantime are lost. Now we heard about the "Acknowledge Extension" for CometD. It is supposed to create a server-side list of messages and deliver them when one of the clients reports to be back online. Here are some questions:
1) Does this also work with several clients?
2) The documentation (http://cometd.org/documentation/2.x/cometd-ext/ack) says: "Note that if the disconnected browser is disconnected for in excess of maxInterval (default 10s), then the client will be timed out and the unacknowledged queue discarded." -> does this mean that in case my client doesn't restore within the maxInterval, the messages are lost anyway?
Hence,
2.1) What's the maximal maxInterval? Which consequences does it have to set it to a high value?
2.2) We'd need a secure mechanism for fail outs of at least a few minutes. Is this possible? Are there any alternatives?
3) Is it really only necessary to add the two extensions in both the client and the CometD server? We're using Jetty for the server and .NET Oyatel for the client. Does anyone have experience with this?
I'm sorry for this bunch of questions, but unfortunately, the CometD project isn't really well documented. I really appreciate any answers.
Cheers,
Chris
1) Does this also work with several clients?
Yes, it does. There is one message queue allocated for each client (see AcknowledgedMessagesClientExtension).
2) does this mean that in case my client doesn't restore within the maxInterval, the messages are lost anyway?
Yes, it does. When the client can't reach the server for maxInterval milliseconds, the server will throw away all state associated with that client.
2.1) What's the maximal maxInterval? Which consequences does it have to set it to a high value?
maxInterval is a servlet parameter of the CometD servlet. It is internally treated as a long value, so its maximal value is Long.MAX_VALUE.
Example configuration:
<init-param>
<!-- The max period of time, in milliseconds, that the server will wait for
a new long poll from a client before that client is considered invalid
and is removed -->
<param-name>maxInterval</param-name>
<param-value>10000</param-value>
</init-param>
Setting it to a high value means that the server will wait longer before throwing away the state associated with a client (from the time the client stops contacting the server).
I see two problems with this. First, the memory requirements of the server will potentially be higher (which may also make denial of service easier). Second, the RemoveListener isn't called on the Server before the maxInterval expires, which may require you to implement additional logic that differentiates between "momentarily unreachable" and "disconnected".
2.2) We'd need a secure mechanism for fail outs of at least a few minutes. Is this possible? Are there any alternatives?
Yes, it is possible to configure the maxInterval to last for a few minutes.
An alternative would be to restore any server side state on every handshake. This can be achieved by adding a listener to "/meta/handshake" and publishing a message to a "/service/" channel (to make sure only the server receives the message), or by adding an additional property to the "ext" property of the handshake message. Be careful to let the client restore only valid state (sign it on the server if you must).
3) Is it really only necessary to add the two extensions in both the client and cometD server?
On the server it is sufficient to do something like:
bayeux.addExtension(new AcknowledgedMessagesExtension());
I don't know how you'd do it with Oyatel. In JavaScript it suffices to simply include the extension (dojo.require, or a script include for jQuery).
When a client with the AckExtension connects to the server, a message similar to the following will be logged (from my Jetty console log):
[qtp959713667-32] INFO org.cometd.server.ext.AcknowledgedMessagesExtension - Enabled message acknowledgement for client 51vkuhps5qgsuaxhehzfg6yw92
Another note, because it may not be obvious: the ack extension only provides a server-to-client delivery guarantee, not client-to-server. That is, when you publish a message from the client to the server, it may not reach the server, and it will then be lost.
Once the message has made it to the server, the ack extension will ensure that all recipients connected at that time will receive the message (as long as they aren't unreachable for maxInterval milliseconds).
It is relatively straightforward to implement client-side retrying if you listen to notifications on "/meta/unsuccessful" and resend the message (the original message that failed is passed as message.request to the handler).

Transferring Data Directly between 2 Connections in Indy (TIdContext)

I have a server running TIdTCPServer and a client using a web browser (or any other software) to communicate. I don't know the protocol, but what I'm trying to do is relay the data between that client and another connection (both connected to the same TIdTCPServer). For example, the data sent by the first client is transmitted to the second client, and the data sent by the second client is transmitted to the first client, like a proxy (I can't really use a proxy server since it's just this one case), and the TIdTCPServer should still keep accepting other clients and processing their data.
I got stuck at the very first line of code, since TIdContext.Connection.Socket.ReadLn requires a delimiter, and the client's protocol is unknown to the server.
Any ideas?
Thanks.
You can look at the source code for TIdMappedPortTCP and TIdHTTPProxyServer to see how they pass arbitrary data between connections in both directions. Both components use TIdSocketList.SelectReadList() to detect when either connection has data to read. TIdMappedPortTCP then uses TIdBuffer.ExtractToBytes() and TIdIOHandler.Write(TIdBytes), whereas TIdHTTPProxyServer uses TIdTCPStream and TIdBuffer.ExtractToStream() instead.

Progress feedback in stateless HTTP session

I need to program a stateless server to execute remote methods. The client uses REST with a JSON parameter to pass the method name and its parameters. After the result has been served, the session is closed. I have to use Indy 10 with TCP/IP as the protocol, and am therefore looking at using IdHTTPServer.
Large result sets are chunked by Indy10 and sent to the client in parts.
My problem now is:
The methods on the server provide progress information if they take longer to produce the results. These are short messages. How can I write them back to the client?
So far I have used writeflush on the server, but the client waited for the request to end before handing back the full result set, including the progress information. What can I do to display/process such progress information on the client and yet keep the connection open to receive further data for the same request?
On the client side, instead of the regular HTTP client component TIdHTTP, you can use the Indy class TIdTCPClientCustom in unit IdTCPClient to send the request and process the response.
This class gives total control over the processing of the server responses. I used the TIdTelnet class as a starting point to implement a client for a message broker messaging protocol, and found it stable and reliable for both text and binary data.
In the receiving thread, the incoming data can be read up to delimiters and parsed into chunks (for the progress information) and immediately processed.
