How to force the read operation to always request the whole file - ithit-webdav-server

I'm currently upgrading our former WebDAV implementation to use IT-HIT.
In the process I noticed that the read operation on a file can request either the whole file or a part of it. I was wondering if there is a way to force it to always request the whole file. Our WebDAV server handles small files, so there isn't much need for partial reads.
I'm asking because the documentation I'm using (Java client version 3.2.2420) only seems to cover this for the write operation.
Thanks for your help.

The read operation is an HTTP GET request, which can contain a Range header. WebDAV clients, as well as any other clients such as web browsers, can use GET requests to read and download file content. As part of the GET request, they can attach a Range header specifying which part of the file content they want. For example, when you pause and then resume a download, or when a download is interrupted and then restored, the client can send a Range request:
GET https://webdavserv/file.ext
Range: bytes=12345-45678
To test whether the server supports the Range header, a client app can send a HEAD request. If the server response contains the Accept-Ranges: bytes header, the Range header is supported:
HEAD https://webdavserv/file.ext
...
Accept-Ranges: bytes
So the solution is to remove the Accept-Ranges header from the HEAD response. If the client can properly process the absence of the Accept-Ranges header, it will always request an entire file.
If you cannot remove it directly in the code, in many cases you can remove or filter the header from the response before it is sent. The specific header removal code depends on your server (Java, ASP.NET, ASP.NET Core, OWIN, etc.). For example, for ASP.NET it will look like this:
protected void Application_PreSendRequestHeaders(object sender, EventArgs e)
{
    HttpContext.Current.Response.Headers.Remove("Accept-Ranges");
}
For Java, you will need to create a filter: How do I delete an HTTP response header?
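As a minimal sketch, assuming the javax.servlet API (the class name is illustrative, and you would still need to map the filter to the WebDAV servlet in web.xml or with @WebFilter), such a filter can wrap the response and silently drop the header:
import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

// Suppresses the Accept-Ranges header so clients fall back to
// requesting entire files.
public class RemoveAcceptRangesFilter implements Filter {
    public void init(FilterConfig config) {}
    public void destroy() {}

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        chain.doFilter(req, new HttpServletResponseWrapper((HttpServletResponse) resp) {
            @Override
            public void setHeader(String name, String value) {
                if (!"Accept-Ranges".equalsIgnoreCase(name)) {
                    super.setHeader(name, value);
                }
            }

            @Override
            public void addHeader(String name, String value) {
                if (!"Accept-Ranges".equalsIgnoreCase(name)) {
                    super.addHeader(name, value);
                }
            }
        });
    }
}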

Related

WebView2: Is it possible to prevent a cookie in a response from being stored

I am using WebView2 and am looking to stop cookies from being stored when they are received in responses to third-party resource requests.
WebView2 exposes the CoreWebView2.WebResourceResponseReceived event which initially looked promising. However, the documentation states:
There is no guarantee about the order in which the WebView processes the response and the host app's handler runs. The app's handler will not block the WebView from processing the response.
Hence it is not possible to modify the response or delete the cookie in this event handler. I guess you could record the response and delete it 'later', but this seems like it could be awkward to do reliably.
Is there a way to block or reliably delete cookies received in a response when using WebView2?
There's currently no way to intercept and modify web responses.
As a workaround you might try, as you suggest, running some code asynchronously later, such as during the corresponding NavigationCompleted event, to remove the cookie using the CoreWebView2.CookieManager APIs.
Another workaround might be to use the WebResourceRequested event to intercept requests: call the GetDeferral method on the event args to obtain a deferral, perform the web request yourself in native code, receive and modify the response as you like, then provide the modified response back via the WebResourceRequested event args and complete the deferral. However, this has the drawback that you would need to convert WebView2's web resource request and response objects back and forth between the request and response objects of whichever HTTP stack you use.
Otherwise, you can file your feedback as a feature request on the WebView2 Feedback GitHub project.

Why is GZIP Compression of a Request Body during a POST method uncommon?

I was playing around with GZIP compression recently, and the way I understand it is the following:
The client requests some files or data from a web server. The client also sends a header that says "Accept-Encoding: gzip".
The web server retrieves the files or data, compresses them, and sends them back GZIP-compressed to the client. The web server also sends a "Content-Encoding: gzip" header to tell the client that the data is compressed.
The Client then de-compresses the data/files and loads them for the user.
I understand that this is common practice, and it makes a ton of sense when you need to load a page that requires a lot of HTML, CSS, and JavaScript, which can be relatively large and add to the browser's loading time.
However, I was trying to look further into this: why is it not common to GZIP-compress the request body when doing a POST call? Is it because request bodies are usually so small that decompressing them on the web server takes longer than simply sending the request uncompressed? Is there some document or reference I can read about this?
Thanks!
It's uncommon because in a client-server relationship, most of the data flows from the server to the client, and as you mentioned, the data coming from the client tends to be small, so compression rarely brings any performance gains.
In a REST API, I would have said that big request payloads are common, but apparently the Spring Framework team, known for their REST tools, disagree - they explicitly say in their docs here that you can set the servlet container to do response compression, with no mention of request compression. As Spring Framework's mode of operation is to provide functionality they think lots of people will use, they evidently didn't feel it worthwhile to provide a ServletFilter implementation that users could employ to read compressed request bodies.
It would be interesting to trawl the user mailing lists of Tomcat, Struts, Jackson, Gson, etc. for similar discussions.
If you want to write your own decompression filter, try reading this: How to decode Gzip compressed request body in Spring MVC
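For a sense of what such a filter involves, here is a minimal sketch, assuming the javax.servlet API (class name illustrative, blocking I/O only), that wraps a gzip-encoded request so downstream code reads a decompressed body:
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

// Transparently decompresses request bodies sent with Content-Encoding: gzip.
public class GzipRequestFilter implements Filter {
    public void init(FilterConfig config) {}
    public void destroy() {}

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpReq = (HttpServletRequest) req;
        if ("gzip".equalsIgnoreCase(httpReq.getHeader("Content-Encoding"))) {
            final GZIPInputStream gzip = new GZIPInputStream(httpReq.getInputStream());
            req = new HttpServletRequestWrapper(httpReq) {
                public ServletInputStream getInputStream() {
                    return new ServletInputStream() {
                        public int read() throws IOException { return gzip.read(); }
                        // Servlet 3.1 additions; trivial answers suffice for
                        // a blocking sketch like this one.
                        public boolean isFinished() { return false; }
                        public boolean isReady() { return true; }
                        public void setReadListener(ReadListener listener) {}
                    };
                }
            };
        }
        chain.doFilter(req, resp);
    }
}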
Alternatively, put your servlet container behind a web server that offers more functionality. People obviously do need request compression enough that web servers such as Apache offer it - this SO answer summarises it well already: HTTP request compression - you'll find the reference to the HTTP spec there too.
Very old question, but I decided to resurrect it because it was my first Google result and I feel the only existing answer is incomplete.
HTTP request compression is uncommon because the client can't be sure the server supports it.
When the server sends a response, it can use the Accept-Encoding header from the client's request to see if the client would understand a gzipped response.
When the client sends a request, it can be the first HTTP communication so there is nothing to tell the client that the server would understand a gzipped request. The client can still do so, but it's a gamble.
Although very few modern HTTP servers would not know gzip, the configuration to apply it to request bodies is still very uncommon. At least on nginx, it looks like custom Lua scripting is required to get it working.
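For illustration, a client taking that gamble compresses the body itself and labels it with Content-Encoding. A minimal Java sketch (the URL and payload are placeholders, and the server may well reject the request):
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipPostExample {
    public static void main(String[] args) throws Exception {
        byte[] body = "{\"hello\":\"world\"}".getBytes(StandardCharsets.UTF_8);

        HttpURLConnection conn = (HttpURLConnection)
                new URL("https://example.com/api").openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        // The gamble: nothing guarantees the server understands this.
        conn.setRequestProperty("Content-Encoding", "gzip");

        try (GZIPOutputStream gzip = new GZIPOutputStream(conn.getOutputStream())) {
            gzip.write(body);
        }
        // An error status here may mean the server did not understand it.
        System.out.println("Response code: " + conn.getResponseCode());
    }
}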
Don't do it, if for no other reason than security: firewalls have a hard or impossible time inspecting compressed input data.

YAWS webserver - how to know if successful download?

I let people download files over HTTP GET from Yaws. I have implemented it the way it is done in yaws_appmod_dav.erl, and it works fine.
case file:read(Fd, PPS) of
    {ok, Data} when size(Data) < PPS ->
        %% The whole file fits in a single chunk
        ?DEBUG("only chunk~n"),
        status(200, H, {content, Mimetype, Data});
    {ok, Data} ->
        %% First of several chunks; deliver the rest asynchronously
        ?DEBUG("first chunk~n"),
        spawn(fun() -> deliver_rest(Pid, Fd) end),
        status(200, H, {streamcontent, Mimetype, Data});
    eof ->
        %% Empty file
        status(200, {content, "application/octet-stream", <<>>});
    {error, Reason} ->
        Response = [{'D:error', [{'xmlns:D', "DAV:"}], [Reason]}],
        status(500, {xml, Response})
end;
I would like to mark a download as successful on the server, i.e. when the client has accepted the last packet.
How do I do that?
A minor question: in the WebDAV appmod for Yaws, yaws_api:stream_chunk_deliver is used instead of yaws_api:stream_chunk_deliver_blocking when getting a file. (See line 449 in https://github.com/klacke/yaws/blob/master/src/yaws_appmod_dav.erl)
Why isn't this a problem? According to http://yaws.hyber.org/stream.yaws, "Whenever the producer of the stream is faster than the consumer, that is the WWW client, we must use a synchronous version of the code." I notice that both versions work fine; is it just the amount of memory on the server that is affected?
The HTTP protocol doesn't specify a way for the client to notify the server that a download has been successful. The client either gets the requested data confirmed by the result code 200 (or 206) or it doesn't, and in that case it gets one of the error codes. Then the client is free to re-request that data. So, there isn't a reliable way of achieving what you want.
You could record the fact that the last chunk of data has been sent to the client and assume that it has been successful, unless the client re-requests that data, in which case you can invalidate the previous assumption.
Also, please note that the HTTP specification allows the client to request any part of the data by sending a GET request with a Range header. See an example in this fusefs-httpfs implementation and some more info in this SO post. How can you determine whether the download has been successful if you don't know which Range-based GET request is the last one (e.g. the client may download the whole file in chunks in backward order)?
This may also answer your minor question. The client controls the flow by requesting a specified range of bytes from the given file. I don't know the implementation of the WebDAV protocol, but it's possible that it doesn't request the whole file at once, so the server can deliver data in chunks and never overwhelm the client.
The HTTP Range header is separate from the TCP window size, which operates at the TCP protocol level (HTTP is an application-level protocol implemented on top of TCP). Say the client requested the whole file and the server sends it like that. That doesn't mean the whole file has been sent through the network yet: the data to be sent is buffered in the kernel and sent in chunks according to the TCP window size. Had the client requested only part of the data with the Range header, only that part would be buffered in the kernel.
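To make the Range mechanics concrete from the client's side, here is a small Java sketch (the URL and byte range are placeholders) that fetches only part of a file and checks for a 206 Partial Content status:
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RangeGetExample {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                new URL("https://example.com/file.bin").openConnection();
        // Ask for the first 1024 bytes only; later requests could fetch the
        // remaining ranges in any order, which is why the server cannot tell
        // from any single request whether the download is complete.
        conn.setRequestProperty("Range", "bytes=0-1023");

        int code = conn.getResponseCode(); // 206 if the range was honored
        System.out.println("Status: " + code);
        try (InputStream in = conn.getInputStream()) {
            byte[] chunk = in.readAllBytes(); // Java 9+
            System.out.println("Received " + chunk.length + " bytes");
        }
    }
}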

How can I give the response file of a universal HTTP request a unique name?

I've developed an HTTP API server (intended to be called by third-party applications, not necessarily by a web browser) which has one universal call to get (download) any and all types of files by passing a name parameter in the query string for the requested file. All calls, no matter for which file, are handled by the same custom request handler of mine called Get (not to be confused with the standard HTTP GET). The query string includes a Name property which identifies the unique file to get.
So a request may look like:
http://MyServerURL.com/Get?Key=SomeAPIKeyForAuthentication&Name=SomeUniqueIdentifier
First of all, I know I can obviously make the server fetch a file using only the URI, for example...
http://MyServerURL.com/SomeUniqueIdentifier?Key=SomeAPIKeyForAuthentication
...but the design is specifically meant to use this one universal Get command, so I need to keep this unique identifier in the query string. The applications which connect to this API will never need to know the filename, but there may be occasions when a URL is manually provided for someone to open in their browser to download a file.
However, whenever a file is downloaded through a web browser, since the call is Get, the saved filename also winds up being just Get.
Is there any trick in HTTP which I can implement on my server to force the downloaded filename to be the unique identifier rather than just Get? For example, some method such as using a redirect?
I'm using Indy 10 TIdHTTPWebBrokerBridge in Delphi XE2 as the web server. I'm looking for a way in this component (technically in its corresponding TWebModule handler), when it handles this Get request, to make the response's filename whatever string I want (in this case, SomeUniqueIdentifier). I've heard the term "URL rewriting", but that's a rather different topic and I don't think it's what I need, though it might be.
That seems to be a rather long-winded way of saying you want to set the filename for an HTTP download independently of the URL used to fetch it. In that case you simply send a Content-Disposition header specifying the desired filename. See section 19.5.1 of RFC 2616,
e.g.
Content-Disposition: attachment; filename="stackoverflow.ans"
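The asker's server is Delphi/Indy, where the equivalent header would be set on the TWebModule response; purely as an illustration of the idea, here is a Java servlet sketch (names and lookup logic are hypothetical):
import java.io.IOException;
import javax.servlet.http.*;

// Serves a file under a friendly name regardless of the request URL.
public class GetServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String name = req.getParameter("Name"); // the unique identifier (validation omitted)
        resp.setContentType("application/octet-stream");
        // Makes the browser save the download as <name> instead of
        // the last URL segment ("Get").
        resp.setHeader("Content-Disposition",
                "attachment; filename=\"" + name + "\"");
        // ... write the file bytes to resp.getOutputStream() ...
    }
}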

How do I retrieve a complete HTTP response from a web server including response body using an Indy TIdTCPClient instance?

I have a Delphi 6 application that uses an Indy TIdTCPClient instance to communicate with a web server. The reason I am not using an HTTP client directly is that the server is an image streaming server that uses the same socket connection for receiving the command to start streaming as it does to start "pushing" images back to you. In other words, after you send it a typical HTTP POST request, it replies with an HTTP response, and immediately after that it starts sending out a stream of JPEG images.
I already know how to craft a proper POST request and send it using the TIdTCPClient WriteBuffer() method, and then use the ReadBuffer() method to receive reply data. What I'd like to do instead is to send a POST request and then ask Indy to wait for a typical HTTP response, including retrieving all the bytes in the response body if there is a Content-Length header. I of course want it to leave intact any JPEG frames that may have piled up in the receive queue after the HTTP response until I start requesting them (that is, I don't want it to include any of the JPEG frames in the HTTP response to my streaming request until I ask for them with a subsequent read call).
Is there a method I can call on a TIdTCPClient that will completely retrieve a typical HTTP response, including its body content, and nothing else? I thought about using SendCmd() and checking the LastCmdResult property (type: TIdRFCReply) for the response, but I can't tell from the Indy documentation whether it also retrieves the response body when there is a Content-Length header in the response, nor whether it leaves the rest of the receive queue after the response intact.
What is the best way to accomplish this mixed mode interaction with an HTTP web server that pushes out a stream of JPEG frames right after you make the HTTP request to start streaming?
Also, if there is a clever way to have Indy split the frames using the JPEG frame WINBONDBOUDARY delimiting string, rather than accumulating blocks of data and parsing them out myself, please share that technique.
The correct way to read an HTTP response is to first read the CRLF-delimited response headers line-by-line until a blank line is encountered, aka a CRLF+CRLF sequence, then you can use those headers to decide how to read the remaining response data. The headers will tell you not only what kind of stream is being sent (via the Content-Type header), but also how the data is being framed (Content-Length, Transfer-Encoding: chunked, something specific to the particular Content-Type, etc).
To receive the headers, you can use the connection's Capture() method, setting its ADelim parameter to a blank string.
How you read the remaining data afterwards depends on the actual formatting/framing of the stream. Without knowing exactly what kind of stream you are receiving, there is no way to advise you how best to read it, as there are several different types of streaming protocols used by HTTP servers, and most of them are not standardized. Provide that information, then I/we can show you how to implement it with Indy.
You cannot use SendCmd() as the HTTP protocol does not format its responses in a way that is compatible with that method.
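Not Indy, but to make the framing concrete: assuming the simplest case where the response carries a Content-Length header, here is a Java sketch (host, port, and request line are placeholders) of the generic logic described above - read CRLF-delimited header lines until a blank line, then read exactly the advertised number of body bytes, leaving everything after them (the JPEG frames) unread in the socket:
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

public class HttpResponseReader {
    // Reads one CRLF-terminated line byte by byte, so nothing past the
    // line is consumed from the socket.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') { in.read(); break; } // consume the '\n' too
            sb.append((char) b);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        try (Socket sock = new Socket("example.com", 80)) {
            sock.getOutputStream().write(
                ("POST /start-stream HTTP/1.1\r\nHost: example.com\r\n" +
                 "Content-Length: 0\r\n\r\n").getBytes("ISO-8859-1"));

            InputStream in = sock.getInputStream();
            String line;
            int contentLength = 0;
            while (!(line = readLine(in)).isEmpty()) { // headers end at a blank line
                if (line.toLowerCase().startsWith("content-length:")) {
                    contentLength = Integer.parseInt(line.substring(15).trim());
                }
            }
            byte[] body = new byte[contentLength];
            new DataInputStream(in).readFully(body); // exactly the body, no more
            // Any JPEG frames that follow are still unread in the socket.
        }
    }
}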
