Servlet for chunked uploading of (large) files

I am trying to implement a (Jetty-based) servlet supporting the upload of (large) files from a web client. A little JavaScript splits the user-selected file into chunks and sends these chunks to the server in several POSTs with appropriate Content-Range headers (the rationale of this technique is to be able to track progress, and to pause and resume the upload).
I have come up with an HttpServlet overriding the doPost() method to handle the Content-Range header, i.e. it writes the payload at the specified offset into a file on the server.
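Roughly, a minimal sketch of that approach (the target path and the header parsing are simplified for illustration):

import java.io.IOException;
import java.io.RandomAccessFile;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ChunkedUploadServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Expected form: "bytes 0-999999/4000000"
        String range = req.getHeader("Content-Range");
        if (range == null || !range.startsWith("bytes ")) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Missing Content-Range");
            return;
        }
        long start = Long.parseLong(range.substring(6, range.indexOf('-')));
        // Write the chunk at its offset; a real servlet would derive the
        // target file from the request instead of hard-coding it.
        try (RandomAccessFile out = new RandomAccessFile("/tmp/upload.bin", "rw")) {
            out.seek(start);
            byte[] buf = new byte[8192];
            int n;
            while ((n = req.getInputStream().read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        resp.setStatus(HttpServletResponse.SC_OK);
    }
}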
Is there a better/recommended way to support (large) file upload in a servlet?
Is there a set of classes in Jetty that does just that?
Thanks in advance

Related

How to create an upload (large, i.e. ~400MB) bytestream service in Vaadin?

In an earlier post, I asked a general question about creating webservices in Vaadin: How can one create webservices in Vaadin 12?
However, the specific case I mainly need to support is uploading large (e.g. ~400MB) bytestream objects over HTTPS, presumably sent to Vaadin via an HTTPS POST (with the payload provided, I presume, in raw binary format as a bytestream). I saw that Vaadin has built-in support for uploading files (which is essentially a POST of a bytestream, I presume?) and then I saw a reference to StreamReceiver here: https://vaadin.com/docs/v12/flow/advanced/tutorial-stream-resources.html
It sounds like a custom file importer, but I couldn't find any simple, more-or-less complete examples of how to use it. Ideally, a few lines of Java showing the receiving of the bytestream, and a few lines (ideally also Java) that POST to the receiver's URL, would be all that's needed to show how this manual upload of bytes can be accomplished in Vaadin. (In DropWizard & Jersey I can find such examples reasonably easily, but I'm not sure how to gain that level of control in Vaadin.)
(Very minor bonus question: is there a size limit on the POST? E.g., can a bytestream of over ~4GB be sent and received?)
In Vaadin, the Upload component's API is optimised for streaming into a File (unlike the Servlet and JAX-RS APIs, where you handle the stream yourself). One approach is to first stream to a temp file and then, once the file is fully on the server side, handle the data from the temp file.
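For example, a minimal sketch using the built-in FileBuffer receiver (assuming a recent Vaadin Flow version; the size limit and the processing are illustrative):

import com.vaadin.flow.component.upload.Upload;
import com.vaadin.flow.component.upload.receivers.FileBuffer;
import java.io.InputStream;

// FileBuffer streams the incoming bytes into a temporary file for you.
FileBuffer buffer = new FileBuffer();
Upload upload = new Upload(buffer);
upload.setMaxFileSize(500 * 1024 * 1024); // ~500MB, adjust as needed
upload.addSucceededListener(event -> {
    // The file is now fully on the server; process the temp file's content.
    InputStream in = buffer.getInputStream();
    // ... handle the data ...
});
// add(upload); // attach the component to a layout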
Alternatively, you can use the Flow Viritin add-on and its helper class UploadFileHandler, which gives you an API where you read the contents from an InputStream, in the same way as with the Servlet API. A usage example is in this test.
This isn't the first time this has been asked, and I actually have a more verbose blog draft about this subject. I'll add a link once it's published.

How to force the read operation to always request the whole file

I'm currently upgrading our former WebDAV implementation to use IT-HIT.
In the process I noticed that the read operation on a file can request either the whole file or just a part of it. I was wondering if there is a way to force clients to always request the whole file. Our WebDAV server handles small files, so there isn't much need for partial reads.
I'm asking because the documentation I'm using (Java client version 3.2.2420) only seems to specify this for the write operation.
Thanks for your help.
The read operation is an HTTP GET request, which can contain a Range header. WebDAV clients, as well as any other clients such as web browsers, can use GET requests to read and download file content. As part of the GET request they can attach a Range header specifying which part of the file content they want. For example, when you pause and then resume a download, or when a broken download is restarted, the client can send a Range request:
GET https://webdavserv/file.ext
Range: bytes=12345-45678
To test whether the server supports the Range header, the client app can send a HEAD request. If the server response contains an Accept-Ranges: bytes header, the Range header is supported:
HEAD https://webdavserv/file.ext
...
Accept-Ranges: bytes
So the solution is to remove the Accept-Ranges header from the HEAD response. If the client properly handles the absence of the Accept-Ranges header, it will always request the entire file.
If you cannot remove it directly in code, in many cases you can remove or filter the header from the response before it is sent. The specific header-removal code depends on your server stack (Java, ASP.NET, ASP.NET Core, OWIN, etc.). For example, for ASP.NET it would look like this:
protected void Application_PreSendRequestHeaders(object sender, EventArgs e)
{
    HttpContext.Current.Response.Headers.Remove("Accept-Ranges");
}
For Java, you will need to create a servlet filter: How to delete an HTTP response header?
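A minimal sketch of such a filter (assuming the javax.servlet API; the filter still needs to be registered in web.xml or via @WebFilter):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

public class RemoveAcceptRangesFilter implements Filter {

    @Override
    public void init(FilterConfig config) { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        // Wrap the response so any attempt to set Accept-Ranges is dropped.
        HttpServletResponseWrapper wrapped =
                new HttpServletResponseWrapper((HttpServletResponse) resp) {
            @Override
            public void setHeader(String name, String value) {
                if (!"Accept-Ranges".equalsIgnoreCase(name)) {
                    super.setHeader(name, value);
                }
            }

            @Override
            public void addHeader(String name, String value) {
                if (!"Accept-Ranges".equalsIgnoreCase(name)) {
                    super.addHeader(name, value);
                }
            }
        };
        chain.doFilter(req, wrapped);
    }

    @Override
    public void destroy() { }
}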

Why is GZIP Compression of a Request Body during a POST method uncommon?

I was playing around with GZIP compression recently, and the way I understand it is the following:
The client requests some files or data from a web server, and sends a header saying "Accept-Encoding: gzip".
The web server retrieves the files or data, compresses them, and sends them back GZIP-compressed. It also sends a "Content-Encoding: gzip" header to tell the client the data is compressed.
The client then decompresses the data/files and loads them for the user.
I understand this is common practice, and it makes a ton of sense when you need to load a page requiring a lot of HTML, CSS, and JavaScript, which can be relatively large and add to the browser's loading time.
However, looking further into this, why is it not common to GZIP-compress the request body of a POST? Is it because request bodies are usually small, so the time it takes to decompress them on the web server outweighs simply sending the request uncompressed? Is there some document or reference on this?
Thanks!
It's uncommon because, in the typical client-server relationship, the server sends most of the data to the client; as you mentioned, the data coming from the client tends to be small, so compressing it rarely brings any performance gain.
In REST APIs I would have said big request payloads are common, but apparently the Spring Framework, known for its REST tools, disagrees: its docs here explicitly say that you can set the servlet container to do response compression, with no mention of request compression. Since Spring's mode of operation is to provide functionality they think lots of people will use, they evidently didn't consider it worthwhile to provide a ServletFilter implementation that users could employ to read compressed request bodies.
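(For instance, with Spring Boot the servlet container's response compression is just a couple of properties in application.properties; note there is no equivalent switch for requests:)

server.compression.enabled=true
# only compress responses larger than this threshold (bytes)
server.compression.min-response-size=2048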
It would be interesting to trawl the user mailing lists of Tomcat, Struts, Jackson, Gson, etc. for similar discussions.
If you want to write your own decompression filter, try reading this: How to decode Gzip compressed request body in Spring MVC
Alternatively, put your servlet container behind a web server that offers more functionality. People obviously do need request compression enough that web servers such as Apache offer it - this SO answer summarises it well already: HTTP request compression - you'll find the reference to the HTTP spec there too.
Very old question, but I decided to resurrect it because it was my first Google result and I feel the only existing answer is incomplete.
HTTP request compression is uncommon because the client can't be sure the server supports it.
When the server sends a response, it can use the Accept-Encoding header from the client's request to see if the client would understand a gzipped response.
When the client sends a request, it may be the very first HTTP exchange with that server, so there is nothing to tell the client that the server would understand a gzipped request. The client can still send one, but it's a gamble.
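For illustration, here is what that gamble looks like from a Java 11+ client (the URL is hypothetical; unless the server is explicitly configured to decompress request bodies, it will respond with an error):

import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipPostExample {
    public static void main(String[] args) throws Exception {
        byte[] body = "some large payload".getBytes(StandardCharsets.UTF_8);

        // Compress the request body with gzip.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(body);
        }

        // Declare the encoding; the server may or may not honour it.
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/api"))
                .header("Content-Encoding", "gzip")
                .POST(HttpRequest.BodyPublishers.ofByteArray(compressed.toByteArray()))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}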
Although very few modern HTTP servers don't know gzip, configuring them to apply it to request bodies is still very uncommon. On nginx, at least, it looks like custom Lua scripting is required to get it working.
Don't do it, if for no other reason than security: firewalls have a hard (or impossible) time inspecting compressed input data.

How to transfer media files using Worklight

What's the correct way of transferring media (photos or movies) using Worklight Adapters?
I sent a photo via the adapter and got the error: form too large, exceeds the maximum size...
I read that I need to change the form size limit in Jetty,
but the server I'll deploy the app to won't be Jetty, so what shall I do?
Thanks!
Please see topic Uploading large (and binary) files to Worklight adapter.
Basically, Worklight does not have the equivalent of an HTTP POST mechanism that allows you to transfer arbitrarily large chunked data. For large files of unknown size (photos, video, audio) you'll need to upload the file to the server outside the Worklight adapter framework. For example, you could simply POST it to a web server you have configured. In my case (in the answer referenced above) I needed to create an entire client-server mechanism to negotiate a port and key, start listening on that port, then accept requests and ensure the posting client passes the key as authorization to transfer the secure data.
Hopefully IBM will provide a formal service for this in a future release.
Adapters do not work with html forms, they work with data.
You will need to convert your image to Base64 and submit it as an adapter invocation parameter.
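For the encoding step itself, a minimal Java sketch (on the Worklight client this is typically done in JavaScript, but the idea is the same; the file name is illustrative):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class EncodeImage {
    public static void main(String[] args) throws Exception {
        // Read the raw image bytes and encode them as a Base64 string,
        // suitable for passing as a plain string invocation parameter.
        byte[] imageBytes = Files.readAllBytes(Paths.get("photo.jpg"));
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        System.out.println(encoded.length() + " characters of Base64");
    }
}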
Having more information regarding what exactly you're trying to achieve might be helpful.

Can ServletFileUpload.parseRequest() only be called once per request?

I'm working on a custom SpringSecurityFilter for my Grails application, and I'm trying to use the Commons FileUpload library to process the request. I'm able to process the request in the filter, but once it gets to my controller, none of the values are available.
Can the HttpRequest only be processed once by the upload library? I'm guessing it's cleaning up the temp files. Is there a way to keep them around so they can be processed again at the controller level?
I need to interrogate a form parameter for security (due to the client, I can't add it to the HTTP headers), but once I read the value, the request seems to be wiped for further processing.
Yes, a request can only be parsed once.
I saw this answer on Apache's FAQ page for FileUpload.
Question: Why is parseRequest() returning no items?
Answer: "This most commonly happens when the request has already been parsed, or processed in some other way. Since the input stream has already been consumed by that earlier process, it is no longer available for parsing by Commons FileUpload."
Reference: http://commons.apache.org/fileupload/faq.html
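If you do need the body both in the filter and in the controller, the usual workaround is to buffer it in a request wrapper, as in this sketch (note it holds the entire body in memory, so it is not suitable for large uploads):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

// Buffers the request body so it can be read more than once.
public class BufferedBodyRequest extends HttpServletRequestWrapper {
    private final byte[] body;

    public BufferedBodyRequest(HttpServletRequest request) throws IOException {
        super(request);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        InputStream in = request.getInputStream();
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        this.body = buf.toByteArray();
    }

    @Override
    public ServletInputStream getInputStream() {
        // Each call returns a fresh stream over the buffered bytes.
        ByteArrayInputStream in = new ByteArrayInputStream(body);
        return new ServletInputStream() {
            @Override public int read() { return in.read(); }
            @Override public boolean isFinished() { return in.available() == 0; }
            @Override public boolean isReady() { return true; }
            @Override public void setReadListener(ReadListener listener) { }
        };
    }
}

The filter would then pass new BufferedBodyRequest(request) down the chain instead of the original request.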
