Need a use case example for stream response in ChicagoBoss - chicagoboss

The ChicagoBoss controller API has this:
{stream, Generator::function(), Acc0}
Stream a response to the client using HTTP chunked encoding. For each
chunk, the Generator function is passed an accumulator (initially Acc0)
and should return either {output, Data, Acc1} or done.
I am wondering what the use case for this is. There are other return types such as json and output. When would stream be useful?
Can someone present a real-world use case?

Serving large files for download might be the most straightforward use case.
You could argue that there are also other ways to serve files so that users can download them, but these might have other disadvantages:
By streaming the file, you don't have to read the entire file into memory before starting to send the response to the client. For small files, you could just read the content of the file, and return it as {output, BinaryContent, CustomHeader}. But that might become tricky if you want to serve large files like disk images.
People often suggest serving downloadable files as static files (e.g. here). However, these downloads bypass all controllers, which might be an issue if you want things like download counters or access restrictions. Caching might be an issue, too.
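To make the generator contract concrete, here is a minimal sketch of streaming a file chunk by chunk. It is written in Python rather than Erlang, so it is an analogy to the {stream, Generator, Acc0} shape and not ChicagoBoss code. The names make_file_generator, drive_stream and CHUNK_SIZE are hypothetical, and the accumulator is assumed to be the byte offset of the next chunk.

```python
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your workload


def make_file_generator(path, chunk_size=CHUNK_SIZE):
    """Return a function that mirrors the {output, Data, Acc1} | done contract."""

    def generator(offset):
        # The accumulator is the byte offset of the next chunk to send,
        # so only one chunk is ever held in memory.
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read(chunk_size)
        if not data:
            return "done"
        return ("output", data, offset + len(data))

    return generator


def drive_stream(generator, acc0):
    """Stand-in for the framework: call the generator until it reports done."""
    acc = acc0
    while True:
        result = generator(acc)
        if result == "done":
            return
        _tag, data, acc = result
        yield data


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 200_000)
    try:
        chunks = list(drive_stream(make_file_generator(tmp.name), 0))
        print(len(chunks), sum(len(c) for c in chunks))
    finally:
        os.remove(tmp.name)
```

Because the request still passes through a controller, this is also the place where a download counter or an access check could run before the first chunk is sent.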

Related

Zoomify .zif format bad performance

The new .zif single-file format provided by Zoomify Pro seems to have some performance issues. Compared to the old file structure, it loads the page 3 to 4 times slower and sends over 50% more requests (tested with the same initial image in multiple file formats).
Using the old format is not feasible for our product, and we are stuck with over a minute of load time.
Has anyone encountered this issue, and are there any workarounds? The results on the internet and the official site don't seem to be of any help.
NOTE: Contacting the vendor hasn't led to anything yet.
Although the official site claims the ZIF format can handle very large images, I'm skeptical about it because the viewer tries to do everything in JavaScript. The performance is entirely based on the client's machine. Try opening it on a faster machine and see if it improves.
Alternative solution: You could create Deep Zoom Image tiles by using the VIPS library.
More information here:
https://libvips.github.io/libvips/API/current/Making-image-pyramids.md.html
Scroll further down in the article and you'll see this snippet:
With 7.40 and later, you can use --container to set the container
type. Normally dzsave will write a tree of directories, but with
--container zip you'll get a zip file instead. Use .zip as the directory suffix to turn on zip format automatically:
$ vips dzsave wtc.tif mypyr.zip
to write a zipfile containing the tiles.
Also, check out this tutorial:
Serve deepzoom images from a zip archive with openseadragon
https://web.archive.org/web/20170310042401/https://literarymachin.es/deepzoom-osd-server/
The community (openseadragon and vips) is much stronger over there, so you'll get help when you hit a wall.
If you want to take a break from all of this and just want the images zoomable, you could use a third-party service such as zoomable.ca or zoomo.ca. It's free and user friendly (upload your image and embed the viewer in your site, like Google Maps).
ZIF format designer here... ZIF can easily handle monstrous images, up to hundreds of terabytes in size.
Without a server, of course the viewer tries to do everything, it's the only option. As a result, serving ZIF directly from a webserver will not be as performant as using an image server. But... you can DO it. Using Zoomify tile folders, speed will be faster, but you may have hundreds of thousands or millions of tiles to deal with at the server side, and transfers will be horrendously slow and error-prone.
There are always trade-offs. See zif.photo for the specification.

PDF uploading malicious content vulnerability with Rails

I am implementing pdf upload using Carrierwave with Rails 4. I was asked by the client about malicious content, e.g. if someone attempts to upload a malicious file masked as a pdf. I will be restricting filetype on the frontend to 'application/pdf'. Is there anything else I need to worry about, assuming the uploaded file has a .pdf extension?
File uploads are often a security issue, since there are so many ways to get them wrong. Regarding just the issue of masking a malicious file as a PDF, checking the content type (application/pdf) is good, but not enough, since it's controlled by the client and can be modified.
Filtering on the .pdf extension is definitely advisable, but make sure you don't accept files like virus.pdf.exe.
Other filename attack techniques exist, e.g. involving null or control characters.
Consider using a file type detector to determine that the file is really a PDF document.
But that's just for restricting the file type. There are many other issues you need to be aware of when accepting file uploads.
PDF files can contain malicious code and are a common attack vector.
Make sure uploaded files are written to an appropriate directory on the server. If they aren't meant to be publicly accessible, choose a directory outside of the web root.
Restrict the maximum upload file size.
This is not a complete list by any means. Check out the Unrestricted File Upload vulnerability by OWASP for more info.
In addition to @StefanOS's great answer, PDF files are required to start with the string:
%PDF-[VERSION]
Often, the first few bytes (or more) indicate the file type, especially for executables (e.g., Windows executables, called PE files, should start, if memory serves, with "MZ").
For uploaded PDF files, opening the uploaded file and reading the first 5 bytes should always yield %PDF-.
This might be a good enough verification for most use cases.
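The header check described above could be sketched as follows. This is in Python rather than Ruby, and looks_like_pdf and PDF_MAGIC are hypothetical names; the function assumes it is handed a seekable binary stream for the uploaded file. Passing this check only shows that the file claims to be a PDF, not that its contents are safe.

```python
import io

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(stream):
    """Return True if the stream starts with the PDF header bytes."""
    head = stream.read(len(PDF_MAGIC))
    # Rewind so later code can read the upload from the start
    # (assumes the stream is seekable).
    stream.seek(0)
    return head == PDF_MAGIC


if __name__ == "__main__":
    print(looks_like_pdf(io.BytesIO(b"%PDF-1.7\n...")))
    print(looks_like_pdf(io.BytesIO(b"MZ\x90\x00 renamed to .pdf")))
```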

MSStream - what's the point?

Bear with me on this one please.
When setting the responseType of a WinJS.xhr request, I can set it to, among other things, 'ms-stream' or 'blob'. I was hoping to leverage the stream concept when downloading a file in such a way that I don't have to keep the whole response in memory (video files can be huge).
However, all I can do with an 'ms-stream' object is read it with an MSStreamReader. This would be great if I could tell it to 'consume 1024 bytes from the stream' and loop until the stream is exhausted. However, from reading the docs (I haven't tried this, so correct me if I'm wrong), it appears I can only read from the stream once (e.g. with the readAsBlob method) and I can't set the start position. This means I need to read the whole response into memory as a blob, which I can achieve with responseType set to 'blob' in the first place. So what is the point of MSStream anyway?
Well, it turns out that the method msDetachStream gives access to the underlying stream and doesn't interrupt the download process. I initially thought that any data that was not yet downloaded was lost when calling this, since the docs mention that the MSStream object is closed.
I wrote a blog post a while back to help answer questions about MSStream and other oddball object types that you encounter in WinRT and the host for JavaScript apps. See http://www.kraigbrockschmidt.com/2013/03/22/msstream-blob-objects-html5/. Yes, you can use MSStreamReader for some work (it's a synchronous API), but you can also pass an MSStream to URL.createObjectURL to assign it to an img.src and so forth.
With MSStream, here's some of what I wrote: "MSStream is technically an extension of this HTML5 File API that provides interop with WinRT. When you get MSStream (or Blob) objects from some HTML5 API (like an XmlHttpRequest with responseType of “ms-stream,” as you’d use when downloading a file or video, or from the canvas’ msToBlob method), you can pass those results to various WinRT APIs that accept IInputStream or IRandomAccessStream as input. To use the canvas example, the msRandomAccessStream in a blob from msToBlob can be fed into APIs in Windows.Graphics.Imaging for transform or transcoding. A video stream can be similarly worked with using the APIs in Windows.Media.Transcoding. You might also just want to write the contents of a stream to a StorageFile (that isn’t necessarily on the file system) or copy them to a buffer for encryption."
So MSStreamReader isn't the end-all. The real use of MSStream is to pass the object into WinRT APIs that accept the aforementioned interface types, which opens many possibilities.
Admittedly, this is an under-documented area, which is exactly why I wrote my series of posts under the title, Q&A on Files, Streams, Buffers, and Blobs (the initial post is on http://www.kraigbrockschmidt.com/2013/03/18/why-doesnt-storagefile-close-method/).

Is the chunking option required with plupload and asp.net MVC?

I have seen various posts where developers have opted for the chunking option to upload files, particularly large files.
It seems that if one uses the chunking option, the files are uploaded and progressively saved to disk. Is this correct? If so, it seems there needs to be a secondary operation to process the files.
If the config is set to allow large files, should plupload work without chunking up to the allowed file size for multiple files?
It seems that if one uses the chunking option, the files are uploaded
and progressively saved to disk, is this correct ?
If you mean "automatically saved to disk", as far as I know, that is not correct. Your MVC controller will have to handle as many requests as there are chunks, append each chunk to a temp file, then rename the file after handling the last chunk.
It is handled this way in the upload.php example of plupload.
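The append-then-rename flow described above could look roughly like this. It is a Python sketch, not ASP.NET MVC or plupload code: handle_chunk and its parameters are hypothetical, and it assumes each request carries the file name, a zero-based chunk index and the total number of chunks, with chunks arriving in order.

```python
import os
import tempfile


def handle_chunk(upload_dir, name, chunk_index, total_chunks, data):
    """Append one chunk to a .part file; rename it once the last chunk arrives.

    Returns the final path when the upload is complete, otherwise None.
    """
    safe_name = os.path.basename(name)  # drop any client-supplied directory parts
    part_path = os.path.join(upload_dir, safe_name + ".part")
    final_path = os.path.join(upload_dir, safe_name)

    # The first chunk starts a fresh file; later chunks are appended in order.
    mode = "wb" if chunk_index == 0 else "ab"
    with open(part_path, mode) as f:
        f.write(data)

    if chunk_index == total_chunks - 1:
        os.replace(part_path, final_path)
        return final_path
    return None


if __name__ == "__main__":
    target_dir = tempfile.mkdtemp()
    for i, piece in enumerate([b"first-", b"second-", b"third"]):
        result = handle_chunk(target_dir, "movie.bin", i, 3, piece)
    print(result)
```

Any further processing of the file would be triggered only when handle_chunk returns a path, that is, after the last chunk has been written.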
if so it seems there needs to be a secondary operation to process the
files.
I'm not sure I understand this (perhaps you weren't meaning "automatically saved to disk")
If the config is set to allow large files, should plupload work
without chunking up to the allowed file size for multiple files ?
The answer is yes... and no. It should work, then fail with some combinations of browsers and plupload runtimes when the size gets to around 100 MB. People also seem to encounter problems setting up the config.
I handle small files (~15MB) and do not have to use chunking.
I would say that if you are to handle large files, chunking is the way to go.

Watch video in the time they are uploaded

Is it possible to implement a feature that allows users to watch videos as they are being uploaded to the server by others? Is HTML5 suitable for this task, or Flash? Are there any ready-to-go solutions? I don't want to reinvent the wheel. The application will be hosted on a dedicated server.
Thanks.
Of course it is possible; the data is there, isn't it?
However it will be very hard to implement.
Also I am not so into python and I am not aware of a library or service suiting your requirements, but I can cover the basics of video streaming.
I assume you are talking about video files that are uploaded and not streams. Because, for that, there are obviously thousands of solutions out there...
In the most simple case the video being uploaded is already ready to be served to your clients and has a so called "faststart atom". They are container format specific and there are sometimes a bunch of them. The most common is the moov-atom. It contains a lot of data and is very complex, however in our use case, in a nutshell, it holds the data that enables the client to begin playing the video right away using the data available from the beginning.
You need that if you have progressive download videos (youtube...), meaning where a file is served from a Webserver. You obviously have not downloaded the full file and the player already can start playing.
If the faststart atom were not present, that would not be possible.
Sometimes it still is, but the player cannot, for example, display a progress bar, because it doesn't know how long the file is.
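For MP4/QuickTime files, "faststart" in practice means that the moov box sits before the mdat (media data) box. A minimal sketch of checking that order by walking the top-level boxes is shown below; top_level_boxes and has_faststart are hypothetical helper names, and the code assumes a seekable binary stream and does no validation beyond box sizes.

```python
import io
import struct


def top_level_boxes(stream):
    """Yield (box_type, size) for each top-level box in an MP4/QuickTime stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of data (or a box header that is not fully written yet)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = stream.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        if size != 0 and size < header_len:
            return  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), size
        if size == 0:
            return  # a size of 0 means the box runs to the end of the file
        stream.seek(size - header_len, io.SEEK_CUR)


def has_faststart(stream):
    """Return True if the moov box appears before the mdat box."""
    for box_type, _size in top_level_boxes(stream):
        if box_type == "moov":
            return True
        if box_type == "mdat":
            return False
    return False
```

Running has_faststart on the beginning of an upload would tell you early whether viewers can start playback before the rest of the file arrives.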
With that covered, the file can be uploaded. You will need an upload solution that writes the data directly to a buffer or a file (a file will be easier).
This is almost always the case; for example, PHP creates a file in the tmp_dir. You can also specify that location yourself if you need to find the video while it's being uploaded.
Now you can start reading that file byte by byte and write that data to a connection to another client. Just be sure not to go ahead of what has already been received and written. You would probably initiate your upload with a metadata set in memory that holds the currently received byte position and the location of the file.
Anyone who requests the file after the upload has started can just receive the entire file or, if the upload is not yet finished, get it from your application.
You will have to throttle the data delivery or pause it when the data runs short. This will appear to the client almost as a "slow connection". However, you will have to echo some data from time to time to prevent the connection from closing. But if your upload doesn't stall (and why should it?), that shouldn't be a problem.
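The read-behind-the-upload loop could be sketched like this. follow_upload is a hypothetical helper, and received_bytes and is_complete are assumed callbacks backed by the in-memory upload metadata mentioned above; keep-alive output during long stalls is left out.

```python
import time


def follow_upload(path, received_bytes, is_complete,
                  chunk_size=64 * 1024, poll_interval=0.5):
    """Yield chunks of a file that is still being written, never reading
    past the byte position reported by received_bytes().
    """
    sent = 0
    with open(path, "rb") as f:
        while True:
            available = received_bytes() - sent
            if available > 0:
                data = f.read(min(chunk_size, available))
                if data:
                    sent += len(data)
                    yield data
                    continue
            # Nothing new to send: stop if the upload has finished and we have
            # delivered everything, otherwise pause and poll again.
            if is_complete() and received_bytes() <= sent:
                return
            time.sleep(poll_interval)
```

A web handler would iterate over follow_upload(...) and write each chunk to the viewer's connection; to the viewer, the pauses look like a slow connection.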
Now, if you want something like on-the-fly transcoding of various input formats into your desired output format, things get interesting.
AFAIK ffmpeg has neat APIs which let you deal directly with data streams.
HandBrake is also a very good tool; however, you would need to take the long road of using external executables.
I am not really aware of your requirements; however, if your clients are already tuned in, for example on a Red5 streaming server, feeding data into a stream should also work fine.
Yes, take a look at Qik, http://qik.com/
"Instant Video Sharing ... Videos can be viewed live (right as they are being recorded) or anytime later."
Qik provides developer APIs, including ones like these:
qik.stream.subscribe_public_recent -- Subscribe to the videos (live and recorded)
qik.user.following -- Provides the list of people the user is following
qik.stream.public_info -- Get public information for a specific video
It is most certainly possible to do this, but it won't be trivial. And no, I don't think that you will find an "out of the box" solution that will require little effort on your behalf.
You say you want to let:
users watch videos as they are uploaded to server by others
Well, this could be interpreted two different ways:
Do you mean that you don't want a user to have to refresh the page before seeing new videos that other users have just finished uploading?
Or do you mean that you want one user to be able to watch a partially uploaded video (aka another user is still in the process of uploading it and right now the server only contains a partial upload of the video)?
Implementing #1 wouldn't be hard at all. You would just need an AJAX script to check for newly uploaded videos, and those videos could then be served to the user in whatever way you choose. HTML5 vs. Flash isn't really a consideration here.
The second scenario, on the other hand, would require quite a bit of effort. I am guessing that HTML5 might not be mature enough to handle this type of situation. If you are not looking to reinvent the wheel and don't have a lot of time to dedicate to this feature, then I would say that you are out of luck. You may be able to use ffmpeg to parse partial video files and feed them to a Flash player, but I would think of this as a large task.
