Rails, given an AWS S3 URL, how to use a controller to send_data / file to the requestor - ruby-on-rails

Given an Amazon S3 URL, or any URL that points directly to a file: in my controller, I want to send the user that file, whatever it is, without redirecting.
Is this possible?

If I understand your question correctly, I don't think that's possible from your end. That's why many sites say "right click to save" or something along those lines. Some sites even have links to videos that say "click to download", but when I click the link they start streaming. That behavior is due to MY settings (i.e. the settings on the user's client). You can't control that.
If what you're trying to do is HIDE the location of a file...
Send files back to the user - Usually, static files can be retrieved by using the direct URL and circumventing your Rails application. In some situations, however, it can be useful to hide the true location of files, particularly if you're sending something of value (e-books, for example). It may be essential to only send files to logged in users too. send_file makes it possible. It sends files in 4096 byte chunks, so even large files can be sent without slowing the system down.
From an old blog post
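A minimal sketch of that idea, assuming a hypothetical DownloadsController and an Attachment model that knows the file's local path (none of these names come from the original post):
class DownloadsController < ApplicationController
  before_action :authenticate_user!   # e.g. Devise; only logged-in users get the file

  def show
    attachment = Attachment.find(params[:id])
    # send_file streams the file from disk without ever exposing its real location
    send_file attachment.local_path,
              filename: attachment.file_name,
              type: attachment.content_type,
              disposition: "attachment"
  end
end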

Related

rails controller download from aws s3

I am trying to build a really easy way for my users to download audio content from aws via my website. Here is the flow:
I give the user a download link. Ex: www.mysite.com/foobar
User clicks on the link.
In my rails controller, I create an expiring aws s3 url and automatically start downloading the audio content from that url.
User's browser should ask the user whether or not to save the file. In the event the user accepts, I want a callback to my Rails app to log that the user actually downloaded the file.
So, from a user's perspective, I want the process to be as simple as going to a url I determine, and accepting to download the file when prompted.
In the background, I want to keep the aws s3 url hidden from the user and I want to have the flexibility to write callback logic after the user accepts the download.
What is the recommended way to achieve this?
The best way to solve this is to create an S3 URL with a very short (10 minute?) lifetime and return a redirect to it. This does expose the S3 URL to the user, but isn't a vulnerability.
If you want to hide the S3 URL, you will need to proxy the download through your servers, which is expensive and consumes a worker process for long periods of time. I do not recommend this, but it is the only way to conceal the S3 resource.
Additionally, if triggering a download vs. a view is important, you need to set the Content-Disposition header to trigger an attachment download:
Content-Disposition: attachment; filename="fname.ext"
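As a sketch with the aws-sdk-s3 gem, you can bake that Content-Disposition into a short-lived presigned URL and simply redirect to it; the Track and DownloadLog models, bucket name, and column names below are invented for illustration:
class DownloadsController < ApplicationController
  def show
    track = Track.find(params[:id])
    DownloadLog.create!(user: current_user, track: track)   # log before handing out the URL

    object = Aws::S3::Resource.new.bucket("my-audio-bucket").object(track.s3_key)
    url = object.presigned_url(
      :get,
      expires_in: 600,                                       # 10 minutes
      response_content_disposition: %(attachment; filename="#{track.file_name}")
    )

    redirect_to url, allow_other_host: true                  # allow_other_host is needed on Rails 7+
  end
end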

Rails implementation for securing S3 documents

I would like to protect my S3 documents behind my Rails app such that if I go to:
www.myapp.com/attachment/5 that should authenticate the user prior to displaying/downloading the document.
I have read similar questions on Stack Overflow but I'm not sure I've seen any good conclusions.
From what I have read there are several things you can do to "protect" your S3 documents.
1) Obfuscate the URL. I have done this. I think this is a good thing to do so no one can guess the URL. For example, it would be easy to "walk" the URLs if your S3 URLs are obvious: https://s3.amazonaws.com/myapp.com/attachments/1/document.doc. Having a URL such as:
https://s3.amazonaws.com/myapp.com/7ca/6ab/c9d/db2/727/f14/document.doc seems much better.
This is great to do but doesn't resolve the issue of passing around URLs via email or websites.
2) Use an expiring URL as shown here: Rails 3, paperclip + S3 - Howto Store for an Instance and Protect Access
For me, however, this is not a great solution because the URL is exposed (even if just for a short period of time) and another user could conceivably reuse it within that window. You have to tune the lifetime to allow for the download without leaving enough time for copying. It just seems like the wrong solution.
3) Proxy the document download via the app. At first I tried to just use send_file: http://www.therailsway.com/2009/2/22/file-downloads-done-right but the problem is that send_file only works with static/local files on your server, not files served from another site (S3/AWS). I can, however, use send_data to load the document into my app and immediately serve it to the user. The problem with this solution is obvious - twice the bandwidth and twice the time (to load the document into my app and then send it back to the user).
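A minimal sketch of that proxying approach with the aws-sdk-s3 gem; the Document model and its columns are invented for illustration:
class AttachmentsController < ApplicationController
  before_action :authenticate_user!

  def show
    document = Document.find(params[:id])
    object = Aws::S3::Client.new.get_object(bucket: document.s3_bucket, key: document.s3_key)

    # The whole file is pulled into the app and immediately streamed back out,
    # which is exactly the double-bandwidth cost described above.
    send_data object.body.read,
              filename: document.file_name,
              type: object.content_type,
              disposition: "attachment"
  end
end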
I'm looking for a solution that provides the full security of #3 but does not require the additional bandwidth and time for loading. It looks like Basecamp is "protecting" documents behind their app (via authentication) and I assume other sites are doing something similar but I don't think they are using my #3 solution.
Suggestions would be greatly appreciated.
UPDATE:
I went with a 4th solution:
4) Use amazon bucket policies to control access to the files based on referrer:
http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingBucketPolicies.html
UPDATE AGAIN:
Well, #4 can easily be worked around via a browser's developer tools. So I'm still in search of a solid solution.
You'd want to do two things:
Make the bucket and all objects inside it private. The naming convention doesn't actually matter; the simpler the better.
Generate signed URLs, and redirect to them from your application. This way, your app can check if the user is authenticated and authorized, and then generate a new signed URL and redirect them to it using a 301 HTTP status code. This means that the file never passes through your servers, so there's no load or bandwidth cost on your side. Here are the docs for presigning a GET_OBJECT request:
https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Presigner.html
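A rough sketch of that flow with the v3 presigner (bucket and key are placeholders, and authentication is assumed to have happened earlier in the action):
url = Aws::S3::Presigner.new.presigned_url(
  :get_object,
  bucket:     "my-private-bucket",
  key:        attachment.s3_key,   # however your app looks the key up
  expires_in: 300                  # 5 minutes is plenty when you redirect immediately
)
redirect_to url, allow_other_host: true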
I would vote for number 3; it is the only truly secure approach. Once you hand out an S3 URL, it remains valid until its expiration time, and a crafty user could exploit that window. The only question is: will that affect your application?
Perhaps you could set the expiry time lower, which would minimise the risk?
Take a look at an excerpt from this post:
Accessing private objects from a browser
All private objects are accessible via an authenticated GET request to the S3 servers. You can generate an authenticated url for an object like this:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
By default authenticated urls expire 5 minutes after they were generated. Expiration options can be specified either with an absolute time since the epoch with the :expires option, or with a number of seconds relative to now with the :expires_in option.
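For reference, the expiring variants with that old aws-s3 gem looked roughly like this (reconstructed from memory of the gem's README, so treat it as a sketch):
# Expire 60 seconds from now
S3Object.url_for('beluga_baby.jpg', 'marcel_molina', :expires_in => 60)

# Or expire at an absolute time, given as seconds since the epoch
S3Object.url_for('beluga_baby.jpg', 'marcel_molina', :expires => Time.now.to_i + 60)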
I have been in the process of trying to do something similar for quite some time now. If you don't want to use the bandwidth twice, then the only way this is possible is to let S3 do it. Now I am totally with you about the exposed URL. Were you able to come up with any alternative?
I found something that might be useful in this regard - http://docs.aws.amazon.com/AmazonS3/latest/dev/AuthUsingTempFederationTokenRuby.html
Once a user logs in, an AWS session with his IP as part of the AWS policy should be created, and then this can be used to generate the signed URLs. So in case somebody else grabs the URL, the signature will not match since the source of the request will be a different IP. Let me know if this makes sense and is secure enough.
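A rough sketch of that idea with the AWS SDK for Ruby: mint temporary credentials whose policy is pinned to the requester's IP, then presign with those credentials. The bucket, key, and helper name are placeholders, and this is untested glue rather than a vetted security control:
require "aws-sdk-s3"   # aws-sdk-core (a dependency) provides Aws::STS::Client

def ip_locked_download_url(request_ip, key)
  policy = {
    "Version"   => "2012-10-17",
    "Statement" => [{
      "Effect"    => "Allow",
      "Action"    => ["s3:GetObject"],
      "Resource"  => ["arn:aws:s3:::my-private-bucket/*"],
      "Condition" => { "IpAddress" => { "aws:SourceIp" => request_ip } }
    }]
  }.to_json

  token = Aws::STS::Client.new.get_federation_token(
    name: "download-session",
    policy: policy,
    duration_seconds: 900
  )

  credentials = Aws::Credentials.new(
    token.credentials.access_key_id,
    token.credentials.secret_access_key,
    token.credentials.session_token
  )

  # A URL signed with these credentials is only honored for requests coming from request_ip
  Aws::S3::Presigner.new(client: Aws::S3::Client.new(credentials: credentials))
                    .presigned_url(:get_object, bucket: "my-private-bucket", key: key, expires_in: 600)
end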

Can we find out when a Paperclip download is complete?

I have an application where I need to know when a user's Rails/Paperclip file download is complete. My app is set up to interact with Amazon S3 and I need to run a javascript function when the user has received the completed file.
How can I do this?
Tracking whether or not the download completes is hard, especially in JavaScript. There are a few blurred lines in your question which make me think it's not possible.
First, send_file sets a special header telling the web server what to send. See the send_file docs. Rails doesn't actually send the file at all; it sets this header, which tells the web server to send the file, then returns immediately and moves on to serve another request. To be able to track whether the download completes, you'd have to occupy your Rails application process sending the file and block until the user downloads it, instead of leaving that to the web server (which is what it's designed to do). This is super inefficient.
Next, how can you still be on a page to execute a JavaScript function if you are downloading a file? Your user clicks the file download link and is taken to wherever the file is, whether that be a send_file from Rails or a redirect to S3 or whatever; they are no longer on the page they came from. If you are thinking about the way Chrome or Firefox works, where the download goes into a download manager and the user stays on the page, there's no more interaction with the server from the old page! If you want that page to be notified of download completion, then you'd need a periodic check or long poll to the server to see if the download is done.
I think you'd be better served by redirecting to the S3 file and setting a session variable that records where you want the user to go after the download, so that the next time they visit any page they are back in your planned flow.
Hope this helps!
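A small sketch of that suggestion: stash the next step in the session, redirect to S3, and resume the flow on the user's next request (model, method, and path names are invented):
def show
  file = UserFile.find(params[:id])
  session[:after_download_path] = next_steps_path   # where the flow should resume later

  redirect_to file.presigned_s3_url, allow_other_host: true
end

# In a before_action on later pages, pick the flow back up:
def resume_after_download
  if (path = session.delete(:after_download_path))
    redirect_to path
  end
end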

Intercept request for file download and send to Rails app?

I would like to make a Rails app to create a pretty download page for any file requested either through a link or by typing the url of the file. Is there a way to intercept the request for a file in Apache or elsewhere and send it to the app so it can generate the page?
I'd also prefer not to change the url when redirecting to the app, but it doesn't really matter either way.
So to wrap up, turn this: http://files.spherecat1.com/stuff.txt, into this:
http://files.spherecat1.com/download-page-mockup.png
(Image for illustrative purposes only and may not accurately depict final product. It's fun to add disclaimers to everything.)
Do you mean an entire Rails app just for that? Seems like overkill - you can use Sinatra, a single route, and their send_file function with much greater ease. It also deploys on Apache with Passenger.
If you have control over this thing, and you only plan to offer downloads in this way and not in any other, you should probably store the files in some non-public directory and instead allow your Rails/Sinatra app to access and push the files to the user. (Again, with send_file, that's easy.) Why put a file in your web root if your users will never access it that way?
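For instance, a single-route Sinatra app along those lines might look like this (the downloads directory and route are arbitrary):
require "sinatra"

# Keep the files outside the public web root so they can only be reached through this route
FILES_DIR = File.expand_path("downloads", __dir__)

get "/files/:name" do
  path = File.join(FILES_DIR, File.basename(params[:name]))   # basename blocks ../ traversal
  halt 404, "Not found" unless File.file?(path)

  send_file path, disposition: "attachment"
end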

mod_xsendfile alternatives for a shared hosting service without it

I'm trying to log download statistics for .pdfs and .zips (5-25MB) in a rails app that I'm currently developing and I just hit a brick wall; I found out our shared hosting provider doesn't support mod_xsendfile. The sources I've read state that without this, multiple downloads could potentially cause a DoS issue—something I'm definitely trying to avoid. I'm wondering if there are any alternatives to this method of serving files through rails?
Well, how sensitive are the files you're storing?
If you hosted these files somewhere under your app's /public directory, you could just do a meta tag or JavaScript redirect to the public-facing URL of those files after your users hit some sort of controller action that updates your download statistics.
In this case, your users would probably need to get one of those "Your download should commence in a few moments" pages before the browser would start the file download.
Under this scenario, your Rails application won't be streaming the file out; your web server will, which gives you the same effect as xsendfile. On the other hand, this won't work very well if you need to control access to those downloadable files.
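A bare-bones sketch of that pattern, shown here with a plain redirect (a meta-refresh interstitial works the same way for the statistics); the Download model and /files path are placeholders:
class DownloadsController < ApplicationController
  def show
    download = Download.find(params[:id])
    download.increment!(:hits)            # record the statistic first

    # Anything under public/ is served by the web server itself, so no Rails
    # worker is tied up streaming the 5-25MB file.
    redirect_to "/files/#{download.filename}"
  end
end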
