Alternative to X-sendfile in Apache for sending file given a URL? - ruby-on-rails

I'm writing a Rails application that serves files stored on a remote server to the end user.
In my case the files are stored on S3, but the user requests the file via the Rails application (hiding the actual URL). If the file were on my server's local file system, I could use the Apache header X-Sendfile to free up the Ruby process for other requests while Apache took over the task of sending the file to the client. But in my case, where the file is not on the local file system but on S3, it seems that I'm forced to download it temporarily inside Rails before sending it to the client.
Isn't there a way for Apache to serve a "remote" file to the client that is not actually on the server itself? I don't mind if Apache has to download the file for this to work, as long as I don't have to tie up the Ruby process while it's going on.
Thomas, I have similar requirements/issues and I think I can answer your problem. First (and I'm not 100% sure you care about this part), hiding the S3 URL is quite easy, as Amazon allows you to point CNAMEs to your bucket and use a custom URL instead of the Amazon URL. To do that, you need to point your DNS to the correct Amazon URL. When I set mine up it was similar to this: points to Then you need to create the bucket with the name of your custom URL ( in this example). How to call that URL will differ depending on which gem you use, but a word of warning: the attachment_fu plugin I was using was incorrectly sending me to I couldn't find the setting to fix it, so a simple .sub call in the S3 portion of the plugin fixed it.
On to your other questions, to execute some rails code (like recording the hit in the db) before downloading you can simply do this:
def download
  file = File.find(...
  # code to record 'hit' to database
  redirect_to AWS::S3::S3Object.url_for(file.filename,
                                        :expires_in => 3.hours)
end
That code will still cause the file to be served by S3, but it gives you the ability to run some Ruby first. (Of course the above code won't work as is: you will need to point it to the correct file and bucket, and my Amazon keys are saved in a config file.) The above also uses the syntax of the AWS::S3 gem.
Second, the Content-Disposition: attachment issue is a bit trickier. Hopefully your situation is a bit simpler than mine and the following solution can work. Assuming the object 'file' (in this example) is the correct S3 object, you can set the disposition to attachment with
file.content_disposition = "attachment"
The above code can be executed after the file already exists on S3 (unlike some other headers and permissions), which is nice, and it can also be set when you upload the file (the syntax depends on your plugin). I'm still trying to find a way to tell S3 to send a file as an attachment only when requested (not every time); if you find one, please let me know your solution. I need to be able to sometimes download a file and other times embed it (an image, for example) in HTML. I'm not using the above-mentioned redirect, but fortunately it seems that if you embed a file that has the Content-Disposition: attachment header (in an HTML image tag, for example), the browser still displays the image normally (though I haven't thoroughly tested that across enough browsers to release it into the wild).
What is the recommended approach to parse a CSV file stored in S3?

I am using the aws-sdk gem to read a CSV file stored in AWS S3.
Referencing the AWS doc. So far I have:['AWS_BUCKET_NAME']).object(s3_key).get({ response_target: "#{Rails.root}/tmp/items.csv" })
In Pry, this returns:
output error: #<IOError: closed stream>
However, navigating to tmp/, I can see the items.csv file and it contains the right content. I am not certain whether the return value is an actual error.
My second concern: is it fine to store temporary files in "#{Rails.root}/tmp/"?
Or should I consider another approach?
I can load the file in memory and then CSV.parse. Will this have implications if the CSV file is huge?
I'm not sure how to synchronously return a file object using the aws gem.
But I can offer some advice on the other topics you mentioned.
First of all, /tmp: I've found that saving files here is a workable approach. On AWS, I've used this directory to create a local LRU cache for S3-stored images. The key thing is to anticipate the situation where the file has been automatically deleted; the file needs to be refetched if this happens. By the way, Heroku has a 'read-only filesystem' but still permits you to write into /tmp.
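The "refetch if the file has disappeared" idea can be sketched roughly as follows. This is only a minimal illustration, not the answerer's actual code: `cached_file` is a hypothetical helper name, and the block stands in for whatever call actually downloads the object from S3.

```ruby
require 'fileutils'

# Hypothetical helper: returns the path of a locally cached copy of `key`.
# If the file is missing (never fetched, or cleaned out of /tmp), the block
# is called to fetch the content again and the file is re-created.
def cached_file(cache_dir, key)
  path = File.join(cache_dir, key)
  unless File.exist?(path)
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, yield) # the block is assumed to return the file body
  end
  path
end
```

A caller would use it as `cached_file('/tmp/s3-cache', 'items.csv') { download_from_s3 }`, where `download_from_s3` is a placeholder for the real fetch. Eviction of old entries (the "LRU" part) is left out of this sketch.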
The second part is the question of synchronously returning a file object.
While it may be possible to do this using the S3 gem, I've had success fetching it over HTTP using something like open-uri or mechanize. If it's not supposed to be a publicly available asset, you can change the permissions on S3 to restrict access to your server.
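On the question of huge CSV files: once the file is on disk, one option is to read it row by row instead of loading it all and calling CSV.parse. A minimal sketch using Ruby's standard csv library follows; `summarize_csv` and the column name passed to it are made up for illustration.

```ruby
require 'csv'

# Counts the rows and sums one numeric column while reading the file one
# row at a time, so the whole file is never held in memory at once.
def summarize_csv(path, column)
  rows = 0
  total = 0
  CSV.foreach(path, headers: true) do |row|
    rows += 1
    total += row[column].to_i
  end
  { rows: rows, total: total }
end
```

For example, `summarize_csv("#{Rails.root}/tmp/items.csv", 'quantity')` would walk the downloaded file, assuming it has a header row with a `quantity` column.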

Upload file directly to S3 without need to use forms in Rails

For my Rails application, I download a bunch of files from a remote URL to my application. I would like to directly upload them to Amazon S3, without needing a form to do the upload, since I will temporarily cache the file I downloaded on the EC2 instance.
I would also like to retain the links to the files I uploaded so I can download them later.
I am essentially reposting the files I downloaded.
I looked around, but most of the solutions seem to involve a user uploading to S3 through a form.
Is there a direct upload solution?
You can upload directly to S3 using the AWS SDK for Ruby. The easiest way is:
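require 'aws-sdk'
s3 = Aws::S3::Resource.new(region: 'us-west-2')
obj = s3.bucket('bucket-name').object('key')
obj.upload_file('/path/to/source/file')  # placeholder path to the local file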
Or you can find a couple other options here.
You can simply use EvaporateJS to achieve this. You can also send an Ajax request to save the file name to the database after each file upload. Although the JavaScript exposes a few details, your bucket is not left open to attack, because S3 lets you restrict access with a bucket policy.
Just change <AllowedOrigin>*</AllowedOrigin> to <AllowedOrigin></AllowedOrigin> in production mode.

rails as proxy for remote file download

I have a Rails application on e.g. . I am using a cloud storage provider for all kinds of files (videos, images, ...).
Now I would like to make them available for download without exposing the URL of the actual storage location.
So I was thinking of a kind of proxy. A simple controller which could look like this :
data = open(params[:file])
filename = "#{RAILS_ROOT}/tmp/my_temp_file"
File.open(filename, 'wb') do |f|
  f.write data.read  # copies the whole remote file to local disk first
end
send_file filename, ...options...
( code taken from a link ).
The point is that I would have to download the file first.
So I was wondering if it would be possible to stream the file right away without downloading from the cloud storage first.
I was working on this exact issue a while ago and came to the conclusion that this would not be possible without having to download the file to your server and then pass it on to the client as you say.
I'd recommend generating a signed, expiring download link that you insert into a hidden iframe whenever a user clicks a download link on your page. In this way they will get the experience of downloading from your page, without the file making an unnecessary roundtrip to your server.
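To illustrate what a "signed, expiring link" means in general, here is a minimal sketch using an HMAC from Ruby's standard library. It is not the signing scheme of S3 or any other provider (in practice the provider's SDK generates these URLs for you); the helper names, the `expires`/`signature` parameter names and the secret are all assumptions made for the example.

```ruby
require 'openssl'
require 'uri'

# Hypothetical helper: builds a URL that is only valid until `expires_at`
# (a Unix timestamp). The signature covers both the path and the expiry,
# so neither can be changed without knowing the secret.
def signed_url(base_url, path, secret, expires_at)
  signature = OpenSSL::HMAC.hexdigest('SHA256', secret, "#{path}\n#{expires_at}")
  query = URI.encode_www_form(expires: expires_at, signature: signature)
  "#{base_url}#{path}?#{query}"
end

# Hypothetical check on the serving side: true only when the link has not
# expired and the signature matches the one computed from the secret.
def valid_signature?(path, secret, expires_at, signature, now = Time.now.to_i)
  return false if now > expires_at.to_i
  expected = OpenSSL::HMAC.hexdigest('SHA256', secret, "#{path}\n#{expires_at}")
  OpenSSL.secure_compare(expected, signature)
end
```

The Rails action would only need to compute such a URL and hand it to the browser; the file itself then travels directly from the storage service to the client.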

Recommendations for file server to be used with Rails application

I'm working on a Rails app that accepts file uploads and where users can modify these files later. For example, they can change the text file contents or perform basic manipulations on images such as resizing, cropping, rotating etc.
At the moment the files are stored on the same server where Apache is running with Passenger to serve all application requests.
I need to move user files to a dedicated server to distribute the load on my setup. At the moment our users upload around 10GB of files a week, which is not a huge amount but eventually it adds up.
So I'm going through the different options for implementing the communication between the application server(s) and a file server. I'd like to start out with a simple and fool-proof solution. If it scales well later across multiple file servers, I'd be more than happy.
Here are some of the options I've been investigating:
Amazon S3. I find it a bit difficult to implement for my application. It adds the complexity of "uploading" the uploaded file again (possibly multiple times later); please mind that users can modify files and images with my app. Other than that, it would be a nice "set it and forget it" solution.
Some sort of simple RPC server that lives on the file server and transparently manages files as seen from the application server side. I haven't been able to find any standard and well-tested tools here yet, so this is still a bit theoretical in my mind. However, the Bert and Ernie built and used at GitHub seem interesting, but maybe too complex just to start out with.
MogileFS also seems interesting. Haven't seen it in use (but that's my problem :).
So I'm looking for different (and possibly standards-based) approaches to how file servers for web applications are implemented and how they have been working in the wild.
Use S3. It is inexpensive, a-la-carte, and if people start downloading their files, your server won't have to get stressed because your download pages can point directly to the S3 URL of the uploaded file.
"Pedro" has a nice sample application that works with S3 at
Clone the application ( git clone git:// )
Make sure that you have the right_aws gem installed.
Put your Amazon S3 credentials (API & secret) into config/s3.yml
Install the Firefox S3 plugin (
Go into Firefox S3 plugin and put in your api & secret.
Use the S3 plugin to create a bucket with a unique name, perhaps 'your-paperclip-demo'.
Edit app/models/user.rb, and put your bucket name on the second last line (:bucket => 'your-paperclip-demo').
Fire up your server locally and upload some files to your local app. You'll see from the S3 plugin that the file was uploaded to Amazon S3 in your new bucket.
I'm usually terribly incompetent or unlucky at getting these kinds of things working, but with Pedro's little S3 upload application I was successful. Good luck.
You could also try to compile a version of Dropbox (they provide the source) and ln -s that to your public/system directory so Paperclip saves to it. This way you can access the files remotely from any desktop as well... I haven't done this yet so I can't attest to how easy/hard/valuable it is, but it's on my teux deux list... :)
I think S3 is your best bet. With a plugin like Paperclip it's really very easy to add to a Rails application, and not having to worry about scaling it will save on headaches.

How do I copy files between buckets using s3 from a rails application?

I am currently developing a Rails application that tries to copy/move videos from one bucket to another in S3. However, I keep getting a 502 proxy error in my Rails application. The Mongrel log says "failed to allocate memory." Once this error occurs the application dies and we must restart it.
Seems like your code is reading the entire resource into memory, and that exhausts your application's memory. A naïve way to do this (and from your description, you're doing something like this already) would be to download the file and upload it again; if you go that route, download it to a local file rather than into memory. However, Amazon's engineers have thought ahead and provide APIs that deal with this specific case as well.
If you're using something like the RightAWS gem, you can use its S3Interface like so:
# With s3 being an S3 object acquired via
# Copies key1 from bucket b1 to key1_copy in bucket b2:
s3.copy('b1', 'key1', 'b2', 'key1_copy')
And if you're using the naked S3 HTTP interface, see amazon's object copy docs for a solution that uses only HTTP to copy one object from one bucket to another.
Try to stream files instead of loading the whole file into memory and then working with it.
For example, if you're using the aws-s3 gem, do not use:
data = open(file).read
S3Object.store file_name, data, BUCKET
Use the following instead:
S3Object.store file_name, open(file), BUCKET
Not sure how exactly to "stream-download" the file, though.
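The general idea behind streaming, on either the download or the upload side, is to move the data one chunk at a time. A minimal, library-independent sketch follows; `copy_in_chunks` is a made-up name, and the StringIO objects merely stand in for a real source (such as an HTTP response body) and a real destination (such as a local file).

```ruby
require 'stringio'

# Copies everything from `source` to `destination`, holding at most
# `chunk_size` bytes in memory at a time. Returns the number of bytes copied.
def copy_in_chunks(source, destination, chunk_size = 64 * 1024)
  copied = 0
  while (chunk = source.read(chunk_size)) # read returns nil at end of input
    destination.write(chunk)
    copied += chunk.bytesize
  end
  copied
end

# Stand-ins for a remote stream and a local file:
source = StringIO.new('x' * 200_000)
destination = StringIO.new
copy_in_chunks(source, destination)
```

Ruby's built-in `IO.copy_stream(source, destination)` does the same job for IO-like objects and is usually the simpler choice.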
boto works well. See this thread. Using boto, you copy the objects straight from one bucket to another, rather than downloading them to the local machine and then uploading them to another bucket.
You can copy bucket to bucket directly using the fog gem.
s3 =
s3.copy_object('source-bucket', 'source/path', 'dest-bucket', 'dest/path')
