Using Paperclip to direct upload files to S3 - ruby-on-rails

I've got Paperclip set up with Uploadify to upload files to S3. My setup sends files directly to S3, and when the upload finishes I post the results to my web server.
All I get back is the file name and size. Am I supposed to build my own processor or before_post_process method to "download" the file from S3 in order to process it? Or am I missing something, and Uploadify should have given me a stream containing the file once it finished posting to S3?
How do you go about direct uploads to S3 and then notifying your Paperclip-backed model? Do you have to pull the files from the server and post-process them yourself, or will Paperclip handle all of that?

Here are a couple blog posts describing how to do it...
http://www.railstoolkit.com/posts/uploading-files-directly-to-amazon-s3-using-fancyupload
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip
They use FancyUploader (which uses MooTools/Flash) to upload directly to S3, bypassing Heroku and its dreaded 30-second request timeout altogether. They then use DelayedJob to queue post-processing tasks such as thumbnailing, with Paperclip doing the actual processing of the files.
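The flow described above (browser uploads to S3, the app is notified, post-processing is queued) might look roughly like the sketch below. It is plain Ruby with no Rails: `UploadNotice`, `JOBS`, `handle_upload_notice`, and `work_off_jobs` are hypothetical names, the `uploads/` prefix is an assumption, and the array only stands in for a real queue such as DelayedJob.

```ruby
# Hypothetical stand-ins: a plain array plays the role of the job queue,
# and a Struct plays the role of the Paperclip-backed model.
UploadNotice = Struct.new(:key, :size, :processed)

JOBS = []

# Called when the browser reports that the direct upload to S3 finished.
# Only the file name and size arrive; the bytes themselves stay on S3.
def handle_upload_notice(params)
  name = params.fetch("name")
  size = Integer(params.fetch("size"))
  notice = UploadNotice.new("uploads/#{name}", size, false)
  JOBS << notice # defer thumbnailing etc. so the request returns quickly
  notice
end

# What a background worker would do later: fetch from S3, process, save.
# Here the "processing" step is reduced to flipping a flag.
def work_off_jobs
  until JOBS.empty?
    job = JOBS.shift
    job.processed = true
  end
end

notice = handle_upload_notice("name" => "photo.jpg", "size" => "20480")
work_off_jobs
puts "#{notice.key} processed=#{notice.processed}"
```

The point of the split is that the web request only records what was uploaded; anything slow happens outside the request, which is what keeps it clear of the timeout.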
If I can get this working with CarrierWave, I will post up a project on GitHub to share (in a week or so once I get time)
Update:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
I will add the post-processing example once I have time.

You can either create a processor or use the callback methods, but the file will definitely be on your server before going to S3.
If you are in a callback method, for example, you can access it using something like:
self.file.to_file
Once processing and uploading are done, the file will be deleted from your server. You don't need to do anything to notify or post-process; Paperclip will handle it.

Related

Uploading to s3, using s3 servers

Does anyone have any sample code (preferably in Rails) that uploads to S3 using S3's servers?
Again, I mean uploading directly to S3, where the actual upload/streaming is also performed on Amazon's servers.
Requirements:
Plupload, jQuery
Idea:
Authorize Upload via your app (sign it on server-side)
Use the signed request to upload the file to S3
Notify your app that the upload is done
Check whether S3 has received the file
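The first step in the list above (signing the upload on the server side) could be sketched as follows. This assumes the legacy HMAC-SHA1 policy signature (AWS Signature Version 2) for browser-based POST uploads; newer setups would normally use Signature Version 4 instead. `sign_s3_post` is a hypothetical helper, and the bucket name, key prefix, and secret are placeholders, not values from the gist.

```ruby
require "base64"
require "json"
require "openssl"

# Builds the policy and signature form fields for a browser-based POST
# upload, using the legacy HMAC-SHA1 scheme (AWS Signature Version 2).
def sign_s3_post(bucket:, key_prefix:, secret_key:, expires_at:, max_bytes:)
  policy_doc = {
    # S3 expects an ISO 8601 UTC timestamp for the expiration.
    "expiration" => expires_at.getutc.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
    "conditions" => [
      { "bucket" => bucket },
      ["starts-with", "$key", key_prefix],
      ["content-length-range", 0, max_bytes]
    ]
  }
  # The Base64-encoded policy is what gets signed, not the raw JSON.
  policy = Base64.strict_encode64(JSON.generate(policy_doc))
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha1"), secret_key, policy)
  { policy: policy, signature: Base64.strict_encode64(digest) }
end

fields = sign_s3_post(
  bucket: "example-bucket",      # placeholder
  key_prefix: "uploads/",        # placeholder
  secret_key: "EXAMPLE-SECRET",  # placeholder, never send this to the browser
  expires_at: Time.now + 3600,
  max_bytes: 50 * 1024 * 1024
)
puts fields[:policy]
puts fields[:signature]
```

The uploader then posts these two values to S3 along with the access key ID, the object key, and the file itself. Only the policy and signature leave the server; the secret key stays there.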
I posted the code as a gist at https://gist.github.com/759939. It lacks comments, and you might run into some issues due to missing methods (I had to rip it out of our codebase).
stored_file.rb contains a model for your DB. It has many of Paperclip's helper methods inlined (which we used before we switched to direct upload to S3).
I hope you can use it as a sample to get your stuff running.
If you are using Rails 3, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
To simply copy files, this is easy to use:
Smart Copy Script into S3
Amazon wrote a Ruby library for the S3 REST API. I haven't used it yet.
http://amazon.rubyforge.org/

Uploading & Unzipping files to S3 through Rails hosted on Heroku?

I'd like to be able to upload a zip file containing a number of images to my Rails application. Then I'd like Rails to unzip that file and attach the images inside to my Photo model via Paperclip, so that they are ultimately stored on my Amazon S3 account (configured through Paperclip).
I'd like to do this all on my Rails site hosted on Heroku, which unfortunately doesn't allow local storage of any kind (so far as I'm aware) for temporarily unzipping the file before the Paperclip processing.
How would I do this?
I would recommend uploading directly to S3, which bypasses Heroku entirely, so you're not restricted by the 30-second request timeout they enforce (which drops your upload once that time is hit) or the 1 GB /tmp directory limit. After the file is uploaded, you can make a POST to your Rails app with the file's name and location and then do your unzipping operation. If you'd like to use Paperclip for post-processing, I have included a link below. If you end up uploading directly to S3, which offloads the work from your Rails server, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
Here is the link for the Paperclip post processing for an example like images:
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip
dmagkic is correct about the rails_root/tmp. I recommend something like the following:
Upload the files through Heroku to S3
Set up a background job to zip the files (store the file names that you need to group)
Run the background job, which downloads the files from S3, zips them, sends the zip to S3, and removes the unzipped files.
That way your application will still be responsive-ish during the upload process.
If you try to upload multiple files, you COULD write to /tmp, but just make sure that all the files come across in the same POST request.
Heroku does allow writing to #{RAILS_ROOT}/tmp.
But you need to keep in mind that the file will only be there as long as the request lasts. It may survive longer, but that is not guaranteed. You could try to block the request while you unzip and send to S3, but you should watch how long that takes.
It sounds to me like you need a Flash uploader that can unzip and send to S3 without going through Heroku.
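If you do write to a temporary directory within a single request, as discussed above, one way to make sure nothing is expected to outlive the request is a scratch directory that is removed when the block ends. This is only a sketch using Ruby's standard `Dir.mktmpdir`; `with_scratch_files` is a hypothetical helper, and the hash of names to contents stands in for the entries you would extract from the uploaded zip.

```ruby
require "tmpdir"

# Writes the given files into a fresh scratch directory, yields their
# paths, and removes the directory when the block returns.
# `base` defaults to the system temp dir; on Heroku you might pass the
# app's tmp directory instead (an assumption, adjust to your setup).
def with_scratch_files(files, base = Dir.tmpdir)
  Dir.mktmpdir("unzip", base) do |dir|
    paths = files.map do |name, data|
      # basename drops any directory parts that came with the entry name
      path = File.join(dir, File.basename(name))
      File.binwrite(path, data)
      path
    end
    yield paths
  end
end

sizes = with_scratch_files("a.jpg" => "aaaa", "b.jpg" => "bb") do |paths|
  # This is where each image would be attached via Paperclip / sent to S3.
  paths.map { |path| File.size(path) }
end
p sizes
```

Everything that needs the files has to happen inside the block, which matches the constraint that the files are only guaranteed to exist for the duration of the request.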

Rails: how does background file upload work?

Uploading a large file within a request/response cycle is not a nice experience for the user, because the application seems to hang during the upload. Even more critical is that the user can abort the upload and then needs to restart the whole process later.
How can I do the upload process in the background?
There are some examples of running background tasks in Rails on railscasts.com, but it's not clear to me how to integrate a background job with a file upload.
Elsewhere I have read that this needs some web server tuning. Does that mean I need to ask my shared host for technical support?
If you are using Rails 3, please check out my sample projects which allow you to upload directly to S3 and offload the work from the app. Then you can just use delayed job to do secondary operations:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
By the way, you can do post-processing with Paperclip & delayed_job using something like this blog post describes:
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip

Large file download for a Rails project

One client project will go online in two months. One of the changed requirements is to support worldwide downloads of large files for their customers (10 to 15 MB per RAW camera file, with 1000 to 5000 file downloads expected per day). The process will be:
there is an upload screen that uses Paperclip to store files in the local Rails public folder
an hourly task uploads them to web storage (S3?)
the download URL is updated from the Paperclip URL to the web storage URL
Questions:
Is there a gem/plug-in for this purpose?
If not, is there a gem/plug-in for S3 you would recommend?
Questions about the storage provider:
Is S3 recommended?
Or is there another service you would recommend?
The bottom line is: the client's web server does not and will not have the bandwidth to handle the downloads.
Thanks
I don't think there is anything that will do all of this out of the box for you. Paperclip will push files synchronously to S3 on upload, so you will need to make this asynchronous yourself.
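As a rough illustration of "making it asynchronous yourself", the sketch below uses a worker thread and Ruby's built-in `Queue` as a stand-in for a real job system such as delayed_job. `push_to_storage` and `start_uploader` are hypothetical names, the bucket URL is a placeholder, and the "upload" only computes the resulting URL rather than talking to S3.

```ruby
require "thread"

# Hypothetical placeholder for the real S3 upload call; it only returns
# the URL the file would end up at.
def push_to_storage(path)
  "https://example-bucket.s3.amazonaws.com/#{File.basename(path)}"
end

# Starts a worker that drains the queue until it sees the :done marker,
# recording the new download URL for each local path.
def start_uploader(queue, results)
  Thread.new do
    while (path = queue.pop) != :done
      results[path] = push_to_storage(path)
    end
  end
end

queue = Queue.new
results = {}
worker = start_uploader(queue, results)

# The request handler only enqueues the local path and returns.
queue << "public/system/raw/IMG_0001.CR2"
queue << "public/system/raw/IMG_0002.CR2"
queue << :done
worker.join

results.each { |path, url| puts "#{path} -> #{url}" }
```

In a real app the queue would need to survive restarts, which is what a database-backed job system gives you; the in-process thread here is only meant to show the shape of the hand-off.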
S3 is rock-solid, I have used it in production on a number of projects. Totally recommended.
You can upload files directly to S3 which may help by reducing the double handling of the file (no longer need to upload to your app before pushing to Amazon):
http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=139&externalID=1434
The aws-s3 and delayed_job gems are probably what you want.
gem install aws-s3
S3 is popular and widely used as far as I am aware.
If you end up going the route of uploading directly to S3 which offloads the work from your Rails server and makes it asynchronous, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload

Rails upload to s3 performance issue

I'm building an app to store files on my s3 account. I use Rails 3.0.0beta
A lot of files can be uploaded at the same time, and each upload is quite costly from a performance point of view, so my app will be busy handling uploads all the time!
Maybe a solution is to upload directly to s3, but I still need a submit to my app, at least to store the file's name.
I'm wondering what is the best solution?
Execute the time-consuming operation asynchronously in the background with a solution like delayed_job.
If you are using Rails 3, please check out my sample projects which allow you to upload directly to S3 and offload the work from the app. Then you can just use delayed job to do secondary operations:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
