Paperclip: migrating from file system storage to Amazon S3 - ruby-on-rails

I have a RoR website where users can upload photos. I use the Paperclip gem to upload the photos and store them on the server as files. I am planning to move to Amazon S3 for storing the photos, and I need to move all my existing photos from the server to Amazon S3. Can someone tell me the best way to move the photos? Thanks!

You'll want to log into your AWS Console and create a bucket structure to hold your images. Neither S3 nor Paperclip has any tools in the way of bulk migrations from the file system to S3, so you'll need to use the s3cmd tool for that. In particular, you're interested in the s3cmd sync command, something along the lines of:
s3cmd sync ./public/system/images/ s3://imagesbucket
If you have any image URLs hard-coded into your database (e.g. in markdown or template code) this might be a little tricky. One option would be to manually update your URLs to point to the new bucket. Alternatively, you can use rack-rewrite.
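Once the objects are in the bucket, Paperclip itself also needs to be pointed at S3 so new uploads and generated URLs go there. A minimal sketch of that model configuration, where the Photo model, the config/s3.yml credentials file, and the :path pattern are assumptions you'd adapt to your app:
# app/models/photo.rb -- hypothetical model; adjust names to your app
class Photo < ActiveRecord::Base
  has_attached_file :image,
    :storage => :s3,
    :s3_credentials => "#{Rails.root}/config/s3.yml",  # assumed credentials file
    :bucket => "imagesbucket",                         # same bucket as the s3cmd example
    :path => "/:id/:style/:filename"                   # match the key layout the sync produced
end
The :path needs to match however the keys ended up laid out in the bucket after the sync; check a few objects in the AWS console to confirm before switching production over.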

You can easily do this by creating a bucket on Amazon S3 that has the same folder structure as the public directory in your Rails app.
So say, for instance, you create a new bucket on Amazon S3 called MyBucket and it has a folder in it called images. You'd just move all of the images in your Rails app's images folder over to that new bucket's images folder.
Then you can set up your app to use an asset host like this answer describes: is it good to use S3 for Rails "public/images" and there an easy way to do it?
If you are using image_tag or other tag helpers (javascripts, stylesheets, etc.), they will use that asset_host in production and generate URLs pointing to your S3 bucket.
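For reference, the asset host from the linked answer is a single setting in your production environment config; something like this, where the host below is an assumed example:
# config/environments/production.rb
config.action_controller.asset_host = "https://MyBucket.s3.amazonaws.com"  # assumed bucket URL
With that in place, image_tag("rails.png") would generate a URL like https://MyBucket.s3.amazonaws.com/images/rails.png in production.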

I found this script, which takes care of moving the images to an Amazon S3 bucket using a rake task.
https://gist.github.com/924617
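In case that link goes stale: the idea behind such a rake task is to re-assign each attachment from its old on-disk location so Paperclip re-stores it with the new S3 settings. A rough sketch, where the Photo model, its image attachment, and the old path pattern are all assumptions:
# lib/tasks/paperclip_s3.rake -- hypothetical sketch; adapt model and paths
namespace :paperclip do
  desc "Push existing filesystem attachments to S3"
  task :migrate_to_s3 => :environment do
    Photo.find_each do |photo|
      # Assumed old filesystem layout; use whatever :path your app actually had.
      old_path = Rails.root.join("public", "system", "images",
                                 photo.id.to_s, "original", photo.image_file_name)
      next unless File.exist?(old_path)
      File.open(old_path) do |file|
        photo.image = file   # re-assigning makes Paperclip store the file to S3
        photo.save!
      end
    end
  end
end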

Related

Exporting existing uploaded images in Refinery to Amazon S3

I have an existing Rails RefineryCMS application which has been running for quite some time. It has a lot of image and document uploads, which have always been uploaded to the local filesystem.
But we are moving to Heroku, and this will be a problem, since Heroku doesn't persist these files.
So we need to get all the existing images and documents exported to Amazon S3.
How could we achieve this?
Would it be plain simple as just copying over the existing files from the current production environment to the S3 bucket?
Kind regards
The plain solution was just to copy the existing folders generated by Dragonfly's file uploads in "app/public" over to Amazon S3.

Transferring some (not all) files between S3 buckets using Paperclip

I have a Heroku-hosted app that uses Paperclip to store User photos on Amazon S3
I want to move some (not all) files to a new bucket based on some internal logic (the app is multi-tenant and I'm separating AWS file storage and my Postgres DB into separate tenants/schemas)
I have 2 options I'm considering:
Option 1 - Use the AWS Cli to move files directly between buckets
This option is AWS-native, but it has the drawback of having to worry about the entire folder structure for each file (thumbnails, etc.). Moving a file involves moving all the various styles of the file (original, medium size, thumbnail, etc.), so it's not as straightforward as copying one file over.
It also copies everything over to the new bucket with the exact same folder/id structure, which I'd like to avoid since the User's corresponding DB info (e.g. id) will change when I migrate them over in the Postgres DB.
Option 2 - Use paperclip to pull down each file locally and re-upload it
This is an attractive option because it lets paperclip handle all the work.
However, paperclip uses the bucket name to construct the URL of the file. I need it to pull from 1 bucket and push to another bucket. Is there a way to set the bucket name individually for each transaction?
Paperclip uses the bucket name to construct the URL of a remote file, but the names of those directories and files don't depend on the bucket name. If your files or directories contain the bucket name, you are doing it wrong and should start by fixing that.
Do the following:
Sync your public/system directory with oldbucket using the aws s3 sync OLD_BUCKET_URL public/system command
Perform the changes in the directories and files locally with a Ruby script (a rough sketch follows these steps)
Sync (upload) your public/system directory with newbucket using the aws s3 sync public/system NEW_BUCKET_URL command.
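For step 2, the local rewrite can be plain Ruby file moves; a hypothetical sketch, where the images directory layout and the old-to-new id mapping are assumptions you'd replace with your own tenant-migration data:
# Hypothetical sketch: remap old record ids to new ids locally before uploading.
require "fileutils"

OLD_TO_NEW_ID = { 101 => 5001, 102 => 5002 }   # built from your Postgres migration

base = File.join("public", "system", "images")
OLD_TO_NEW_ID.each do |old_id, new_id|
  old_dir = File.join(base, old_id.to_s)
  new_dir = File.join(base, new_id.to_s)
  FileUtils.mv(old_dir, new_dir) if File.directory?(old_dir)
end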

Rake task creates CSV and then upload to S3. How?

I have a rake task that creates a CSV file. Right now I am storing this file in my /tmp folder (I am using Heroku). This CSV file is not associated with any model; it's just some data I pull from several APIs, combined with some data from my models.
I would like to download this file from Heroku, but this doesn't seem possible. So my question is: which gem should I be looking at in order to upload that file to Amazon S3? I have seen gems like Paperclip, but those seem to be associated with a model, and that is not my case. I just want to upload the CSV file that I will have in /tmp into my Amazon S3 bucket.
Thanks
You can use the aws-s3 gem:
S3Object.store('filename_in_s3.txt', open("source_file.tmp"), 'bucket_name')
You should define the exact path of your tmp file, for example:
open("#{Rails.root}/tmp/source_file.tmp")
CarrierWave can interface your Ruby application with S3 via the Fog library. Rather than operating at the model level, CarrierWave uses an Uploader class where you can pull together data from across your APIs and data sources, which is precisely what you're trying to accomplish.
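A minimal sketch of that approach, where the bucket, credentials, and uploader name are assumptions:
# config/initializers/carrierwave.rb -- assumed credentials via ENV
CarrierWave.configure do |config|
  config.fog_credentials = {
    :provider              => "AWS",
    :aws_access_key_id     => ENV["AWS_ACCESS_KEY_ID"],
    :aws_secret_access_key => ENV["AWS_SECRET_ACCESS_KEY"]
  }
  config.fog_directory = "bucket_name"
end

# app/uploaders/csv_uploader.rb
class CsvUploader < CarrierWave::Uploader::Base
  storage :fog
  def store_dir
    "exports"
  end
end

# In the rake task, no model is needed:
# CsvUploader.new.store!(File.open("#{Rails.root}/tmp/report.csv"))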

Uploading & Unzipping files to S3 through Rails hosted on Heroku?

I'd like to be able to upload a zip file to my Rails application that contains a number of images. Then I'd like Rails to unzip that file and attach the images inside it to my Photo model via Paperclip, so that they are ultimately stored in my Amazon S3 account (configured through Paperclip).
I'd like to do this all on my Rails site hosted on Heroku, which unfortunately doesn't allow local storage of any kind (so far as I'm aware) for temporarily doing the unzipping before the Paperclip parsing.
How would I do this?
I would recommend uploading directly to S3, which bypasses Heroku entirely, so you're not restricted to the 30-second request timeout they enforce (which drops your uploads once that limit is hit) or the 1 GB /tmp directory limit. After the file is uploaded, you can make a POST to your Rails app with the file's name and location and then do your unzipping operation. If you'd like to use Paperclip for post-processing, I have attached a link below. If you end up going the route of uploading directly to S3, which offloads the work from your Rails server, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
Here is the link for the Paperclip post processing for an example like images:
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip
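For the unzip-and-attach step itself, a rough sketch using the rubyzip gem (1.x API), where the Photo model with an image attachment and the tmp paths are assumptions:
# Sketch: extract a zip from tmp and attach each image with Paperclip.
require "zip"
require "fileutils"

zip_path    = "#{Rails.root}/tmp/photos.zip"
extract_dir = "#{Rails.root}/tmp/photos"
FileUtils.mkdir_p(extract_dir)

Zip::File.open(zip_path) do |zip|
  zip.each do |entry|
    next unless entry.name =~ /\.(jpe?g|png|gif)\z/i
    target = File.join(extract_dir, File.basename(entry.name))
    entry.extract(target)
    File.open(target) { |file| Photo.create!(:image => file) }  # Paperclip pushes to S3
  end
end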
dmagkic is correct about the rails_root/tmp. I recommend something like the following:
Upload files through heroku to S3
Setup a background job to zip the files (store the file names that you need to group)
Run the background job, which downloads the files from S3, zips them, sends the zip to S3, and removes the unzipped files.
That way your application will still stay reasonably responsive during the upload process.
If you try to upload multiple files, you COULD write to /tmp, but just make sure that all the files come across in the same post request.
Heroku does allow writing to #{RAILS_ROOT}/tmp.
But keep in mind that the file will only be there as long as the request lasts (probably longer, but that is not guaranteed). You could try to block the request while you unzip and send to S3, but you should watch how long that takes.
It sounds to me like you need some flash uploader that can unzip and send to S3, without Heroku.

I want to use the fleximage gem and s3 for storage, but don't want dev/qa/test envs to use s3

I have a rails app that I'm going to host on engineyard and want to store image files on s3.
But I don't know if I want all developer machines to be using S3 for storage of all our test and dev images. Maybe it's not an issue, but it seems like a waste to have everyone storing all our images in S3.
I've heard of some people who store images on S3 'hacking' their dev environments to store images locally on the file system, and then using S3 in production only.
What are other people doing?
You could write a wrapper class around the gem that uses the file system for image storage instead of S3 for non-production environments. Then your application would use the wrapper rather than the gem directly. Or make the image store the wrapper uses a configuration option.
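A rough sketch of that idea, with made-up class and method names purely for illustration:
# Hypothetical wrapper; the real S3 call would go through fleximage or your gem of choice.
class ImageStore
  def self.save(name, io)
    if Rails.env.production?
      # delegate to the gem's S3 storage here
    else
      path = Rails.root.join("public", "uploads", name)
      File.open(path, "wb") { |f| f.write(io.read) }
    end
  end
end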
I use why's old Camping app, parkplace. It behaves like (most of?) the S3 API, but runs locally. The new location for it is here: http://github.com/technoweenie/parkplace
