Where do I set cache information for my images? - ruby-on-rails

This is about a Rails app on Heroku that runs behind CloudFront and serves ActiveStorage images from the Bucketeer add-on.
Cache configuration in both the Rails app itself and CloudFront is right on target for CSS, JS, and even key, important requests (like search results, third-party info fetched from APIs, etc.).
What I can't figure out how to cache are the images that come from the Bucketeer add-on.
Right now the images seem to come from the Bucketeer bucket every time. They show up with no Cache TTL.
I'd like for them to be cached for up to a year both at the CloudFront level and the visitor's browser level.
Is this possible?
It seems like the Bucketeer add-on itself gives us no control over how the bucket and/or the service handles caching.
Where can I force these files to show up with caching instructions?

Thanks for sharing your findings here.
Additionally, I found that S3Service accepts upload options:
https://github.com/rails/rails/blob/6-0-stable/activestorage/lib/active_storage/service/s3_service.rb#L12
So you can add the following to your storage.yml:
s3:
  service: S3
  access_key_id: ID
  secret_access_key: KEY
  region: REGION
  bucket: BUCKET
  upload:
    cache_control: 'public, max-age=31536000'
For a full list of available options, refer to the AWS SDK documentation.
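Note that these upload options only apply to objects written after the change; anything already in the bucket keeps its old metadata. A quick sanity check in the Rails console (a sketch, assuming a model with has_one_attached :avatar; the attachment name is illustrative):

# New uploads now carry Cache-Control: public, max-age=31536000.
# Objects already in the bucket keep their old metadata until re-uploaded
# (see the AWS CLI approach in the next answer for those).
user.avatar.attach(io: File.open("avatar.png"), filename: "avatar.png")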

After a lot of searching, I learned that Bucketeer does give you control over the bucket. You just have to use the AWS CLI.
Here is the link to AWS docs on CLI:
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
And here is the link where Bucketeer tells you how to get started with that on their service:
https://devcenter.heroku.com/articles/bucketeer#using-with-the-aws-cli
This means you can install the AWS CLI, run aws configure with the credentials Bucketeer provides, and then change cache-control on the objects in the bucket directly.
AWS does not seem to have a feature for setting a cache-control default for an entire bucket or folder, so you actually have to set it on each object.
In my case, all of my files/objects in the bucket are images that I display on the website and need to cache, so it's safe to run a command that does it all at once.
Such a command can be found in this answer:
How to set expires headers to all images in a bucket in Amazon S3
For me, it looked like this:
aws s3 cp s3://my-bucket-name s3://my-bucket-name --recursive --acl public-read --metadata-directive REPLACE --cache-control max-age=43200000
The command basically copies the entire bucket onto itself while adding the cache-control max-age=43200000 header to each object in the process.
This works for all existing files, but it will not change anything for future changes or additions. You'd have to run it again every so often to catch new objects, and/or write code that sets the headers when saving each object to the bucket. Apparently some people have had luck with that. Not me.
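For the "write code" route, here is a rough sketch with the aws-sdk-s3 gem (not something from the original answer): it copies a single object onto itself with replaced metadata, the same trick the CLI command uses. The BUCKETEER_* environment variable names are assumptions from the add-on's docs; check heroku config for yours.

require "aws-sdk-s3"

# Credentials come from the Bucketeer config vars on the Heroku app (names assumed).
s3 = Aws::S3::Client.new(
  region:            ENV["BUCKETEER_AWS_REGION"],
  access_key_id:     ENV["BUCKETEER_AWS_ACCESS_KEY_ID"],
  secret_access_key: ENV["BUCKETEER_AWS_SECRET_ACCESS_KEY"]
)

# Copy the object onto itself, swapping in a long-lived Cache-Control header.
def set_cache_control(s3, bucket, key)
  s3.copy_object(
    bucket:             bucket,
    key:                key,
    copy_source:        "#{bucket}/#{key}",
    metadata_directive: "REPLACE",
    acl:                "public-read",
    cache_control:      "public, max-age=31536000"
  )
end

set_cache_control(s3, ENV["BUCKETEER_BUCKET_NAME"], "some/object/key.jpg")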
Thankfully, I found this post:
https://www.neontsunami.com/posts/caching-variants-with-activestorage
This monkey-patch basically changes ActiveStorage::RepresentationsController#show to use Rails action caching for variants. Take a look. If you're having similar issues, it's worth the read.
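To give a rough idea of the shape of the patch (my paraphrase of the idea, not the post's exact code; it assumes the actionpack-action_caching gem, since action caching no longer ships with Rails itself):

# config/initializers/active_storage_representations_caching.rb
Rails.application.config.to_prepare do
  ActiveStorage::RepresentationsController.class_eval do
    # Cache the variant response so repeat requests don't go back to the bucket.
    caches_action :show, expires_in: 1.year
  end
end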
There are drawbacks. For my case, they were not a problem, so this is the solution I went with.

Related

How to reference and update a file on S3 from Rails 4

I have a Rails 4 application that needs to use a number of Excel files representing rosters (20 or so, grouped by their own individual committee) that have to be read in and editable by the user. Pre-deploy, I had the system working perfectly: these files lived in public/rosters and could be referenced and edited by any authenticated user. Unfortunately, when I deployed to Heroku, I could no longer do this.
I have been using an S3 bucket to host the other files necessary for this and other related apps, and it's been working wonderfully for what I've been using it for, so I decided to try it as a solution to this problem. Unfortunately, it would appear that I can only access the files the way I had been by making them publicly accessible, which is not something I want to do.
So my question is this: what would be the best way to reference these files (ideally authenticating with my access_key_id and secret_access_key) and allow a user to push changes that overwrite the file in the S3 bucket?
You have to use the aws-sdk-ruby gem to write files to S3; it authenticates with your access_key_id and secret_access_key. Check this documentation. Hope this helps. Thanks!
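A minimal sketch of that flow with the aws-sdk-s3 gem (v3); the bucket and key names are made up, and updated_contents stands for whatever the user's edit produces:

require "aws-sdk-s3"

s3 = Aws::S3::Resource.new(
  region:            "us-east-1",
  access_key_id:     ENV["AWS_ACCESS_KEY_ID"],
  secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"]
)

obj = s3.bucket("my-rosters-bucket").object("rosters/committee_a.xlsx")

data = obj.get.body.read        # download the private spreadsheet
# ... let the user edit it, then push the change back:
obj.put(body: updated_contents) # overwrites the object; it stays private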

Upload file directly to S3 without need to use forms in Rails

For my Rails application, I download a bunch of files from a remote URL to my application. I would like to directly upload them to Amazon S3, without needing a form to do the upload, since I will temporarily cache the file I downloaded on the EC2 instance.
I would also like to retain the links to the files I uploaded so I can download them later.
I am essentially reposting the files I downloaded.
I looked around, but most of the solutions seem to involve form-based uploads to S3 by a user.
Is there a direct upload solution?
You can upload directly to S3 using the AWS SDK for Ruby. The easiest way is:
require 'aws-sdk'
s3 = Aws::S3::Resource.new(region:'us-west-2')
obj = s3.bucket('bucket-name').object('key')
obj.upload_file('/path/to/source/file')
Or you can find a couple other options here.
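Since the question also asks to retain links to the uploaded files, the same object can give you a URL back (a sketch; public_url only works if the object is publicly readable):

obj.public_url                              # permanent URL for a public object
obj.presigned_url(:get, expires_in: 3600)   # expiring URL for a private object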
You can simply use EvaporateJS to achieve this. You can also send an AJAX request to update the file name in the database after each upload. Although the JavaScript exposes a few details, your bucket is not vulnerable to abuse, since the S3 service provides bucket policies.
Just change <AllowedOrigin>*</AllowedOrigin> to <AllowedOrigin>specificwebsite.com</AllowedOrigin> in production.

Recommendations for file server to be used with Rails application

I'm working on a Rails app that accepts file uploads and where users can modify these files later. For example, they can change the text file contents or perform basic manipulations on images such as resizing, cropping, rotating etc.
At the moment the files are stored on the same server where Apache is running with Passenger to serve all application requests.
I need to move user files to a dedicated server to distribute the load on my setup. At the moment our users upload around 10GB of files a week, which is not a huge amount, but eventually it adds up.
So I'm going through different options for implementing the communication between the application server(s) and a file server. I'd like to start out with a simple and fool-proof solution. If it scales well later across multiple file servers, I'd be more than happy.
Here are some different options I've been investigating:
Amazon S3. I find it a bit difficult to implement for my application. It adds the complexity of "uploading" the uploaded file again (possibly multiple times later); please mind that users can modify files and images with my app. Other than that, it would be a nice "set it and forget it" solution.
Some sort of simple RPC server that lives on the file server and transparently manages files when seen from the application server side. I haven't been able to find any standard and well-tested tools here yet, so this is a bit more theoretical in my mind. However, the BERT and Ernie stack built and used at GitHub seems interesting, but maybe too complex just to start out.
MogileFS also seems interesting. Haven't seen it in use (but that's my problem :).
So I'm looking for different (and possibly standards-based) approaches to how file servers for web applications are implemented and how they have worked in the wild.
Use S3. It is inexpensive, a-la-carte, and if people start downloading their files, your server won't have to get stressed because your download pages can point directly to the S3 URL of the uploaded file.
"Pedro" has a nice sample application that works with S3 at github.com.
Clone the application ( git clone git://github.com/pedro/paperclip-on-heroku.git )
Make sure that you have the right_aws gem installed.
Put your Amazon S3 credentials (API & secret) into config/s3.yml
Install the Firefox S3 plugin (http://www.s3fox.net/)
Go into Firefox S3 plugin and put in your api & secret.
Use the S3 plugin to create a bucket with a unique name, perhaps 'your-paperclip-demo'.
Edit app/models/user.rb, and put your bucket name on the second last line (:bucket => 'your-paperclip-demo').
Fire up your server locally and upload some files to your local app. You'll see from the S3 plugin that the file was uploaded to Amazon S3 in your new bucket.
I'm usually terribly incompetent or unlucky at getting these kinds of things working, but with Pedro's little S3 upload application I was successful. Good luck.
You could also try compiling a version of Dropbox (they provide the source) and ln -s it to your public/system directory so Paperclip saves into it. That way you can access the files remotely from any desktop as well... I haven't done this yet, so I can't attest to how easy/hard/valuable it is, but it's on my TeuxDeux list... :)
I think S3 is your best bet. With a plugin like Paperclip it's really very easy to add to a Rails application, and not having to worry about scaling it will save on headaches.
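For reference, wiring Paperclip to S3 is roughly this much code (a sketch; the attachment name, bucket, and credentials path are placeholders, not taken from Pedro's demo):

# app/models/user.rb
class User < ActiveRecord::Base
  has_attached_file :avatar,
    storage:        :s3,
    s3_credentials: Rails.root.join("config/s3.yml"),
    bucket:         "your-paperclip-demo"
end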

Making files uploaded to s3 public

I'm using s3-swf-upload-plugin in a Rails project to upload directly to S3. Pretty nifty, but can't seem to figure out how to make the uploaded files public. S3 doesn't seem to have the concept of public "buckets". Any ideas?
S3 supports four different access policies for both buckets and objects.
Take a look at the Canned Access Policies section in the S3 Documentation.
Specifically:
private
public-read
public-read-write
authenticated-read
So in your case, you'll need to set the access policy on your bucket and uploaded files to public-read.
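If you'd rather do it from code than from a browser plugin, the newer aws-sdk-s3 gem can flip the ACL on an object that is already in the bucket (a sketch; bucket and key are placeholders, and the swf plugin may also expose its own ACL option):

require "aws-sdk-s3"

s3 = Aws::S3::Client.new(region: "us-east-1")
s3.put_object_acl(
  bucket: "my-bucket",
  key:    "uploads/photo.jpg",
  acl:    "public-read"
)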
I use S3Fox for Firefox, http://www.s3fox.net/
You can browse your S3 buckets then right-click -> Edit ACL and set things to public.
You can also get the url for the bucket in a similar fashion.
It is very simple to use.

How do I copy files between buckets using s3 from a rails application?

I am currently developing a Rails application that tries to copy/move videos from one bucket to another in S3. However, I keep getting a 502 proxy error in my Rails application. The Mongrel log says "failed to allocate memory." Once this error occurs, the application dies and we must restart it.
It seems like your code is reading the entire resource into memory, and that runs your application out of memory. A naïve way to do this (and from your description, you're doing something like this already) would be to download the file and upload it again: just download it to a local file and not into memory. However, Amazon's engineers have thought ahead and provide APIs that can deal with this specific case as well.
If you're using something like the RightAWS gem, you can use its S3Interface like so:
# With s3 being an S3 object acquired via S3Interface.new
# Copies key1 from bucket b1 to key1_copy in bucket b2:
s3.copy('b1', 'key1', 'b2', 'key1_copy')
And if you're using the naked S3 HTTP interface, see Amazon's object copy docs for a solution that uses only HTTP to copy one object from one bucket to another.
Try to stream files instead of loading the whole file into memory and then working with it.
For example, if you're using the aws-s3 gem, do not use:
data = open(file)
S3Object.store file_name, data, BUCKET
Use the following instead:
S3Object.store file_name, open(file), BUCKET
I'm not sure exactly how to "stream-download" the file, though.
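One way to stream a download to disk rather than into memory is the response_target option in the newer aws-sdk-s3 gem (a sketch, and a different gem from the aws-s3 one above; bucket and key are placeholders):

require "aws-sdk-s3"

s3 = Aws::S3::Client.new(region: "us-east-1")
s3.get_object(
  response_target: "/tmp/video.mp4",   # written to disk in chunks
  bucket:          "source-bucket",
  key:             "videos/video.mp4"
)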
boto works well. See this thread. Using boto, you copy the objects straight from one bucket to another, rather than downloading them to the local machine and then uploading them to another bucket.
You can copy bucket to bucket directly using the fog gem.
s3 = Fog::Storage.new(your_aws_credentials)
s3.copy_object('source-bucket', 'source/path', 'dest-bucket', 'dest/path')
