Automatically upload assets to S3 without assets_sync - ruby-on-rails

I just did my first deploy to Heroku and besides my images, the assets work. I was reading about how to move the assets to s3 (and then cache them with cloudfront) when I found this gist:
https://gist.github.com/schneems/9374188
"I hate asset_sync"
Using asset sync can cause failures, is difficult to debug, un-needed, and adds extra complexity. Don't use it. Instead use https://devcenter.heroku.com/articles/using-amazon-cloudfront-cdn
The problem is, I can't find how to sync assets automatically like the gem does. Whats the best alternative to using the asset_sync gem?

Although an old question, in case anyone finds this question and is hoping for an answer here are my own findings.
Cloudfront has, for some time now, allowed users to set an origin value on their configuration. You want to set that to your application host. If you were deploying to a site accessible via https://myapp.com then you would use that as your Cloudfront origin. Then any cache misses from Cloudfront will route to your application layer at https://myapp.com, appending whatever path info is present in the request (e.g. /assets/css/whatever.css. This means that your application must be able to serve these static assets. If it can, then you're all set. If not, check the Rails guide for how to enable that.
Caveat! You cannot use a non-publicly-accessible URL for the origin. What does that mean? If you are provisioning your own pre-production application instances that are hidden behind a VPC, for example, then you cannot use those instances for your origin. Cloudfront cannot be granted special access to your instances. There's a workaround if you read Cloudfront's documentation on serving private content; basically you make your application publicly-accessible to anyone with the appropriate links but you enforce application-level constraints to disallow access to anyone not using a specially signed URL or cookie.

This is very old, but the answers here are not satisfying. The gist and articles linked do not intend to say the asset_sync gem is bad. The article intends to say you shouldn't be serving your assets from anywhere (S3 or not) but your application. Uploading your assets anywhere runs into the same issues mentioned in the gist, no matter if you're using this particular gem or not.
The suggestion is serving your assets from your application and then use a CDN such as Cloudfront to to ease the load on your server. You do not need to upload anything to Cloudfront. Instead, Cloudfront will ask your application for assets on-demand and cache automatically.
However, Cloudfront (or any CDN) do not address all use cases of asset_sync. If you're running into slug size issues and you've already cleaned repos and unnecessary dependencies, you might still need to use asset_sync to get your slug under 500MB and keep Heroku happy. Of course you can forego asset_sync and use a custom implementation to upload to S3 (or anywhere else).

Related

"Request has expired" when using S3 with Active Storage

I'm using ActiveStorage for the first time.
Everything works fine in development but in production (Heroku) my images disappear without a reason.
They were showing ok the first time, but now no image is displayed. In the console I can see this error:
GET https://XXX.s3.amazonaws.com/variants/Q7MZrLyoKKmQFFwMMw9tQhPW/XXX 403 (Forbidden)
If I try to visit that URL directly I get an XML
<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<X-Amz-Expires>300</X-Amz-Expires>
<Expires>2018-07-24T13:48:25Z</Expires>
<ServerTime>2018-07-24T15:25:37Z</ServerTime>
<RequestId>291D41FAC6708334</RequestId>
<HostId>lEVGuwA6Hvlm/i40PeXaje9SEBYks9+uk6DvBs=</HostId>
</Error>
This is what I have in the view
<div class="cover" style="background-image: url('<%= rails_representation_path(experience.thumbnail) %>')"></div>
This is what I have in the model
def thumbnail
self.cover.variant(resize: "300x300").processed
end
In simple words, I don't want images to expire but to be always there.
Thanks
ActiveStorage does not support non-expiring link. It uses expiring links (private), and support uploading files only as private on your service.
It was a problem for me too, and did 2 patches (caution) for S3 only, one simple ~30lines that override ActiveStorage to work only with non-expiring (public) links, and another that add an acl option to has_one_attached and has_many_attached methods.
Hope it helps.
Your question doesn't say so, but it's common to use a CDN like AWS CloudFront with a Rails app. Especially on Heroku you probably want to conserve compute power.
Here is what happens in that scenario. You render a page as usual, and all the images are requested from the asset host, which is the CDN, because that's how it is configured to integrate. Its setup to fetch anything it doesn't find in cache from origin, which is your application again.
First all image requests are passed through. The ActiveStorage controller creates signed URLs for them, and the CDN passes them on, but also caches them.
Now comes the problem. The signed URL expires in 5 minutes by default, but the CDN caches usually much longer. This is because usually you use digest assets, meaning they are invalidated not by time but by name, on any change.
The solution is simple. Increase the expiry of the signed URL to be longer than the cache's TTL. Now the cache drops the cached signed URL before it becomes invalid.
Set the URL expiry using ActiveStorage::Service.url_expires_in in 5.2 or directly in Rails.application.config.active_storage.service_urls_expire_in in an initializer see this answer for details.
To set cache TTL in CloudFront: open the AWS console, pick the distribution, open the Behavior tab, scroll down to these fields:
Then optionally issue an invalidation to force re-caching of all contents.
Keep in mind there is a security trade-off. If the image contents are private, then they don't belong into a CDN most likely, and shouldn't have long lasting temp URLs either. In that case choose a solution that exempts attachments from CDN altogether. Your application will have to handle the additional load of signing all attached assets' URLs on top of rendering the relevant page.
Further keep in mind, that this isn't necessarily a good solution, but more of a workaround. With the above setup you will cache redirects, and the heavier requests will hit your storage bucket directly. The usual scenario for CDNs is large media, not lightweight redirects. You do relieve the app of handling a lot of requests though. How much that is a valid optimization should be looked into.
I had this same issue, but after I corrected the time on my computer, the problem was resolved. It was a server time difference, that the aws servers did not recognize.
#production.rb
Change
config.active_storage.service = :local
To
config.active_storage.service = :amazon
Should match aws/amazon whatever you defined it as in storage.yml

How do CDN's work with a Rails application?

So I read this and the Rails docs regarding CDNs here and I'm still confused about a couple of things conceptually.
When do the Rails files like the controllers and models come into play?
How do you invalidate the CDN cache when an image updates?
But what about files with dynamic content say like a user's name and address. How is that handled?
This is my understanding. Please correct me if I misspeak:
First, when a request is made say to myrailsapp.com, the request first goes to the CDN because we now pointed the CNAME of myrailsapp.com to the CDN address (say it's cdnmyrailsapp.com). I guess the DNS servers understand to route those requests to the CDN. The CDN checks whether or not it has any content cached. IF it does not, I guess the CDN forwards the request to the actual server? That's when the controller for Rails is hit and a static asset or javascript file is delivered to the CDN. All future requests for that file now use the cached version on the CDN.
CDN can only serve your static assets (compiled css and js files etc), not models and controllers. (you can get this precompiled files by assets:precompile)
Your server serve all dynamic content dyrectly without CDN.
Your files are placed on CDN domain (http://c000000.cdn.rackspacecloud.com for sample), your application left on your domain (you do not need CNAME).
For images you need manually send them (fog-aws, fog gems ) on upload.

PDF caching on heroku with cloudflare

I'm having a problem getting the caching I need to work using CloudFlare.
We use CloudFlare for caching all our assets on S3 which works 100% using a separate subdomain cdn
We also use CloudFlare for our main site (hosted on Heroku) as well, e.g. www
My problem is I can't get CloudFlare to cache PDFs that are generated from our Rails app. I'm using the WickedPDF gem to dynamically generate certain PDFs for invoices, etc. I don't want to upload these as files to say S3 but we would like to have CloudFlare cache these so they don't get generated each and every time, as the time spent generating these PDFs is a little intensive.
CloudFlare is turned on and is "accelerating" for the subdomain in question and we're using SSL, but PDFs never seem to cache properly.
Is there something else we need to do to ensure these get cached? Or maybe there's another solution that would work for Heroku? (eg we can't use Page caching since it relies on the filesystem) I also checked the WickedPDF documentation so see if we could do anything else, but found nothing about expire controls.
Thanks,
We should actually cache it as long as the resources are on-domain & not being delivered through a third-party resource in some way.
Keep in mind:
1. Our caching depends on the number of requests for the resources (at least three).
2. Caching is very much data center dependent (in other words, if your site receives a lot of traffic at a data center it is going to be cached; if your site doesn't get a lot of traffic in another data center it may not cache).
I would open a support ticket if you're still having issues.

best method for serving static pages hosted on s3 in rails app on heroku

I have a rails app which is dynamic and works well on it's own, but we also have an s3 bucket which has a bunch of html pages which are constantly updated and revised.
I'm looking for an overall solution which allows me to route requests to the static files for a large number of potential pages, but also use the app for dynamic pages. None of the static pages require a user login, but all of the dynamic pages require a user login.
We are also currently using heroku to serve the application which is something else to take into consideration.
What are some methods/gems/ideas for how to serve these static pages quickly on the same domain without interfering with the rest of the app?
According to Heroku you have 3 options:
You can use page_caching. page_caching has been removed from rails4 but is still available as an external gem actionpack-page_caching but from what I see, does not work as expected in heroku (files are not persisted).
You can use cache through HTTP headers (which actually caches data on the client)
You can use Rack::Cache as a reverse proxy.
I think the best solution for your matter is the third one. Heroku suggests that you use Rack::Cache. Of course you can combine that with the HTTP headers.
You can find more details here and here

can Amazon be used to offload server of static files for a Ruby on Rails app, but still support the app's authentication & authorization?

Can one of the Amazon services (their S3 data service, or otherwise) be used to offload server of static files for a Ruby on Rails app, but still support the app's authentication & authorization?
That is such that when the user browser downloaded the initial HTML for one page of the Ruby on Rails application, when it went back for static content (e.g. an image or CSS file), that this request would be:
(a) routed directly to the Amazon service (no RoR cycles used to serve it, or bandwidth), BUT
(b) the browser request for this item (e.g. an image) would still have to go through an authentication/authorization layer based on the user model in the Ruby on Rails application - in other words to ensure not just anyone could get the image...
thanks
The answer is a yes with a but. You can use a feature of S3 that allows you to create links to secure S3 objects that has a small time to live, default is 5 minutes. This will work for any S3 object that is uploaded as private. This means that the browser will only have X seconds or whatever to request the file from S3. Example code from docs for the AWS gem:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
You can also specify an expires_in or expires option per file. The bad thing is that you would need to create a helper for your stylesheet, image, and js links to create the proper S3 URLs.
I would recommend that you setup a domain name for your S3 bucket, like "examples3.amazonaws.com" and put all your standard image files and CSS there as public. Then set that as the asset host in your rails config. Then, only use the secure links for static files that really need it.

Resources