best method for serving static pages hosted on s3 in rails app on heroku - ruby-on-rails

I have a rails app which is dynamic and works well on it's own, but we also have an s3 bucket which has a bunch of html pages which are constantly updated and revised.
I'm looking for an overall solution which allows me to route requests to the static files for a large number of potential pages, but also use the app for dynamic pages. None of the static pages require a user login, but all of the dynamic pages require a user login.
We are also currently using heroku to serve the application which is something else to take into consideration.
What are some methods/gems/ideas for how to serve these static pages quickly on the same domain without interfering with the rest of the app?

According to Heroku you have 3 options:
You can use page_caching. page_caching has been removed from rails4 but is still available as an external gem actionpack-page_caching but from what I see, does not work as expected in heroku (files are not persisted).
You can use cache through HTTP headers (which actually caches data on the client)
You can use Rack::Cache as a reverse proxy.
I think the best solution for your matter is the third one. Heroku suggests that you use Rack::Cache. Of course you can combine that with the HTTP headers.
You can find more details here and here

Related

Rails ActiveStorage: how to avoid one redirect for each image?

If you use ActiveStorage and you have a page with N images you get N additional requests to your Rails app (i.e. N redirects). That means wasting a lot of server resources if you have tens of images on a page.
I know that the redirect is useful for signed URLs. However I wonder why Rails does not precompute the final signed URL and embed that into the HTML page... In this way we could keep the advantages of signed URLs / protected files, without making N additional calls to the Rails server.
Is it possible to include the final URL / pre-signed URL of image variants directly in the HTML (thus avoiding the redirect)? Otherwise, why is that impossible?
After days of reasoning and tests, I am really excited of my final solution, which I explain below. This is an opinionated approach to images and may not represent the current Rails Way™️, however it has incredible advantages for websites that serve many public images, in particular:
When you serve a page with N images you don't get 1 + N requests to your app server, instead you get only 1 request for the page
The images are served through a CDN and this improves the loading time
The bucket is not completely public, instead it is protected by Cloudflare
The images are cached by Cloudflare, which greatly reduce your S3 bill
You greatly reduce the number of API requests (i.e. exists) to S3
This solution does not require large changes to Rails, and thus it is straightforward to switch back to Rails default behavior in case of problems
Here's the solution:
Create an s3 bucket and configure it to host a public website (i.e. call it storage.example.com) - you can even disable the public access at bucket level and allow access only to the Cloudflare ips using a bucket policy
Go to Cloudflare and configure a CNAME for storage.example.com that points to your domain; you need to use Flexible SSL (you can use a page rule for the subdomain); use page rules to set heavy caching: set Cache Everything and set a very long value (e.g. 1 year) for Browser Cache TTL and Edge Cache TTL
In you Rails application you can keep using private storage / acl, which is the default Rails behavior
In your Rails application call #post.variant(...).processed after every update or creation of #post; then in your views use 'https://storage.example.com/' + #post.variant(...).key' (note that we don't call processed here in the views to avoid additional checks in s3); you can also have a rake task that calls processed on each object, in case you need to regenerate the variants; this is works perfectly if you have only a few variants (e.g. 1 image / variant per post) that are changed infrequently
Most of the above steps are optional, so you can combine them based on your needs.
You can use the service_url to create direct links to your resources.
We don't use Rails views in our project so my knowledge about the view layer is rusty. I think you could put it in a dedicated helper and then use it from your views.

How do CDN's work with a Rails application?

So I read this and the Rails docs regarding CDNs here and I'm still confused about a couple of things conceptually.
When do the Rails files like the controllers and models come into play?
How do you invalidate the CDN cache when an image updates?
But what about files with dynamic content say like a user's name and address. How is that handled?
This is my understanding. Please correct me if I misspeak:
First, when a request is made say to myrailsapp.com, the request first goes to the CDN because we now pointed the CNAME of myrailsapp.com to the CDN address (say it's cdnmyrailsapp.com). I guess the DNS servers understand to route those requests to the CDN. The CDN checks whether or not it has any content cached. IF it does not, I guess the CDN forwards the request to the actual server? That's when the controller for Rails is hit and a static asset or javascript file is delivered to the CDN. All future requests for that file now use the cached version on the CDN.
CDN can only serve your static assets (compiled css and js files etc), not models and controllers. (you can get this precompiled files by assets:precompile)
Your server serve all dynamic content dyrectly without CDN.
Your files are placed on CDN domain (http://c000000.cdn.rackspacecloud.com for sample), your application left on your domain (you do not need CNAME).
For images you need manually send them (fog-aws, fog gems ) on upload.

PDF caching on heroku with cloudflare

I'm having a problem getting the caching I need to work using CloudFlare.
We use CloudFlare for caching all our assets on S3 which works 100% using a separate subdomain cdn
We also use CloudFlare for our main site (hosted on Heroku) as well, e.g. www
My problem is I can't get CloudFlare to cache PDFs that are generated from our Rails app. I'm using the WickedPDF gem to dynamically generate certain PDFs for invoices, etc. I don't want to upload these as files to say S3 but we would like to have CloudFlare cache these so they don't get generated each and every time, as the time spent generating these PDFs is a little intensive.
CloudFlare is turned on and is "accelerating" for the subdomain in question and we're using SSL, but PDFs never seem to cache properly.
Is there something else we need to do to ensure these get cached? Or maybe there's another solution that would work for Heroku? (eg we can't use Page caching since it relies on the filesystem) I also checked the WickedPDF documentation so see if we could do anything else, but found nothing about expire controls.
Thanks,
We should actually cache it as long as the resources are on-domain & not being delivered through a third-party resource in some way.
Keep in mind:
1. Our caching depends on the number of requests for the resources (at least three).
2. Caching is very much data center dependent (in other words, if your site receives a lot of traffic at a data center it is going to be cached; if your site doesn't get a lot of traffic in another data center it may not cache).
I would open a support ticket if you're still having issues.

Rails and Node in the same app on Heroku?

I'm building a Rails application that deals with file uploads through CarrierWave. Currently, larger file uploads block the server for a significant amount of time. I have seen solutions like the s3-swf-upload-plugin gem that skip the local server and send files straight from the browser to S3, but this would require some modifications for pre-generating unique filenames and synchronizing them with the database. I'm sure it wouldn't be too much trouble, but Heroku's new Cedar stack gave me the idea of offloading these long running requests to a node.js instance running in the same app. I'm not very experienced with these kinds of things, so excuse my wording if it's a bit off.
Would something like this be possible? How would you configure things such that certain requests (ones involving file uploads, in this case) would be handled by a node app bundled in the same heroku repository as the main rails app?
I don't think it's possible to mix Rails and Node in the same app. However, you could get roughly the same functionality by using two separate apps that communicate with each other.
You can use ENV['DATABASE_URL'] to determine your database connection string. Use the heroku console to set it as an ENV variable for your Node app (e.g. heroku config:add OTHER_DB=your_connection_string) should then be able to use the same connection string to connect to the same database from your other heroku app. You could even access it outside of heroku if you have a dedicated database, see: http://devcenter.heroku.com/articles/external-database-access
For seamless integration between the two apps, you could have a form rendered by the Rails app post to a URL of the Node app. In addition to the file upload, include in that form via hidden input fields any other variables you need to communicate to the Node app. When the upload to the Node app is done, it could redirect the client back to the Rails app, passing any status or variables as get parameters.
Run the two apps under two subdomains of the same domain and you could even share cookies between them.
You need two apps. I am doing exactly what's described in this question. I wanted large streaming uploads, and since Rack writes downloads to a temp file before passing them through to the handler, it is not possible to do this with Rails.
Node.js, on the other hand, does this beautifully. So there are two Heroku apps, the Rails web app and the Node.js (Express) web app. The Rails web app uses SWFUpload as the client-side solution. The Rails app and the Node.js app both have a secret key as a Heroku config variable. When it's time for the user to upload, client-side Javascript requests an upload URL from the Rails server. The Rails server forms an upload URL with an Expires parameter and computes a signature using the secret key. The client-side Javascript handler passes this URL along to SWFUpload (upload_url property). The user selects the files to upload, and SWFUpload starts posting them to the upload_url. The Node.js app verifies that the URL is not expired and that the signature is valid. It processes the form data with the formidable library.
One other detail. Flash requires the Node.js app to serve a crossdomain.xml that permits the cross-site request.
My Node.js app doesn't touch the database; but if it did I would share DATABASE_URL as previously suggested. Note that you can't share a DATABASE_URL outside of Heroku unless you have a dedicated DB. The DATABASE_URLs for shared databases are not reachable from outside Heroku (unlike some other services like RedisToGo).

can Amazon be used to offload server of static files for a Ruby on Rails app, but still support the app's authentication & authorization?

Can one of the Amazon services (their S3 data service, or otherwise) be used to offload server of static files for a Ruby on Rails app, but still support the app's authentication & authorization?
That is such that when the user browser downloaded the initial HTML for one page of the Ruby on Rails application, when it went back for static content (e.g. an image or CSS file), that this request would be:
(a) routed directly to the Amazon service (no RoR cycles used to serve it, or bandwidth), BUT
(b) the browser request for this item (e.g. an image) would still have to go through an authentication/authorization layer based on the user model in the Ruby on Rails application - in other words to ensure not just anyone could get the image...
thanks
The answer is a yes with a but. You can use a feature of S3 that allows you to create links to secure S3 objects that has a small time to live, default is 5 minutes. This will work for any S3 object that is uploaded as private. This means that the browser will only have X seconds or whatever to request the file from S3. Example code from docs for the AWS gem:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
You can also specify an expires_in or expires option per file. The bad thing is that you would need to create a helper for your stylesheet, image, and js links to create the proper S3 URLs.
I would recommend that you setup a domain name for your S3 bucket, like "examples3.amazonaws.com" and put all your standard image files and CSS there as public. Then set that as the asset host in your rails config. Then, only use the secure links for static files that really need it.

Resources