Caching dynamic images rails - ruby-on-rails

In my web app each user has a profile image, and those images are stored in Amazon S3. When a user signs in I need to show that image, and it stays in the sidebar on every page he visits. Once he signs in, is there any way I can cache the image so that I don't need to fetch it from Amazon S3 every time? When he updates the image, I need to clear the cache.

You can use standard HTTP caching for this.
You should set the Cache-Control and/or Expires headers, depending on your needs.
All the major S3 clients support setting these headers, or you can set them using the S3 API or one of the SDKs/libraries.
To make the browser re-download the image when it has changed, you can add a query string to the URL and change its value on each update.
e.g.
http://mypath/myfile.ext?v=1
http://mypath/myfile.ext?v=2
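One way to produce such URLs is to derive the version from the time the image was last updated, so the value changes by itself on every upload. The following is only a minimal sketch: the helper name, the `v` parameter and the one-year max-age are assumptions for illustration, not part of any library.

```ruby
require "uri"

# Hypothetical helper: appends a version parameter derived from the time the
# image was last updated, so the URL changes whenever the image changes.
def versioned_image_url(base_url, updated_at)
  uri = URI.parse(base_url)
  params = URI.decode_www_form(uri.query.to_s)
  params.reject! { |key, _| key == "v" }   # drop any previous version value
  params << ["v", updated_at.to_i.to_s]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Assumed Cache-Control value to set on the S3 object at upload time.
CACHE_CONTROL = "public, max-age=31536000" # one year, in seconds

puts versioned_image_url("http://mypath/myfile.ext", Time.at(1_500_000_000))
```

Because the version is part of the URL, the long max-age is safe: an updated image gets a new URL, and the old cached copy is simply never requested again.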

Related

"Request has expired" when using S3 with Active Storage

I'm using ActiveStorage for the first time.
Everything works fine in development but in production (Heroku) my images disappear without a reason.
They were showing ok the first time, but now no image is displayed. In the console I can see this error:
GET https://XXX.s3.amazonaws.com/variants/Q7MZrLyoKKmQFFwMMw9tQhPW/XXX 403 (Forbidden)
If I visit that URL directly I get this XML response:
<Error>
  <Code>AccessDenied</Code>
  <Message>Request has expired</Message>
  <X-Amz-Expires>300</X-Amz-Expires>
  <Expires>2018-07-24T13:48:25Z</Expires>
  <ServerTime>2018-07-24T15:25:37Z</ServerTime>
  <RequestId>291D41FAC6708334</RequestId>
  <HostId>lEVGuwA6Hvlm/i40PeXaje9SEBYks9+uk6DvBs=</HostId>
</Error>
This is what I have in the view
<div class="cover" style="background-image: url('<%= rails_representation_path(experience.thumbnail) %>')"></div>
This is what I have in the model
def thumbnail
  cover.variant(resize: "300x300").processed
end
In simple words, I don't want images to expire but to be always there.
Thanks
ActiveStorage does not support non-expiring links. It uses expiring (private) links and only supports uploading files as private on your service.
This was a problem for me too, so I wrote two patches (use with caution) for S3 only: a simple one of about 30 lines that overrides ActiveStorage to work only with non-expiring (public) links, and another that adds an acl option to the has_one_attached and has_many_attached methods.
Hope it helps.
Your question doesn't say so, but it's common to use a CDN like AWS CloudFront with a Rails app. Especially on Heroku you probably want to conserve compute power.
Here is what happens in that scenario. You render a page as usual, and all the images are requested from the asset host, which is the CDN, because that is how the integration is configured. It's set up to fetch anything it doesn't find in its cache from the origin, which is your application again.
At first, all image requests are passed through. The ActiveStorage controller creates signed URLs for them, and the CDN passes them on, but also caches them.
Now comes the problem. The signed URL expires in 5 minutes by default, but the CDN usually caches for much longer. This is because you usually use digest assets, meaning they are invalidated not by time but by name, on any change.
The solution is simple. Increase the expiry of the signed URL to be longer than the cache's TTL. Now the cache drops the cached signed URL before it becomes invalid.
Set the URL expiry using ActiveStorage::Service.url_expires_in in 5.2, or directly in Rails.application.config.active_storage.service_urls_expire_in in an initializer; see this answer for details.
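The relationship between the two lifetimes can be sketched in plain Ruby. This is only an illustration under assumed values: the one-day CDN TTL and the helper name are hypothetical, and the Rails line in the comment is one possible use of the setting named above, not a verified configuration.

```ruby
# Lifetimes in seconds. The signed-URL lifetime corresponds to the
# ActiveStorage setting named above; the CDN TTL comes from the CDN settings.
DEFAULT_URL_EXPIRY = 5 * 60 # the 5-minute default mentioned above

# Seconds during which the CDN could still serve a cached redirect that
# points at an already expired signed URL. Zero means the setup is safe.
def stale_window(url_expires_in, cdn_ttl)
  [cdn_ttl - url_expires_in, 0].max
end

cdn_ttl = 24 * 60 * 60 # assumed CDN TTL of one day

# With the default expiry the cached redirect is broken for most of the day.
puts stale_window(DEFAULT_URL_EXPIRY, cdn_ttl)

# With an expiry one hour longer than the TTL there is no stale window.
# In a Rails initializer this might look like (assumption, untested here):
#   Rails.application.config.active_storage.service_urls_expire_in = 25.hours
puts stale_window(cdn_ttl + 60 * 60, cdn_ttl)
```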
To set the cache TTL in CloudFront: open the AWS console, pick the distribution, open the Behavior tab, and scroll down to the TTL fields.
Then optionally issue an invalidation to force re-caching of all contents.
Keep in mind there is a security trade-off. If the image contents are private, then they most likely don't belong in a CDN, and they shouldn't have long-lasting temporary URLs either. In that case choose a solution that exempts attachments from the CDN altogether. Your application will then have to handle the additional load of signing the URLs of all attached assets on top of rendering the relevant page.
Also keep in mind that this isn't necessarily a good solution, but more of a workaround. With the above setup you will cache redirects, and the heavier requests will hit your storage bucket directly. The usual scenario for CDNs is large media, not lightweight redirects. You do relieve the app of handling a lot of requests, though. Whether that is a worthwhile optimization should be measured for your case.
I had this same issue, but after I corrected the time on my computer, the problem was resolved. The cause was a clock difference that the AWS servers did not accept.
In production.rb, change
config.active_storage.service = :local
to
config.active_storage.service = :amazon
The value should match whatever name (aws, amazon, etc.) you defined in storage.yml.

Heroku and Cloudfront cache issues

My Heroku application resizes images into thumbnails; these thumbnails are supposed to be cached by CloudFront.
Requesting a thumbnail from Heroku causes an image to be generated, which takes time and should only be done once per image.
Our application always accesses these images through CloudFront, so each image should be generated once and then cached by CloudFront, which would serve it for as long as the cache is deemed valid.
We receive an email every time a thumbnail is generated. The problem is that when we try to access those thumbnails, our Heroku server is asked to generate the thumbnail again; only then is the thumbnail properly cached, after which we can access it without any traffic being sent to our server.
Does anyone know why this would happen?

rails controller download from aws s3

I am trying to build a really easy way for my users to download audio content from AWS via my website. Here is the flow:
1. I give the user a download link, e.g. www.mysite.com/foobar
2. The user clicks on the link.
3. In my Rails controller, I create an expiring AWS S3 URL and automatically start downloading the audio content from that URL.
4. The user's browser asks the user whether to save the file. If the user agrees to save the file, I want a callback to my Rails app to log that the user actually downloaded the file.
So, from the user's perspective, I want the process to be as simple as going to a URL I determine and accepting the download when prompted.
In the background, I want to keep the AWS S3 URL hidden from the user, and I want the flexibility to write callback logic that runs after the user accepts the download.
What is the recommended way of achieving this?
The best way to solve this is to create an S3 URL with a very short (10 minutes?) lifetime and return a redirect to that S3 URL. This does expose the S3 URL to the user, but it isn't a vulnerability.
If you want to hide the S3 URL, you will need to proxy the download through your servers, which is expensive and ties up a worker process for long periods of time. I do not recommend this, but it is the only way to conceal the S3 resource.
Additionally, if triggering a download rather than a view is important, you need to set the Content-Disposition header to trigger an attachment download:
Content-Disposition: attachment; filename="fname.ext"
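As a minimal sketch, the header value could be built like this. The helper name is hypothetical, and the code only handles plain ASCII file names; non-ASCII names would need the extended filename* form, which is left out here.

```ruby
# Hypothetical helper: builds a Content-Disposition value that asks the
# browser to download the response as a file instead of displaying it.
def attachment_disposition(filename)
  # Escape backslashes and double quotes so the quoted string stays valid.
  safe_name = filename.gsub(/["\\]/) { |char| "\\#{char}" }
  %(attachment; filename="#{safe_name}")
end

puts attachment_disposition("fname.ext")
```

When redirecting to a signed S3 URL, S3 accepts a response-content-disposition request parameter that overrides this header on the response, so the value would typically be passed when the signed URL is generated rather than set by the Rails app itself.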

Endpoint questions with CloudFront & S3

I am creating an iOS app with S3, currently without distributions (CloudFront), as a test before I dive into creating a full-fledged app. In the S3 Management Console, I have created my bucket in Singapore, where I live, so CloudFront isn't really needed for this demo. I have to set an endpoint like this:
[s3Client setEndpoint: [AmazonEndpoints s3Endpoint: AP_SOUTHEAST_1]];
That points to Singapore. The endpoint is the place the bucket needs to send the data to, right (i.e. where the user is)?
So now I have two questions:
1. If I am using CloudFront, do I need to set an endpoint? How do I even use CloudFront in iOS? I generate a signed URL, and then what?
2. If a user is using the app in some random country, what endpoint would I set (if I need to set one with CloudFront)? Would I find their current country via the locale and work out which endpoint it is closest to?
Thanks!
A set of files in CloudFront is called a "distribution." When you set up a distribution, you specify one or more "origins", which is/are the canonical source of the files you're serving to your users.
In your case, create a new distribution and specify the S3 bucket as the origin. Then in your application, you'd reference it as http://xxxxxxx.cloudfront.net/hello.png rather than http://mybucket.s3.amazonaws.com/hello.png. CloudFront will automatically fetch hello.png from the S3 bucket the first time someone requests it, and cache it.
CloudFront automatically (and near-instantaneously) detects which edge location is closest to the user by routing them based on network latency. You don't have to do any of these calculations yourself.
I'd recommend that you read the caveats that I've listed here though before using CloudFront in your app.
I agree with @jamieb. You should create a new CloudFront distribution and set the S3 bucket as the origin. From then on you no longer use the S3 bucket link; you use the CloudFront link to view the image. CloudFront will pull the image from S3 and keep it cached for however long you determine. For example, if the image is going to be viewed constantly by different people in the same region, you want it cached in the edge location for that region, so that when a new user in that region requests it, they get the image much more quickly.
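The switch from the bucket link to the distribution link amounts to replacing the host and keeping the path. Here is a small sketch of that idea, written in Ruby like the other examples on this page rather than in the asker's iOS code; the helper name is hypothetical and the host names are the placeholders used above.

```ruby
require "uri"

# Hypothetical helper: swaps the S3 host for the CloudFront distribution host
# while keeping the path, so the app references the CDN instead of the bucket.
def cloudfront_url(s3_url, distribution_host)
  uri = URI.parse(s3_url)
  uri.host = distribution_host
  uri.to_s
end

puts cloudfront_url("http://mybucket.s3.amazonaws.com/hello.png", "xxxxxxx.cloudfront.net")
```

No region or endpoint appears anywhere in the resulting URL; routing to the nearest edge location happens on the CloudFront side.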

URL fingerprint caching on Amazon S3

I have a bucket on Amazon S3 where I keep files that sometimes change but I want to use maximum caching on them, so I want to use URL fingerprinting to invalidate the cache.
I use the "last modified" date of the files for the fingerprint, and the html page requesting the S3 files always knows each file's fingerprint.
Now, I realize that I could use the fingerprint in the query string, like so:
http://aws.amazon.com/bucket/myFile.jpg?v=1310476099061
but the query string is not always enough for some proxies or older browsers to invalidate the cache, and some proxies and browsers don't even cache it if it contains a query string. That's why I want to keep the fingerprint in the actual URL, like one of these:
http://aws.amazon.com/bucket/myFile-1310476099061.jpg
http://aws.amazon.com/bucket/1310476099061/myFile.jpg
http://aws.amazon.com/bucket/myFile.jpg/1310476099061
etc
Any of these URLs would be perfect for requesting the myFile.jpg, but I want it all to be remapped to the http://aws.amazon.com/bucket/myFile.jpg file. That is, I only want the URL to change so the browser will think that it is a new file and get a fresh file which it will cache for a year. When I upload a new version of that file, the fingerprint is automatically updated.
Now here is my question: is there any way to rewrite the URL so that a request for a URL like http://aws.amazon.com/bucket/myFile-xxxxxx.jpg will serve the http://aws.amazon.com/bucket/myFile.jpg file on Amazon S3? Or are there any other workarounds that will still keep the file cached? Thanks =)
I'm afraid you're stuck with the version in the query string. There is no way to rewrite the URLs on S3 without actually changing the filename.
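If changing the filename is acceptable, the fingerprint can become part of the object key itself: each new version is uploaded under a new key, and the HTML page references that key. The following is a minimal sketch of building such a key; the helper name and the key layout are assumptions chosen to match the first URL pattern in the question.

```ruby
# Hypothetical helper: builds the object key to upload under, with the
# fingerprint placed before the file extension.
def fingerprinted_key(filename, fingerprint)
  ext = File.extname(filename)
  base = filename.delete_suffix(ext)
  "#{base}-#{fingerprint}#{ext}"
end

puts fingerprinted_key("myFile.jpg", 1310476099061)
```

The trade-off is that old versions stay in the bucket under their old keys until they are deleted.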
