How to control the number of downloads via Amazon CloudFront - ruby-on-rails

Is there a way to create a CloudFront signed url that limits the number of times that a file can be downloaded?
According to this post, Controlling number of downloads on Amazon S3, you can get the number of file downloads via the CloudFront API (but I can't find any reference to this on the Amazon site).
Has anyone managed to achieve this via CloudFront?

Yes, with CloudFront you can serve Private Content.
Basically you can protect your content in two ways:
Require that your users use special CloudFront signed URLs to access your content, not the standard CloudFront public URLs.
Require that your users access your Amazon S3 content using CloudFront URLs, not Amazon S3 URLs.
When you create signed URLs for your objects, you can specify:
An ending date and time, after which the URL is no longer valid.
(Optional) The date and time that the URL becomes valid.
(Optional) The IP address or range of addresses of the computers that can be used to access your content.
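For example, here is a minimal Ruby sketch of generating a CloudFront signed URL with the aws-sdk-cloudfront gem (the key pair ID, key path, domain, and expiry below are placeholders you'd replace with your own):

require 'aws-sdk-cloudfront'

signer = Aws::CloudFront::UrlSigner.new(
  key_pair_id: 'APKAEXAMPLE',                  # your CloudFront key pair ID
  private_key_path: '/path/to/private_key.pem' # matching private key
)

# URL stops working five minutes from now; after that, CloudFront returns 403
url = signer.signed_url(
  'https://d111111abcdef8.cloudfront.net/private/file.mp4',
  expires: Time.now + 300
)

Note that signed URLs expire by time (and optionally IP range); they don't count requests, so a strict "N downloads" limit isn't available out of the box.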

Customize S3 403 message

I have a website hosted on Heroku, using Ruby on Rails with the Paperclip gem.
I am trying to prevent hotlinking to all the files in my S3 bucket, so I have set everything to private and only allow users to access files via an expiring URL.
I want to provide a more user-friendly page when a user tries to reuse an expired URL. Currently it shows the message below:
<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<X-Amz-Expires>300</X-Amz-Expires>
<Expires>2016-04-15T19:41:33Z</Expires>
<ServerTime>2016-04-15T19:41:39Z</ServerTime>
<RequestId>D5DD935553A2CF88</RequestId>
<HostId>
55+rFtFbksDMyBWf5cWwgJ+aWvJKwe5umSXgTEWYKgfoT5QR5sbJY9fRNFIiBAqd35OR2MoiCzQ=
</HostId>
</Error>
Is there a way to customize the error page on S3?
S3 offers custom error pages through the website endpoints -- but not the REST endpoints... and signed URLs only work on the REST endpoints, not the website endpoints.
So, no, there is not a way to directly solve this using only S3.
One option is to use CloudFront, which offers the ability to replace the standard error responses with a custom static page, but the original error content is lost and all you have is that static page. You also have to use the CloudFront URL signing mechanism, which is different from S3's (though it also has some advantages, such as wildcard support in a signed URL).
In this answer to a similar (but not duplicate) question, I demonstrated the way I've used an XSL transform to "style" the S3 error XML: modifying the XML returned to the browser, injecting a link to the XSL stylesheet, and letting the browser do the rest of the work... see the screenshots.
I'm quite pleased with the solution, though it has what some people would consider a drawback -- it requires all of the S3 requests to be served via a proxy server running HAProxy in EC2. There's a small additional cost for the EC2 instance, but no added cost for the bandwidth, since transfer from S3 into EC2 is free, and transfer from EC2 to the Internet is the same price as transfer from S3 to the Internet. With this setup, the S3 signed URLs still work. The additional advantages in my application are that I can use my own SSL certs with S3 static content (although this capability is also available through CloudFront), and that the proxy's access logs are available in real time.
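The injection itself is small; here's a hedged Ruby sketch of what that proxy layer does, assuming the stylesheet is served at /error.xsl (the path and method name are illustrative):

# Insert an xml-stylesheet processing instruction right after the XML
# declaration, so the browser renders the S3 error XML with our stylesheet.
def style_s3_error(body)
  body.sub(
    %(<?xml version="1.0" encoding="UTF-8"?>),
    %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<?xml-stylesheet type="text/xsl" href="/error.xsl"?>)
  )
end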

What are the different patterns for S3 URLs?

In terms of S3 URLs, are there really two kinds? And why? What are the different syntaxes?
bucket.s3.amazonaws.com/key
and
s3.amazonaws.com/bucket/key
Is this it? Why are there 2? Are there more? Are these correct?
AWS is deprecating the old path-style URLs:
https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/
Old vs. New
S3 currently supports two different addressing models: path-style and virtual-hosted style. Let’s take a quick look at each one. The path-style model looks like either this (the global S3 endpoint):
https://s3.amazonaws.com/jbarr-public/images/ritchie_and_thompson_pdp11.jpeg
https://s3.amazonaws.com/jeffbarr-public/classic_amazon_door_desk.png
Or this (one of the regional S3 endpoints):
https://s3.us-east-2.amazonaws.com/jbarr-public/images/ritchie_and_thompson_pdp11.jpeg
https://s3.us-east-2.amazonaws.com/jeffbarr-public/classic_amazon_door_desk.png
In this example, jbarr-public and jeffbarr-public are bucket names;
/images/ritchie_and_thompson_pdp11.jpeg and
/classic_amazon_door_desk.png are object keys.
Even though the objects are owned by distinct AWS accounts and are in
different S3 buckets (and possibly in distinct AWS regions), both of
them are in the DNS subdomain s3.amazonaws.com. Hold that thought
while we look at the equivalent virtual-hosted style references
(although you might think of these as “new,” they have been around
since at least 2010):
https://jbarr-public.s3.amazonaws.com/images/ritchie_and_thompson_pdp11.jpeg
https://jeffbarr-public.s3.amazonaws.com/classic_amazon_door_desk.png
These URLs reference the same objects, but the objects are now in
distinct DNS subdomains (jbarr-public.s3.amazonaws.com and
jeffbarr-public.s3.amazonaws.com, respectively). The difference is
subtle, but very important. When you use a URL to reference an object,
DNS resolution is used to map the subdomain name to an IP address.
With the path-style model, the subdomain is always s3.amazonaws.com or
one of the regional endpoints; with the virtual-hosted style, the
subdomain is specific to the bucket. This additional degree of
endpoint specificity is the key that opens the door to many important
improvements to S3.
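To make the two styles concrete, here is a small Ruby sketch that builds both forms of URL for the same object (the bucket and key are illustrative):

bucket = 'jbarr-public'
key    = 'images/ritchie_and_thompson_pdp11.jpeg'

# virtual-hosted style: the bucket is part of the DNS subdomain
virtual_hosted = "https://#{bucket}.s3.amazonaws.com/#{key}"

# path style: the bucket is the first slash-delimited path component
path_style = "https://s3.amazonaws.com/#{bucket}/#{key}"

(With the aws-sdk-s3 gem, Aws::S3::Client.new(force_path_style: true) makes the client use the path-style form.)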
S3 provides multiple URL patterns for an object because of virtual hosting of buckets and website hosting, which allow publishing data from the root directory. I got this info from the AWS S3 documentation on virtual hosting of buckets, quoted below.
With the bucket-first URL style - bucket.s3.amazonaws.com/key - you can simply add files like favicon.ico and robots.txt at the root, whereas with the other URL pattern - s3.amazonaws.com/bucket/key - there is no notion of a root directory where you can put those files.
Content snippet from the AWS S3 page on Virtual Hosting of Buckets:
In general, virtual hosting is the practice of serving multiple web
sites from a single web server. One way to differentiate sites is by
using the apparent host name of the request instead of just the path
name part of the URI. An ordinary Amazon S3 REST request specifies a
bucket by using the first slash-delimited component of the Request-URI
path. Alternatively, you can use Amazon S3 virtual hosting to address
a bucket in a REST API call by using the HTTP Host header. In
practice, Amazon S3 interprets Host as meaning that most buckets are
automatically accessible (for limited types of requests) at
http://bucketname.s3.amazonaws.com. Furthermore, by naming your bucket
after your registered domain name and by making that name a DNS alias
for Amazon S3, you can completely customize the URL of your Amazon S3
resources, for example, http://my.bucketname.com/.
Besides the attractiveness of customized URLs, a second benefit of
virtual hosting is the ability to publish to the "root directory" of
your bucket's virtual server. This ability can be important because
many existing applications search for files in this standard location.
For example, favicon.ico, robots.txt, crossdomain.xml are all expected
to be found at the root.

Endpoint questions with CloudFront & S3

I am creating an iOS app with S3, currently without distributions (CloudFront), as a test before I delve into creating a full-fledged app. In the S3 Management Console, I have created my bucket in Singapore, where I live, so CloudFront isn't really needed for this demo. I have to set an endpoint like this:
[s3Client setEndpoint: [AmazonEndpoints s3Endpoint: AP_SOUTHEAST_1]];
Which points to Singapore. The endpoint is the place the bucket needs to send the data to, right? (Where the user is.)
So now I have two questions
If I am using CloudFront, do I need to set an endpoint? How do I even use CloudFront in iOS? I generate a signed URL, then what?
If a user is using the app in some random country, what endpoint (if I need to set one with CloudFront) would I set it to? Would I find their current country via the locale and pick the endpoint closest to it?
Thanks!
A set of files in CloudFront is called a "distribution." When you set up a distribution, you specify one or more "origins", which is/are the canonical source of the files you're serving to your users.
In your case, create a new distribution and specify the S3 bucket as the origin. Then in your application, you'd reference it as http://xxxxxxx.cloudfront.net/hello.png rather than http://mybucket.s3.amazonaws.com/hello.png. CloudFront will automatically fetch hello.png from the S3 bucket the first time someone requests it and cache it.
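As a minimal sketch (in Ruby, for illustration; the distribution domain below is a placeholder for your own):

# Build asset URLs against the CloudFront distribution instead of the bucket.
CLOUDFRONT_HOST = 'xxxxxxx.cloudfront.net' # from your distribution's settings

def cdn_url(key)
  "http://#{CLOUDFRONT_HOST}/#{key}"
end

cdn_url('hello.png') # => "http://xxxxxxx.cloudfront.net/hello.png"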
CloudFront automatically (and near-instantaneously) detects which edge location is closest to the user by routing them based on network latency. You don't have to do any of these calculations yourself.
I'd recommend that you read the caveats that I've listed here though before using CloudFront in your app.
I agree with jamieb. You should create a new CloudFront distribution and set the S3 bucket as the origin. Then, you will no longer use the S3 bucket link; you will use the CloudFront link to view the image. CloudFront will pull the image from S3 and cache it for however long you determine. For example, if the image is going to be viewed constantly by different people in the same region, you'll want it cached in the edge location in that region, so when a new user in that region looks it up, they get the image much more quickly.

URL fingerprint caching on Amazon S3

I have a bucket on Amazon S3 where I keep files that sometimes change but I want to use maximum caching on them, so I want to use URL fingerprinting to invalidate the cache.
I use the "last modified" date of the files for the fingerprint, and the HTML page requesting the S3 files always knows each file's fingerprint.
Now, I realize that I could use the fingerprint in the query string, like so:
http://aws.amazon.com/bucket/myFile.jpg?v=1310476099061
but the query string is not always enough for some proxies or older browsers to invalidate the cache, and some proxies and browsers don't even cache it if it contains a query string. That's why I want to keep the fingerprint in the actual URL, like one of these:
http://aws.amazon.com/bucket/myFile-1310476099061.jpg
http://aws.amazon.com/bucket/1310476099061/myFile.jpg
http://aws.amazon.com/bucket/myFile.jpg/1310476099061
etc
Any of these URLs would be perfect for requesting the myFile.jpg, but I want it all to be remapped to the http://aws.amazon.com/bucket/myFile.jpg file. That is, I only want the URL to change so the browser will think that it is a new file and get a fresh file which it will cache for a year. When I upload a new version of that file, the fingerprint is automatically updated.
Now here is my question: Is there any way to rewrite the URL so that a request for a URL like http://aws.amazon.com/bucket/myFile-xxxxxx.jpg will serve the http://aws.amazon.com/bucket/myFile.jpg file on Amazon S3? Or are there any other workarounds that will still keep the file cached? Thanks =)
I'm afraid you're stuck with the version in the query string. There is no way to rewrite the URLs on S3 without actually changing the filename.
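That said, generating the query-string fingerprint is straightforward; here's a hedged sketch with the modern aws-sdk-s3 gem (bucket, key, and region are illustrative):

require 'aws-sdk-s3'

obj = Aws::S3::Resource.new(region: 'us-east-1')
        .bucket('bucket').object('myFile.jpg')

# Use the object's last-modified time as the cache-busting fingerprint
# (this issues a HEAD request, so cache the value in your app).
fingerprint = obj.last_modified.to_i
url = "https://bucket.s3.amazonaws.com/myFile.jpg?v=#{fingerprint}"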

Can Amazon be used to offload serving of static files for a Ruby on Rails app, but still support the app's authentication & authorization?

Can one of the Amazon services (their S3 data service, or otherwise) be used to offload serving of static files for a Ruby on Rails app, but still support the app's authentication & authorization?
That is, when the user's browser has downloaded the initial HTML for one page of the Ruby on Rails application and goes back for static content (e.g. an image or CSS file), the request would be:
(a) routed directly to the Amazon service (no RoR cycles used to serve it, or bandwidth), BUT
(b) the browser request for this item (e.g. an image) would still have to go through an authentication/authorization layer based on the user model in the Ruby on Rails application - in other words to ensure not just anyone could get the image...
thanks
The answer is a yes, with a but. You can use a feature of S3 that allows you to create links to secure S3 objects that have a small time to live (the default is 5 minutes). This works for any S3 object that is uploaded as private. It means that the browser will only have X seconds (or whatever you set) to request the file from S3. Example code from the docs for the aws-s3 gem:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
You can also specify an expires_in or expires option per file. The downside is that you would need to create a helper for your stylesheet, image, and JS links to build the proper S3 URLs.
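A minimal sketch of such a helper, assuming the aws-s3 gem's S3Object API shown above (the module and method names are illustrative):

# app/helpers/s3_assets_helper.rb
module S3AssetsHelper
  # Returns a signed S3 URL that expires after ttl seconds.
  def secure_s3_url(key, bucket, ttl = 300)
    S3Object.url_for(key, bucket, expires_in: ttl)
  end
end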
I would recommend that you set up a domain name for your S3 bucket, like "examples3.amazonaws.com", and put all your standard image files and CSS there as public. Then set that as the asset host in your Rails config. Then, only use the secure links for static files that really need it.
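For those public assets, the asset host is a one-line setting (the host below is a placeholder for your bucket's domain name):

# config/environments/production.rb
config.action_controller.asset_host = 'http://examples3.amazonaws.com'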
