How to s3 object URL that works with cloudfront? - ruby-on-rails

I'm currently storing files privately on S3. In my Rails app, in the attachment.rb model I can obtain a public URL for the private file like so:
def cdn_url ( style='original' )
attachment.s3_object(style).url_for( :read, secure: true, response_content_type: self.meta['file_content_type'], expires: 1.hour ).to_s
end
The problem is this is providing a URL to S3 and rewriting the URL to use my Cloudfront origin url is erroring with:
The request signature we calculated does not match the signature you provided. Check your key and signing method.
How can I get a public URL asset like below but serve the asset via Cloudfront?

First Way (Easy)
Just use aws_cf_signer gem. Put it in you bundler.
WIth this you can do something like
def cdn_url (options = {})
style = options[:style] || 'original'
cloudfront_domain = options[:cloudfront_domain] || 'example.cloudfront.net'
cloudfront_pem_key_path = options[:cloudfront_pem_key_path]
cloudfront_key_paid_id = options[:cloundfrount_key_paid_id]
path = attachment.path(style) #path of the file
# you can get this values from your aws a/c , most probably by going int
# https://console.aws.amazon.com/iam/home?#security_credential
signer = AwsCfSigner.new(cloudfront_pem_key_path, cloudfront_key_paid_id)
# this configuration may vary.
# visit https://github.com/dylanvaughn/aws_cf_signer
# and check all available settings/options
url = signer.sign(path, :ending => Time.now + 3600)
cloudfront_domain + url
end
With this you can access the url with something like this
cdn_url(cloudfront_pem_key_path: '/users/downloads/pri.pem' , cloudfront_key_paid_id: '33243424XXX')
Second way
# A simple function to return a signed, expiring url for Amazon Cloudfront.
# This will require openssl, digest/sha1, base64 and maybe other libraries.
module CloudFront
def get_signed_expiring_url(domain,path, expires_in, private_key_filename, key_pair_id)
# AWS works on UTC, so make sure you are not using local time
expires = (Time.now.getutc + expires_in).to_i.to_s
private_key = OpenSSL::PKey::RSA.new(File.read(private_key_filename))
# path should be your S3 path without a leading slash and without a file extension.
# e.g. files/private/52
policy = %Q[{"Statement":[{"Resource":"#{path}","Condition":{"DateLessThan":{"AWS:EpochTime":#{expires}}}}]}]
signature = Base64.strict_encode64(private_key.sign(OpenSSL::Digest::SHA1.new, policy))
# I'm not sure exactly why this is required, but it's in Amazon's perl script and seems necessary
# Different base64 implementations maybe?
signature.tr!("+=/", "-_~")
"#{domain}#{path}?Expires=#{expires}&Signature=#{signature}&Key-Pair-Id=#{key_pair_id}"
end
end
With this you can do something like
def cdn_url ( style='original',cloudfront_pem_key_path,key_pair_id)
path = attachment.path(style) #path of the file
# you can get this values from your aws a/c , most probably by going int
CloudFront.get_signed_expiring_url 'example.cloudfront.net', path, 45.seconds ,'/users/downloads/pri.pem', 'as12XXXXX')
end
Give a try, may be it will work. Be sure to properly set properly bucket access policy. check this out if you are seeing accessDenied error http://www.jppinto.com/2011/12/access-denied-to-file-amazon-s3-bucket/

Use the aws sdk gem.
See the API Documentation
Details about generating a presigned URL for an operation on the object
Provide the access-key-id and secret-access-key:-
S3 = AWS::S3.new(
:access_key_id => 'access_key_id',
:secret_access_key => 'secret_access_key')
In controller put these lines:--
bucket = S3.buckets['bucket_name']
s3_obj = bucket.objects["Path-to-file"]
return s3_obj.url_for(:read, :expires => 60*60).to_s
This link will expires in 1 Hour. After that the link will be not accessible.

Related

How to verify AWS S3 put response in Rails

I am using aws-skd-s3 gem in my Rails project.
Create S3 resoure
s3 = Aws::S3::Resource.new(access_key_id: #####,
secret_access_key: #####,
region: 'us-east-1')
Create an S3 object
path = 'sample'
key = test.csv
obj = s3.bucket(#{bucket_name}).object("#{path}" + key)
Store CSV in S3
obj.put(body: csv_response, content_type: 'text/csv')
How to verify that put method stored the csv in S3 without any issues?
Is there any status code available for put method in S3 to verify?
Two ways to go about it:
Store the result. It should be a PutObjectOutput type object. You can check out the official method documentation of the put request method.
The second way to go about it is to make a exists? call right after your put request is completed. Something like this:
s3 = Aws::S3::Resource.new(region: 'ap-southeast-1') # change to the region you use
obj = s3.bucket('bucket-name').object("path/to/object/in/bucket")
if obj.exists?
# Object was uploaded successfully!
else
# No it wasn't!
end
Hope that helps!
One way I've seen or read other people doing it is calculating a md5 hash of the original file before upload and then match that with the etag value from the response of obj.put

How to retrieve attachment url with Rails Active Storage with S3

rails version 5.2
I have a scenario where I need to access the public URL of Rails Active Storage with Amazon S3 storage to make a zip file with Sidekiq background job.
I am having difficulty getting the actual file URL. I have tried rails_blob_url but it gives me following
http://localhost:3000/rails/active_storage/blobs/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBZUk9IiwiZXhwIjpudWxsLCJwdXIiOiJibG9iX2lkIn19--9598613be650942d1ee4382a44dad679a80d2d3b/sample.pdf
How do I access the real file URL through Sidekiq?
storage.yml
test:
service: Disk
root: <%= Rails.root.join("tmp/storage") %>
local:
service: Disk
root: <%= Rails.root.join("storage") %>
development:
service: S3
access_key_id: 'xxxxx'
secret_access_key: 'xxxxx'
region: 'xxxxx'
bucket: 'xxxxx'
development.rb
config.active_storage.service = :development
I can access fine these on web interface but not within Sidekiq
Use ActiveStorage::Blob#service_url. For example, assuming a Post model with a single attached header_image:
#post.header_image.service_url
Update: Rails 6.1
Since Rails 6.1 ActiveStorage::Blob#service_url is deprecated in favor of ActiveStorage::Blob#url.
So, now
#post.header_image.url
is the way to go.
Sources:
Link to the corresponding PR.
Link to source.
If you need all your files public then you must make public your uploads:
In file config/storage.yml
amazon:
service: S3
access_key_id: zzz
secret_access_key: zzz
region: zzz
bucket: zzz
upload:
acl: "public-read"
In the code
attachment = ActiveStorage::Attachment.find(90)
attachment.blob.service_url # returns large URI
attachment.blob.service_url.sub(/\?.*/, '') # remove query params
It will return something like:
"https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa"
It is public readable because of the config above.
My use case was to upload images to S3 which would have public access for ALL images in the bucket so a job could pick them up later, regardless of request origin or URL expiry. This is how I did it. (Rails 5.2.2)
First, the default for new S3 bucked is to keep everything private, so to defeat that there are 2 steps.
Add a wildcard bucket policy. In AWS S3 >> your bucket >> Permissions >> Bucket Policy
{
"Version": "2008-10-17",
"Statement": [
{
"Sid": "AllowPublicRead",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::your-bucket-name/*"
}
]
}
In your bucket >> Permissions >> Public Access Settings, be sure Block public and cross-account access if bucket has public policies is set to false
Now you can access anything in your S3 bucket with just the blob.key in the url. No more need for tokens with expiry.
Second, to generate that URL you can either use the solution by #Christian_Butzke: #post.header_image.service.send(:object_for, #post.header_image.key).public_url
However, know that object_for is a private method on service, and if called with public_send would give you an error. So, another alternative is to use the service_url per #George_Claghorn and just remove any params with a url&.split("?")&.first. As noted, this may fail in localhost with a host missing error.
Here is my solution or an uploadable "logo" stored on S3 and made public by default:
#/models/company.rb
has_one_attached :logo
def public_logo_url
if self.logo&.attachment
if Rails.env.development?
self.logo_url = Rails.application.routes.url_helpers.rails_blob_url(self.logo, only_path: true)
else
self.logo_url = self.logo&.service_url&.split("?")&.first
end
end
#set a default lazily
self.logo_url ||= ActionController::Base.helpers.asset_path("default_company_icon.png")
end
Enjoy ^_^
I had a few problems getting this working. Thought I'd document them for posterity.
In rails 6.0 use #post.header_image.service_url
In rails >= 6.1 use #post.header_image.url as #GeorgeClaghorn recommends.
I got this error:
error: uninitialized constant Analyzable
It's a weird bug in rails 6.0, which is fixed by placing this in config/application.rb
config.autoloader = :classic
I then see this error:
URI::InvalidURIError (bad URI(is not URI?): nil) Active Storage service_url
Fix it by simply adding this to your application_controller.rb
include ActiveStorage::SetCurrent
Now something like #post.image.blob.service_url will work as you expect =)
Using the service_url method combined with striping the params to get a public URL was good idea, thanks #genkilabs and #Aivils_Štoss!
There is however a potential scaling issue involved if you are using this method on large number of files, eg. if you are showing a list of records that have files attached. For each call to service_url you will in your logs see something like:
DEBUG -- : [8df9220c-e8c9-45b7-a1ee-b746e623ca1b] S3 Storage (1.4ms) Generated URL for file at key: ...
You can't eager load these calls either, so you can potentially have a large number of calls to S3 Storage to generate those URLs for each record you are showing.
I worked around it by creating a Presenter like this:
class FilePresenter < SimpleDelegator
def initialize(obj)
super
end
def public_url
return dev_url if Rails.env.development? || Rails.env.test? || assest_host.nil?
"#{assest_host}/#{key}"
end
private
def dev_url
Rails.application.routes.url_helpers.rails_blob_url(self, only_path: true)
end
def assest_host
#assest_host ||= ENV['ASSET_HOST']
end
end
Then I set an ENV variable ASSET_HOST with this:
https://<your_app_bucket>.s3.<your_region>.amazonaws.com
Then when I display the image or just the file link, I do this:
<%= link_to(image_tag(company.display_logo),
FilePresenter.new(company.logo).public_url, target: "_blank", rel:"noopener") %>
<a href=<%= FilePresenter.new(my_record.file).public_url %>
target="_blank" rel="noopener"><%= my_record.file.filename %></a>
Note, you still need to use display_logo for images so that it will access the variant if you are using them.
Also, this is all based on setting my AWS bucket public as per #genkilabs step #2 above, and adding the upload: acl: "public-read" setting to my 'config/storage.yml' as per #Aivils_Štoss!'s suggestion.
If anyone sees any issues or pitfalls with this approach, please let me know! This seemed to work great for me in allowing me to display a public URL but not needing to hit the S3 Storage for each record to generate that URL.
Also see public access in rails active storage. This was introduced in Rails 6.1.
Specify public: true in your app's config/storage.yml. Public services will always return a permanent URL.
Just write this if You are using minio or aws S3 to get attachment url on server.
#post.header_image&.service_url&.split("?")&.first
A bit late, but you can get the public URL also like this (assuming a Post model with a single attached header_image as in the example above):
#post.header_image.service.send(:object_for, #post.header_image.key).public_url
Update 2020-04-06
You need to make sure, that the document is saved with public ACLs (e.g. setting the default to public)
rails_blob_url is also usable. Requests will be served by rails, however, those requests will be probably quite slow, since a private URL needs to be generated on each request.
(FYI: outside the controller you can generate that URL also like this: Rails.application.routes.url_helpers.rails_blob_url(#post, only_path: true))

Trying to generate a presigned url link so an user can download an Amazon S3 object, but getting invalid request

I am currently using the Ruby aws-sdk, version 2 gem with server-side customer-provided encryption key(SSE-C). I am able to upload the object from a rails form to Amazon S3 with no issues.
def s3
Aws::S3::Object.new(
bucket_name: ENV['S3_BUCKET'],
key: 'hello',
)
end
def upload_object
customer_key = OpenSSL::Cipher::AES.new(256, :CBC).random_key
customer_key_md5 = Digest::MD5.new.digest(customer_key)
object_key = 'hello'
options = {}
options[:key] = object_key
options[:sse_customer_algorithm] = 'AES256'
options[:sse_customer_key] = customer_key
options[:sse_customer_key_md5] = customer_key_md5
options[:body] = 'hello world'
options[:bucket] = ENV['S3_BUCKET']
s3.put(options)
test_params = {
object_key: object_key,
customer_key: Base64.encode64(customer_key),
md5_key: Base64.encode64(customer_key_md5),
}
Test.create(test_params)
end
But I'm having some issues with retrieving the object and generating a signed url link for the user to download it.
def retrieve_object(customer_key, md5)
options = {}
options[:key] = 'hello
options[:sse_customer_algorithm] = 'AES256'
options[:sse_customer_key] = Base64.decode64(customer_key)
options[:sse_customer_key_md5] = Base64.decode64(md5)
options[:bucket] = ENV['S3_BUCKET']
s3.get(options)
url = s3.presigned_url(:get)
end
The link is generated, but when I clicked on it, it directs me to an amazon page saying.
<Error>
<Code>InvalidRequest</Code>
<Message>
The object was stored using a form of Server Side Encryption. The correct parameters must be provided to retrieve the object.
</Message>
<RequestId>93684EEBA062B1C2</RequestId>
<HostId>
OCnn5EG7ydfoKzsmEDMbqK5kOhLFpNXxVRdekfhOfnBc6s+jtPYFsKi8IZsEPcd9ConbYUHgwC8=
</HostId>
</Error>
The error message is not helping as I'm unsure what parameters I need to add. I think I might be missing some permissions parameters.
Get Method
http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#get-instance_method
Presigned_Url Method
http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#presigned_url-instance_method
When you generate a pre-signed GET object URL, you need to provide all of the same params that you would pass to Aws::S3::Object#get.
s3.get(sse_customer_algorithm: 'AES256', sse_customer_key: customer_key).body.read
This means you need to pass the same sse_customer_* options to #presigned_url:
url = obj.presigned_url(:get,
sse_customer_algorithm: 'AES256',
sse_customer_key: customer_key)
This will ensure that the SDK correctly signs the headers that Amazon S3 expects when you make the final GET request. The next problem is that you are now responsible for sending those values along with the GET request as headers. Amazon S3 will not accept the algorithm and key in the query string.
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Get.new(uri.request_uri, {
"x-amz-server-side-encryption-customer-algorithm" => 'AES256',
"x-amz-server-side-encryption-customer-key" => Base64.encode64(cpk),
"x-amz-server-side-encryption-customer-key-MD5" => Base64.encode64(OpenSSL::Digest::MD5.digest(cpk))
})
Please note - while testing this, I found a bug in the presigned URL implementation of the current v2.0.33 version of the aws-sdk gem. This has been fixed now and should be part of v2.0.34 once it releases.
See the following gitst for a full example that patches the bug and demonstrates:
Uploads an object using a cpk
Gets the object using the SDK
Generates a presigned GET url
Downloads the object using just Net::HTTP and the presigned URL
You can view the sample script here:
https://gist.github.com/trevorrowe/49bfb9d59f83ad450a9e
Just replace the bucket_name and object_key variables at the top of the script.

How would you parse a url in Ruby to get the main domain?

I want to be able to parse any URL with Ruby to get the main part of the domain without the www (just the example.com)
Please note there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered.
This is the reason why the Public Suffix List exists.
I'm the author of PublicSuffix, a Ruby library that decomposes a domain into the different parts.
Here's an example
require 'uri/http'
uri = URI.parse("http://toolbar.google.com")
domain = PublicSuffix.parse(uri.host)
# => "toolbar.google.com"
domain.domain
# => "google.com"
uri = URI.parse("http://www.google.co.uk")
domain = PublicSuffix.parse(uri.host)
# => "www.google.co.uk"
domain.domain
# => "google.co.uk"
This should work with pretty much any URL:
# URL always gets parsed twice
def get_host_without_www(url)
url = "http://#{url}" if URI.parse(url).scheme.nil?
host = URI.parse(url).host.downcase
host.start_with?('www.') ? host[4..-1] : host
end
Or:
# Only parses twice if url doesn't start with a scheme
def get_host_without_www(url)
uri = URI.parse(url)
uri = URI.parse("http://#{url}") if uri.scheme.nil?
host = uri.host.downcase
host.start_with?('www.') ? host[4..-1] : host
end
You may have to require 'uri'.
Just a short note: to overcome the second parsing of the url from Mischas second example, you could make a string comparison instead of URI.parse.
# Only parses once
def get_host_without_www(url)
url = "http://#{url}" unless url.start_with?('http')
uri = URI.parse(url)
host = uri.host.downcase
host.start_with?('www.') ? host[4..-1] : host
end
The downside of this approach is, that it is limiting the url to http(s) based urls, which is widely the standard. But if you will use it more general (f.e. for ftp links) you have to adjust accordingly.
Addressable is probably the right answer in 2018, especially uses the PublicSuffix gem to parse domains.
However, I need to do this kind of parsing in multiple places, from various data sources, and found it a bit verbose to use repeatedly. So I created a wrapper around it, Adomain:
require 'adomain'
Adomain["https://toolbar.google.com"]
# => "toolbar.google.com"
Adomain["https://www.google.com"]
# => "google.com"
Adomain["stackoverflow.com"]
# => "stackoverflow.com"
I hope this helps others.
Here's one that works better with .co.uk and .com.fr - type domains
domain = uri.host[/[^.\s\/]+\.([a-z]{3,}|([a-z]{2}|com)\.[a-z]{2})$/]
if the URL is in format http://www.google.com, then you could do something like:
a = 'http://www.google.com'
puts a.split(/\./)[1] + '.' + a.split(/\./)[2]
Or
a =~ /http:\/\/www\.(.*?)$/
puts $1

Using Send_File to a Remote Source (Ruby on Rails)

In my app, I have a requirement that is stumping me.
I have a file stored in S3, and when a user clicks on a link in my app, I log in the DB they've clicked the link, decrease their 'download credit' allowance by one and then I want to prompt the file for download.
I don't simply want to redirect the user to the file because it's stored in S3 and I don't want them to have the link of the source file (so that I can maintain integrity and access)
It looks like send_file() wont work with a remote source file, anyone recommend a gem or suitable code which will do this?
You would need to stream the file content to the user while reading it from the S3 bucket/object.
If you use the AWS::S3 library something like this may work:
send_file_headers!( :length=>S3Object.about(<s3 object>, <s3 bucket>)["content-length"], :filename=><the filename> )
render :status => 200, :text => Proc.new { |response, output|
S3Object.stream(<s3 object>, <s3 bucket>) do |chunk|
output.write chunk
end
}
This code is mostly copied form the send_file code which by itself works only for local files or file-like objects
N.B. I would anyhow advise against serving the file from the rails process itself. If possible/acceptable for your use case I'd use an authenticated GET to serve the private data from the bucket.
Using an authenticated GET you can keep the bucket and its objects private, while allowing temporary permission to read a specific object content by crafting a URL that includes an authentication signature token. The user is simply redirected to the authenticated URL, and the token can be made valid for just a few minutes.
Using the above mentioned AWS::S3 you can obtain an authenticated GET url in this way:
time_of_exipry = Time.now + 2.minutes
S3Object.url_for(<s3 object>, <s3 bucket>,
:expires => time_of_exipry)
Full image download method using temp file (tested rails 3.2):
def download
#image = Image.find(params[:image_id])
open(#image.url) {|img|
tmpfile = Tempfile.new("download.jpg")
File.open(tmpfile.path, 'wb') do |f|
f.write img.read
end
send_file tmpfile.path, :filename => "great-image.jpg"
}
end
You can read the file from S3 and write it locally to a non-public directory, then use X-Sendfile (apache) or X-Accel-Redirect (nginx) to serve the content.
For nginx you would include something like the following in your config:
location /private {
internal;
alias /path/to/private/directory/;
}
Then in your rails controller, you do the following:
response.headers['Content-Type'] = your_content_type
response.headers['Content-Disposition'] = "attachment; filename=#{your_file_name}"
response.headers['Cache-Control'] = "private"
response.headers['X-Accel-Redirect'] = path_to_your_file
render :nothing=>true
A good writeup of the process is here

Resources