How to retrieve attachment url with Rails Active Storage with S3 - ruby-on-rails

rails version 5.2
I have a scenario where I need to access the public URL of Rails Active Storage with Amazon S3 storage to make a zip file with Sidekiq background job.
I am having difficulty getting the actual file URL. I have tried rails_blob_url, but it gives me the following:
http://localhost:3000/rails/active_storage/blobs/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBZUk9IiwiZXhwIjpudWxsLCJwdXIiOiJibG9iX2lkIn19--9598613be650942d1ee4382a44dad679a80d2d3b/sample.pdf
How do I access the real file URL through Sidekiq?
storage.yml
test:
service: Disk
root: <%= Rails.root.join("tmp/storage") %>
local:
service: Disk
root: <%= Rails.root.join("storage") %>
development:
service: S3
access_key_id: 'xxxxx'
secret_access_key: 'xxxxx'
region: 'xxxxx'
bucket: 'xxxxx'
development.rb
config.active_storage.service = :development
I can access these fine in the web interface, but not from within Sidekiq.

Use ActiveStorage::Blob#service_url. For example, assuming a Post model with a single attached header_image:
@post.header_image.service_url
Update: Rails 6.1
Since Rails 6.1 ActiveStorage::Blob#service_url is deprecated in favor of ActiveStorage::Blob#url.
So, now
@post.header_image.url
is the way to go.
Sources:
Link to the corresponding PR.
Link to source.

If you need all your files to be public, you must upload them with a public ACL:
In file config/storage.yml
amazon:
service: S3
access_key_id: zzz
secret_access_key: zzz
region: zzz
bucket: zzz
upload:
acl: "public-read"
In the code
attachment = ActiveStorage::Attachment.find(90)
attachment.blob.service_url # returns large URI
attachment.blob.service_url.sub(/\?.*/, '') # remove query params
It will return something like:
"https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa"
It is publicly readable because of the config above.
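The query-stripping step above can also be done with Ruby's standard URI library instead of a regex. This is a minimal sketch, assuming the signed URL is well-formed; the helper name strip_signature is hypothetical and not part of Active Storage:

```ruby
require "uri"

# Hypothetical helper: drop the query string (the signature and expiry
# parameters) from a signed URL, leaving only scheme, host, and path.
def strip_signature(signed_url)
  uri = URI.parse(signed_url)
  uri.query = nil
  uri.fragment = nil
  uri.to_s
end
```

The stripped URL is only fetchable when the object itself is publicly readable, as configured above.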

My use case was to upload images to S3 which would have public access for ALL images in the bucket so a job could pick them up later, regardless of request origin or URL expiry. This is how I did it. (Rails 5.2.2)
First, the default for a new S3 bucket is to keep everything private, so there are two steps to override that.
Add a wildcard bucket policy. In AWS S3 >> your bucket >> Permissions >> Bucket Policy
{
"Version": "2008-10-17",
"Statement": [
{
"Sid": "AllowPublicRead",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::your-bucket-name/*"
}
]
}
In your bucket >> Permissions >> Public Access Settings, be sure "Block public and cross-account access if bucket has public policies" is set to false.
Now you can access anything in your S3 bucket with just the blob.key in the url. No more need for tokens with expiry.
Second, to generate that URL you can use the solution by @Christian_Butzke: @post.header_image.service.send(:object_for, @post.header_image.key).public_url
However, note that object_for is a private method on the service, so calling it with public_send raises an error. An alternative is to use service_url, per @George_Claghorn, and strip the query params with url&.split("?")&.first. As noted, this may fail on localhost with a missing-host error.
Here is my solution for an uploadable "logo" stored on S3 and made public by default:
#/models/company.rb
has_one_attached :logo

def public_logo_url
  if self.logo&.attachment
    if Rails.env.development?
      self.logo_url = Rails.application.routes.url_helpers.rails_blob_url(self.logo, only_path: true)
    else
      self.logo_url = self.logo&.service_url&.split("?")&.first
    end
  end
  #set a default lazily
  self.logo_url ||= ActionController::Base.helpers.asset_path("default_company_icon.png")
end
Enjoy ^_^

I had a few problems getting this working. Thought I'd document them for posterity.
In Rails 6.0, use @post.header_image.service_url.
In Rails >= 6.1, use @post.header_image.url, as @GeorgeClaghorn recommends.
I got this error:
error: uninitialized constant Analyzable
It's a weird bug in rails 6.0, which is fixed by placing this in config/application.rb
config.autoloader = :classic
I then see this error:
URI::InvalidURIError (bad URI(is not URI?): nil) Active Storage service_url
Fix it by simply adding this to your application_controller.rb
include ActiveStorage::SetCurrent
Now something like @post.image.blob.service_url will work as you expect =)

Using the service_url method combined with stripping the params to get a public URL was a good idea, thanks @genkilabs and @Aivils_Štoss!
There is, however, a potential scaling issue if you use this method on a large number of files, e.g. if you are showing a list of records that have files attached. For each call to service_url you will see something like this in your logs:
DEBUG -- : [8df9220c-e8c9-45b7-a1ee-b746e623ca1b] S3 Storage (1.4ms) Generated URL for file at key: ...
You can't eager load these calls either, so you can potentially have a large number of calls to S3 Storage to generate those URLs for each record you are showing.
I worked around it by creating a Presenter like this:
class FilePresenter < SimpleDelegator
  def initialize(obj)
    super
  end

  def public_url
    return dev_url if Rails.env.development? || Rails.env.test? || asset_host.nil?

    "#{asset_host}/#{key}"
  end

  private

  def dev_url
    Rails.application.routes.url_helpers.rails_blob_url(self, only_path: true)
  end

  # Memoize the bucket host configured in the ASSET_HOST environment variable
  def asset_host
    @asset_host ||= ENV['ASSET_HOST']
  end
end
Then I set an ENV variable ASSET_HOST with this:
https://<your_app_bucket>.s3.<your_region>.amazonaws.com
Then when I display the image or just the file link, I do this:
<%= link_to(image_tag(company.display_logo),
FilePresenter.new(company.logo).public_url, target: "_blank", rel:"noopener") %>
<a href=<%= FilePresenter.new(my_record.file).public_url %>
target="_blank" rel="noopener"><%= my_record.file.filename %></a>
Note, you still need to use display_logo for images so that it will access the variant if you are using them.
Also, this is all based on setting my AWS bucket public as per @genkilabs' step #2 above, and adding the upload: acl: "public-read" setting to my config/storage.yml as per @Aivils_Štoss's suggestion.
If anyone sees any issues or pitfalls with this approach, please let me know! This seemed to work great for me in allowing me to display a public URL but not needing to hit the S3 Storage for each record to generate that URL.
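The core of the presenter approach is that the public URL is just the bucket host joined with the blob key, so no call to the storage service is needed. A minimal sketch of that idea in plain Ruby, assuming the bucket is publicly readable; public_object_url is a hypothetical helper name, and the host is passed in explicitly rather than read from ENV:

```ruby
# Hypothetical helper: join a bucket host and an object key into a URL
# without calling the storage service. Returns nil when no host is given,
# so the caller can fall back to another URL (e.g. a Rails blob route).
def public_object_url(key, asset_host)
  return nil if asset_host.nil? || asset_host.empty?

  "#{asset_host.chomp('/')}/#{key}"
end
```

Because this is only string concatenation, it can be called once per record in a list without generating one S3 log line per file.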

Also see public access in rails active storage. This was introduced in Rails 6.1.
Specify public: true in your app's config/storage.yml. Public services will always return a permanent URL.

Just write this to get the attachment URL on the server if you are using MinIO or AWS S3:
@post.header_image&.service_url&.split("?")&.first

A bit late, but you can get the public URL also like this (assuming a Post model with a single attached header_image as in the example above):
@post.header_image.service.send(:object_for, @post.header_image.key).public_url
Update 2020-04-06
You need to make sure that the document is saved with a public ACL (e.g. by setting the default to public).
rails_blob_url is also usable. Requests will be served by Rails; however, those requests will probably be quite slow, since a private URL needs to be generated on each request.
(FYI: outside the controller you can also generate that URL like this: Rails.application.routes.url_helpers.rails_blob_url(@post.header_image, only_path: true))

Related

How to verify AWS S3 put response in Rails

I am using the aws-sdk-s3 gem in my Rails project.
Create an S3 resource
s3 = Aws::S3::Resource.new(access_key_id: #####,
secret_access_key: #####,
region: 'us-east-1')
Create an S3 object
path = 'sample'
key = 'test.csv'
obj = s3.bucket(bucket_name).object("#{path}" + key)
Store CSV in S3
obj.put(body: csv_response, content_type: 'text/csv')
How to verify that put method stored the csv in S3 without any issues?
Is there any status code available for put method in S3 to verify?
Two ways to go about it:
Store the result. It should be a PutObjectOutput type object. You can check out the official method documentation of the put request method.
The second way to go about it is to make an exists? call right after your put request completes. Something like this:
s3 = Aws::S3::Resource.new(region: 'ap-southeast-1') # change to the region you use
obj = s3.bucket('bucket-name').object("path/to/object/in/bucket")
if obj.exists?
# Object was uploaded successfully!
else
# No it wasn't!
end
Hope that helps!
Another approach I've seen is to calculate an MD5 hash of the original file before upload and then compare it with the etag value from the response of obj.put.
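The MD5 comparison can be sketched as follows. This assumes a single-part upload that is not encrypted with SSE-KMS, since the ETag is only the MD5 of the body in that case, and it assumes the ETag string arrives wrapped in double quotes. The helper name etag_matches_body? is hypothetical:

```ruby
require "digest"

# Hypothetical helper: compare the MD5 of the uploaded body with the ETag
# returned by the put call. Surrounding double quotes on the ETag are removed.
def etag_matches_body?(body, etag)
  Digest::MD5.hexdigest(body) == etag.to_s.delete('"')
end
```

With the code from the question, this would be called as etag_matches_body?(csv_response, obj.put(body: csv_response, content_type: 'text/csv').etag).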

How to Authenticate Google Vision/Cloud Using ENV Variable in Ruby on Rails

My app is hosted on Heroku, so I'm trying to figure out how to use the JSON Google Cloud provides (to authenticate) as an environment variable, but so far I can't get authenticated.
I've searched Google and Stack Overflow and the best leads I found were:
Google Vision API authentication on heroku
How to upload a json file with secret keys to Heroku
Both say they were able to get it to work, but they don't provide code that I've been able to get working. Can someone please help me? I know it's probably something stupid.
I'm currently just trying to test the service in my product model leveraging this sample code from Google. Mine looks like this:
def self.google_vision_labels
# Imports the Google Cloud client library
require "google/cloud/vision"
# Your Google Cloud Platform project ID
project_id = "foo"
# Instantiates a client
vision = Google::Cloud::Vision.new project: project_id
# The name of the image file to annotate
file_name = "http://images5.fanpop.com/image/photos/27800000/FOOTBALL-god-sport-27863176-2272-1704.jpg"
# Performs label detection on the image file
labels = vision.image(file_name).labels
puts "Labels:"
labels.each do |label|
puts label.description
end
end
I keep receiving this error,
RuntimeError: Could not load the default credentials. Browse to
https://developers.google.com/accounts/docs/application-default-credentials for more information
Based on what I've read, I tried placing the JSON contents in secrets.yml (I'm using the Figaro gem) and then referring to it in a Google.yml file based on the answer in this SO question.
In application.yml, I put (I overwrote some contents in this post for security):
GOOGLE_APPLICATION_CREDENTIALS: {
"type": "service_account",
"project_id": "my_project",
"private_key_id": "2662293c6fca2f0ba784dca1b900acf51c59ee73",
"private_key": "-----BEGIN PRIVATE KEY-----\n #keycontents \n-----END PRIVATE KEY-----\n",
"client_email": "foo-labels@foo.iam.gserviceaccount.com",
"client_id": "100",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url":
"https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url":
"https://www.googleapis.com/robot/v1/metadata/x509/get-product-labels%40foo.iam.gserviceaccount.com"
}
and in config/google.yml, I put:
GOOGLE_APPLICATION_CREDENTIALS = ENV["GOOGLE_APPLICATION_CREDENTIALS"]
also, tried:
GOOGLE_APPLICATION_CREDENTIALS: ENV["GOOGLE_APPLICATION_CREDENTIALS"]
I have also tried changing these variable names in both files instead of GOOGLE_APPLICATION_CREDENTIALS with GOOGLE_CLOUD_KEYFILE_JSON and VISION_KEYFILE_JSON based on this Google page.
Can someone please, please help me understand what I'm doing wrong in referencing/creating the environmental variable with the JSON credentials? Thank you!
It's really annoying that Google decided to buck de facto credential standards by storing secrets in a file instead of a series of environment variables.
That said, my solution to this problem is to create a single .env variable, GOOGLE_API_CREDS.
I paste the raw JSON blob into the .env file and remove all newlines. Then in the application code I use JSON.parse(ENV.fetch('GOOGLE_API_CREDS')) to convert the JSON blob into a real hash.
The .env file:
GOOGLE_API_CREDS={"type": "service_account","project_id": "your_app_name", ... }
Then in the application code (Google OCR client as an example):
Google::Cloud::Vision::ImageAnnotator.new(credentials: JSON.parse(ENV.fetch('GOOGLE_API_CREDS')))
Cheers
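The parsing step can be wrapped in a small helper that fails early when the variable is missing or incomplete. This is only a sketch: the helper name and the list of required keys are assumptions based on the service account JSON shown in the question, and the environment is passed in so it can be exercised without touching the real ENV:

```ruby
require "json"

# Keys taken from the service account JSON in the question (assumed minimum).
REQUIRED_CREDENTIAL_KEYS = %w[type project_id private_key client_email].freeze

# Hypothetical helper: read a single-line JSON blob from the environment
# and return it as a Hash, raising if the variable or a key is missing.
def google_credentials_from_env(env = ENV, name = "GOOGLE_API_CREDS")
  creds = JSON.parse(env.fetch(name))
  missing = REQUIRED_CREDENTIAL_KEYS - creds.keys
  raise ArgumentError, "#{name} is missing keys: #{missing.join(', ')}" unless missing.empty?

  creds
end
```

The returned hash is what would be passed as credentials: in the client constructor shown above.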
Building on Dylan's answer, I found that I needed to use an extra line to configure the credentials as follows:
Google::Cloud::Language.configure {|gcl| gcl.credentials = JSON.parse(ENV['GOOGLE_APP_CREDS'])}
because the .new(credentials: ...) method was not working for Google::Cloud::Language.
I had to look in the (sparse) Ruby reference section of Google Cloud Language.
And yeah... storing secrets in a file is quite annoying, indeed.
I had the same problem with Google Cloud Speech, using the "Getting Started" doc from Google.
The above answers helped a great deal, coupled with updating my Google Speech Gem to V1 (https://googleapis.dev/ruby/google-cloud-speech-v1/latest/Google/Cloud/Speech/V1/Speech/Client.html)
I simply use a StringIO object so that Psych thinks that it's an actual file that I read:
google:
service: GCS
project: ''
bucket: ''
credentials: <%= StringIO.new(ENV['GOOGLE_CREDENTIALS']) %>

How to test AWS S3 get object method ?

I am using aws sdk for ruby to retrieve an object from a bucket then read it. My code is something like:
def import_from_s3
#initiate the client
s3 = Aws::S3::Client.new({
region: region,
access_key_id: key_id,
secret_access_key: secret
})
#Get the object
resp = s3.get_object(bucket: bucket, key: key)
end
My question is how do I test this method without mocking it?
Here is the documentation on how to go about it.
Stubbing the aws client response
I used the default stub and it worked just fine.
Aws.config[:s3] = {stub_responses: {get_object: {body: StringIO.new("XYZ")}}}
You don't need to (and you shouldn't even try to) test #get_object. It is not implemented by your code, and you should assume it has been tested and works. As for your method #import_from_s3, you have two options: either don't test it, since it is just a thin wrapper around #get_object, or make assertions/expectations on its return value.
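If you do want assertions on the return value, one option is to pass the client into the method so a test can supply a stand-in. The sketch below is not the AWS SDK's own stubbing mechanism: FakeS3Client and FakeResponse are hypothetical test doubles that only mimic the get_object(bucket:, key:) call and the body reader used in the question, and the method signature is changed to accept the client:

```ruby
require "stringio"

# Hypothetical stand-in for the response object: only exposes #body.
FakeResponse = Struct.new(:body)

# Hypothetical stand-in for the S3 client, backed by an in-memory Hash
# keyed by [bucket, key].
class FakeS3Client
  def initialize(objects)
    @objects = objects
  end

  def get_object(bucket:, key:)
    FakeResponse.new(StringIO.new(@objects.fetch([bucket, key])))
  end
end

# The method under test receives its client instead of building one.
def import_from_s3(client, bucket:, key:)
  client.get_object(bucket: bucket, key: key).body.read
end
```

In production code the same method would be given a real Aws::S3::Client, so only the wiring differs between the test and the application.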

How to get an S3 object URL that works with CloudFront?

I'm currently storing files privately on S3. In my Rails app, in the attachment.rb model I can obtain a public URL for the private file like so:
def cdn_url ( style='original' )
attachment.s3_object(style).url_for( :read, secure: true, response_content_type: self.meta['file_content_type'], expires: 1.hour ).to_s
end
The problem is this is providing a URL to S3 and rewriting the URL to use my Cloudfront origin url is erroring with:
The request signature we calculated does not match the signature you provided. Check your key and signing method.
How can I get a public URL asset like below but serve the asset via Cloudfront?
First Way (Easy)
Just use the aws_cf_signer gem. Add it to your Gemfile.
With this you can do something like
def cdn_url(options = {})
  style = options[:style] || 'original'
  cloudfront_domain = options[:cloudfront_domain] || 'example.cloudfront.net'
  cloudfront_pem_key_path = options[:cloudfront_pem_key_path]
  # The option key must match the one passed by the caller
  cloudfront_key_paid_id = options[:cloudfront_key_paid_id]
  path = attachment.path(style) # path of the file
  # You can get these values from your AWS account, most probably by going to
  # https://console.aws.amazon.com/iam/home?#security_credential
  signer = AwsCfSigner.new(cloudfront_pem_key_path, cloudfront_key_paid_id)
  # This configuration may vary.
  # Visit https://github.com/dylanvaughn/aws_cf_signer
  # and check all available settings/options.
  url = signer.sign(path, :ending => Time.now + 3600)
  cloudfront_domain + url
end
With this you can access the url with something like this
cdn_url(cloudfront_pem_key_path: '/users/downloads/pri.pem' , cloudfront_key_paid_id: '33243424XXX')
Second way
# A simple function to return a signed, expiring url for Amazon Cloudfront.
# This will require openssl, digest/sha1, base64 and maybe other libraries.
require 'openssl'
require 'base64'

module CloudFront
  # Defined on the module itself so it can be called as CloudFront.get_signed_expiring_url
  def self.get_signed_expiring_url(domain, path, expires_in, private_key_filename, key_pair_id)
    # AWS works on UTC, so make sure you are not using local time
    expires = (Time.now.getutc + expires_in).to_i.to_s
    private_key = OpenSSL::PKey::RSA.new(File.read(private_key_filename))
    # path should be your S3 path without a leading slash and without a file extension.
    # e.g. files/private/52
    policy = %Q[{"Statement":[{"Resource":"#{path}","Condition":{"DateLessThan":{"AWS:EpochTime":#{expires}}}}]}]
    signature = Base64.strict_encode64(private_key.sign(OpenSSL::Digest::SHA1.new, policy))
    # I'm not sure exactly why this is required, but it's in Amazon's perl script and seems necessary
    # Different base64 implementations maybe?
    signature.tr!("+=/", "-_~")
    "#{domain}#{path}?Expires=#{expires}&Signature=#{signature}&Key-Pair-Id=#{key_pair_id}"
  end
end
With this you can do something like
def cdn_url(style = 'original', cloudfront_pem_key_path, key_pair_id)
  path = attachment.path(style) # path of the file
  # You can get these values from your AWS account
  CloudFront.get_signed_expiring_url('example.cloudfront.net', path, 45.seconds, cloudfront_pem_key_path, key_pair_id)
end
Give it a try; it may work. Be sure to set the bucket access policy properly. Check this out if you are seeing an AccessDenied error: http://www.jppinto.com/2011/12/access-denied-to-file-amazon-s3-bucket/
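The two CloudFront-specific pieces of the second approach are the policy string that gets signed and the URL-safe substitution applied to the Base64 signature. The sketch below isolates just those two steps and leaves out the RSA signing itself. The helper names are hypothetical, and it assumes the policy resource is the full URL being requested, which is an assumption rather than something the answer above states:

```ruby
# Hypothetical helper: build the compact policy JSON that is signed.
# It must contain no extra whitespace, and expires_at is converted to
# an epoch timestamp in seconds.
def canned_policy(resource_url, expires_at)
  %({"Statement":[{"Resource":"#{resource_url}","Condition":{"DateLessThan":{"AWS:EpochTime":#{expires_at.to_i}}}}]})
end

# Hypothetical helper: Base64-encode without line breaks, then replace
# the characters that are not URL-safe, as in the answer above:
# "+" becomes "-", "=" becomes "_", "/" becomes "~".
def cloudfront_safe_base64(bytes)
  [bytes].pack("m0").tr("+=/", "-_~")
end
```

The substitution answers the "Different base64 implementations maybe?" comment: standard Base64 uses +, =, and /, which have special meaning inside a query string, so they are swapped for characters that do not.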
Use the aws sdk gem.
See the API Documentation
Details about generating a presigned URL for an operation on the object
Provide the access key ID and secret access key:
S3 = AWS::S3.new(
:access_key_id => 'access_key_id',
:secret_access_key => 'secret_access_key')
In the controller, put these lines:
bucket = S3.buckets['bucket_name']
s3_obj = bucket.objects["Path-to-file"]
return s3_obj.url_for(:read, :expires => 60*60).to_s
This link will expire in 1 hour. After that, the link will no longer be accessible.

Using Send_File to a Remote Source (Ruby on Rails)

In my app, I have a requirement that is stumping me.
I have a file stored in S3, and when a user clicks on a link in my app, I log in the DB they've clicked the link, decrease their 'download credit' allowance by one and then I want to prompt the file for download.
I don't simply want to redirect the user to the file because it's stored in S3 and I don't want them to have the link of the source file (so that I can maintain integrity and access)
It looks like send_file() wont work with a remote source file, anyone recommend a gem or suitable code which will do this?
You would need to stream the file content to the user while reading it from the S3 bucket/object.
If you use the AWS::S3 library something like this may work:
send_file_headers!( :length=>S3Object.about(<s3 object>, <s3 bucket>)["content-length"], :filename=><the filename> )
render :status => 200, :text => Proc.new { |response, output|
S3Object.stream(<s3 object>, <s3 bucket>) do |chunk|
output.write chunk
end
}
This code is mostly copied from the send_file code, which by itself works only for local files or file-like objects.
N.B. I would anyhow advise against serving the file from the rails process itself. If possible/acceptable for your use case I'd use an authenticated GET to serve the private data from the bucket.
Using an authenticated GET you can keep the bucket and its objects private, while allowing temporary permission to read a specific object content by crafting a URL that includes an authentication signature token. The user is simply redirected to the authenticated URL, and the token can be made valid for just a few minutes.
Using the above mentioned AWS::S3 you can obtain an authenticated GET url in this way:
time_of_expiry = Time.now + 2.minutes
S3Object.url_for(<s3 object>, <s3 bucket>,
:expires => time_of_expiry)
Full image download method using temp file (tested rails 3.2):
def download
  @image = Image.find(params[:image_id])
  # Read the remote image and copy it into a local temp file
  open(@image.url) { |img|
    tmpfile = Tempfile.new("download.jpg")
    File.open(tmpfile.path, 'wb') do |f|
      f.write img.read
    end
    send_file tmpfile.path, :filename => "great-image.jpg"
  }
end
You can read the file from S3 and write it locally to a non-public directory, then use X-Sendfile (apache) or X-Accel-Redirect (nginx) to serve the content.
For nginx you would include something like the following in your config:
location /private {
internal;
alias /path/to/private/directory/;
}
Then in your rails controller, you do the following:
response.headers['Content-Type'] = your_content_type
response.headers['Content-Disposition'] = "attachment; filename=#{your_file_name}"
response.headers['Cache-Control'] = "private"
response.headers['X-Accel-Redirect'] = path_to_your_file
render :nothing=>true
A good writeup of the process is here
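The header set used in the controller above can be built by a small helper, which keeps the controller action short and makes the values easy to check. A minimal sketch: accel_redirect_headers is a hypothetical name, the default content type is an assumption, and the filename is quoted here (unlike in the snippet above) so that names containing spaces survive; it does not escape quotes inside the filename:

```ruby
# Hypothetical helper: build the response headers that hand the download
# off to nginx. internal_path must fall under an nginx location marked
# "internal", such as /private in the config above.
def accel_redirect_headers(internal_path:, filename:, content_type: "application/octet-stream")
  {
    "Content-Type" => content_type,
    "Content-Disposition" => %(attachment; filename="#{filename}"),
    "Cache-Control" => "private",
    "X-Accel-Redirect" => internal_path
  }
end
```

In a controller, each pair would be copied into response.headers before rendering an empty body.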
