I am using the AWS SDK for Ruby to retrieve an object from a bucket and then read it. My code is something like:
def import_from_s3
  # Initiate the client
  s3 = Aws::S3::Client.new({
    region: region,
    access_key_id: key_id,
    secret_access_key: secret
  })
  # Get the object
  resp = s3.get_object(bucket: bucket, key: key)
end
My question is how do I test this method without mocking it?
Here is the documentation on how to go about it.
Stubbing the aws client response
I used the default stub and it worked just fine.
Aws.config[:s3] = {stub_responses: {get_object: {body: StringIO.new("XYZ")}}}
You don't need to test #get_object, and you shouldn't even try. It is not implemented by your code, so you should assume it has been tested and works. As for your method #import_from_s3, you have two options: either don't test it, since it is just a thin wrapper around #get_object, or make assertions/expectations on its return value.
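As a rough illustration of the second option, here is a minimal sketch that asserts on the return value by passing the client in as an argument. The `client:` parameter, `FakeS3Client`, and `FakeResponse` are hypothetical names introduced for this example and are not part of the AWS SDK. This is a hand-rolled fake rather than the SDK's own `stub_responses` option mentioned earlier, which is the more direct route when the gem is loaded.

```ruby
require 'stringio'

# Hypothetical stand-ins for the SDK response and client (not AWS classes).
FakeResponse = Struct.new(:body)

class FakeS3Client
  def initialize(content)
    @content = content
  end

  # Mirrors the keyword arguments used in the question's get_object call.
  def get_object(bucket:, key:)
    FakeResponse.new(StringIO.new(@content))
  end
end

# Assumed variant of import_from_s3 that receives its client instead of
# building one, so a test can substitute a fake.
def import_from_s3(client:, bucket:, key:)
  resp = client.get_object(bucket: bucket, key: key)
  resp.body.read
end

content = import_from_s3(client: FakeS3Client.new("XYZ"), bucket: "some-bucket", key: "data.txt")
puts content
```

In a real test suite the last two lines would become an assertion that the returned string equals the stubbed body.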
Related
I'm getting the following error when trying to upload a file to an S3 bucket:
AWS::S3::Errors::InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records.
The file exists, the bucket exists, the bucket allows uploads, the credentials are correct, and using CyberDuck with the same credentials I can connect and upload files to that bucket just fine. Most answers around here point to the credentials being overridden by environment variables, but that is not the case here: I've tried passing them directly as strings, and printing them to make sure they are the right credentials.
v1
AWS.config(
  :access_key_id => 'key',
  :secret_access_key => 'secret'
)
s3 = AWS::S3.new
bucket = AWS::S3.new.buckets['bucket-name']
obj = bucket.objects['filename']
obj.write(file: 'path-to-file', acl:'private')
This is using v1 of the gem (aws-sdk-v1), but I've also tried v3 and I get the same error.
v3
Aws.config.update({
  region: 'eu-west-1',
  credentials: Aws::Credentials.new('key_id', 'secret')
})
s3 = Aws::S3::Resource.new(region: 'eu-west-1')
bucket = s3.bucket('bucket-name')
obj = bucket.object('filename')
ok = obj.upload_file('path-to-file')
Note: the error is thrown on the obj.write line.
Note 2: This is a rake task from a Ruby on Rails 4 app.
Finally figured it out. The problem was that, because we are using a custom endpoint, the credentials were not found (I guess that works differently with custom endpoints).
To specify the custom endpoint you need a config option that, for some reason, is not documented (or at least I couldn't find it anywhere). I actually had to go through paperclip's code to see how they handle this.
Anyway, here's what the v1 config looks like with the endpoint added:
AWS.config(
  :access_key_id => 'key',
  :secret_access_key => 'secret',
  :s3_endpoint => 'custom.endpoint.com'
)
Hopefully that will save somebody some time.
I am using the aws-sdk-s3 gem in my Rails project.
Create an S3 resource
s3 = Aws::S3::Resource.new(access_key_id: #####,
secret_access_key: #####,
region: 'us-east-1')
Create an S3 object
path = 'sample'
key = 'test.csv'
obj = s3.bucket(bucket_name).object("#{path}" + key)
Store CSV in S3
obj.put(body: csv_response, content_type: 'text/csv')
How can I verify that the put method stored the CSV in S3 without any issues?
Is there a status code available for the put method that I can check?
Two ways to go about it:
Store the result. It should be a PutObjectOutput type object; see the official method documentation for the put request. If S3 rejects the request, the SDK raises an error (a subclass of Aws::S3::Errors::ServiceError) rather than returning.
The second way is to make an exists? call right after your put request has completed. Something like this:
s3 = Aws::S3::Resource.new(region: 'ap-southeast-1') # change to the region you use
obj = s3.bucket('bucket-name').object("path/to/object/in/bucket")
if obj.exists?
  # Object was uploaded successfully!
else
  # No it wasn't!
end
Hope that helps!
One approach I've seen other people use is to calculate an MD5 hash of the original file before upload and then compare it with the etag value in the response from obj.put.
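A minimal sketch of that comparison, using only Ruby's standard library. The helper name `etag_matches_md5?` and the sample values are made up for illustration; in real code the ETag would come from the `etag` field of the `obj.put` response. Note the assumption baked in: the ETag equals the MD5 digest only for single-part uploads that are not encrypted with SSE-KMS or SSE-C, so a mismatch on a multipart upload does not by itself mean the upload failed.

```ruby
require 'digest'

# Hypothetical helper: compares the MD5 hex digest of the uploaded body
# with the ETag S3 returned. S3 wraps the ETag in double quotes, so strip them.
def etag_matches_md5?(body, etag)
  Digest::MD5.hexdigest(body) == etag.delete('"')
end

csv_body = "id,name\n1,alice\n"
# Stand-in for the ETag in the put response; computed locally here so the
# sketch runs without network access.
fake_etag = "\"#{Digest::MD5.hexdigest(csv_body)}\""

puts etag_matches_md5?(csv_body, fake_etag)        # true
puts etag_matches_md5?(csv_body + "x", fake_etag)  # false
```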
Rails version 5.2
I have a scenario where I need the public URL of an Active Storage file stored on Amazon S3, so that a Sidekiq background job can build a zip file.
I am having difficulty getting the actual file URL. I have tried rails_blob_url, but it gives me the following:
http://localhost:3000/rails/active_storage/blobs/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBZUk9IiwiZXhwIjpudWxsLCJwdXIiOiJibG9iX2lkIn19--9598613be650942d1ee4382a44dad679a80d2d3b/sample.pdf
How do I access the real file URL through Sidekiq?
storage.yml
test:
  service: Disk
  root: <%= Rails.root.join("tmp/storage") %>
local:
  service: Disk
  root: <%= Rails.root.join("storage") %>
development:
  service: S3
  access_key_id: 'xxxxx'
  secret_access_key: 'xxxxx'
  region: 'xxxxx'
  bucket: 'xxxxx'
development.rb
config.active_storage.service = :development
I can access these fine through the web interface, but not from within Sidekiq.
Use ActiveStorage::Blob#service_url. For example, assuming a Post model with a single attached header_image:
@post.header_image.service_url
Update: Rails 6.1
Since Rails 6.1 ActiveStorage::Blob#service_url is deprecated in favor of ActiveStorage::Blob#url.
So, now
@post.header_image.url
is the way to go.
Sources:
Link to the corresponding PR.
Link to source.
If you need all your files to be public, then you must make your uploads public:
In file config/storage.yml
amazon:
  service: S3
  access_key_id: zzz
  secret_access_key: zzz
  region: zzz
  bucket: zzz
  upload:
    acl: "public-read"
In the code
attachment = ActiveStorage::Attachment.find(90)
attachment.blob.service_url # returns large URI
attachment.blob.service_url.sub(/\?.*/, '') # remove query params
It will return something like:
"https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa"
It is publicly readable because of the config above.
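The regular expression above does the job. As an alternative, here is a small sketch that removes the query string with Ruby's standard `URI` library instead. The helper name `strip_query` and the sample URL (including its query parameters) are illustrative assumptions, not output from a real bucket.

```ruby
require 'uri'

# Hypothetical helper: returns the URL without its query string or fragment.
def strip_query(url)
  uri = URI.parse(url)
  uri.query = nil
  uri.fragment = nil
  uri.to_s
end

signed = "https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa?X-Amz-Expires=300&X-Amz-Signature=abc"
puts strip_query(signed)
# prints https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa
```

Parsing the URL first means a malformed string raises `URI::InvalidURIError` instead of being passed along silently, which may or may not be what you want in a background job.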
My use case was to upload images to S3 which would have public access for ALL images in the bucket so a job could pick them up later, regardless of request origin or URL expiry. This is how I did it. (Rails 5.2.2)
First, the default for a new S3 bucket is to keep everything private, so there are 2 steps to override that.
Add a wildcard bucket policy. In AWS S3 >> your bucket >> Permissions >> Bucket Policy
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
In your bucket >> Permissions >> Public Access Settings, be sure "Block public and cross-account access if bucket has public policies" is set to false.
Now you can access anything in your S3 bucket with just the blob.key in the url. No more need for tokens with expiry.
Second, to generate that URL you can use the solution by @Christian_Butzke: @post.header_image.service.send(:object_for, @post.header_image.key).public_url
However, know that object_for is a private method on service, and calling it with public_send would give you an error. So another alternative is to use service_url, per @George_Claghorn, and just remove any params with url&.split("?")&.first. As noted, this may fail on localhost with a host missing error.
Here is my solution for an uploadable "logo" stored on S3 and made public by default:
# /models/company.rb
has_one_attached :logo

def public_logo_url
  if self.logo&.attachment
    if Rails.env.development?
      self.logo_url = Rails.application.routes.url_helpers.rails_blob_url(self.logo, only_path: true)
    else
      self.logo_url = self.logo&.service_url&.split("?")&.first
    end
  end
  # set a default lazily
  self.logo_url ||= ActionController::Base.helpers.asset_path("default_company_icon.png")
end
Enjoy ^_^
I had a few problems getting this working. Thought I'd document them for posterity.
In Rails 6.0 use @post.header_image.service_url
In Rails >= 6.1 use @post.header_image.url as @GeorgeClaghorn recommends.
I got this error:
error: uninitialized constant Analyzable
It's a weird bug in rails 6.0, which is fixed by placing this in config/application.rb
config.autoloader = :classic
I then see this error:
URI::InvalidURIError (bad URI(is not URI?): nil) Active Storage service_url
Fix it by simply adding this to your application_controller.rb
include ActiveStorage::SetCurrent
Now something like @post.image.blob.service_url will work as you expect =)
Using the service_url method combined with stripping the params to get a public URL was a good idea, thanks @genkilabs and @Aivils_Štoss!
There is, however, a potential scaling issue if you use this method on a large number of files, e.g. if you are showing a list of records that have files attached. For each call to service_url you will see something like this in your logs:
DEBUG -- : [8df9220c-e8c9-45b7-a1ee-b746e623ca1b] S3 Storage (1.4ms) Generated URL for file at key: ...
You can't eager load these calls either, so you can potentially have a large number of calls to S3 Storage to generate those URLs for each record you are showing.
I worked around it by creating a Presenter like this:
class FilePresenter < SimpleDelegator
  def initialize(obj)
    super
  end

  def public_url
    return dev_url if Rails.env.development? || Rails.env.test? || asset_host.nil?

    "#{asset_host}/#{key}"
  end

  private

  def dev_url
    Rails.application.routes.url_helpers.rails_blob_url(self, only_path: true)
  end

  # Memoize the bucket host read from the environment
  def asset_host
    @asset_host ||= ENV['ASSET_HOST']
  end
end
Then I set an ENV variable ASSET_HOST with this:
https://<your_app_bucket>.s3.<your_region>.amazonaws.com
Then when I display the image or just the file link, I do this:
<%= link_to(image_tag(company.display_logo),
            FilePresenter.new(company.logo).public_url, target: "_blank", rel: "noopener") %>
<a href="<%= FilePresenter.new(my_record.file).public_url %>"
   target="_blank" rel="noopener"><%= my_record.file.filename %></a>
Note, you still need to use display_logo for images so that it will access the variant if you are using them.
Also, this is all based on setting my AWS bucket public as per @genkilabs' step 2 above, and adding the upload: acl: "public-read" setting to my config/storage.yml as per @Aivils_Štoss's suggestion.
If anyone sees any issues or pitfalls with this approach, please let me know! This seemed to work great for me in allowing me to display a public URL but not needing to hit the S3 Storage for each record to generate that URL.
Also see public access in rails active storage. This was introduced in Rails 6.1.
Specify public: true in your app's config/storage.yml. Public services will always return a permanent URL.
If you are using MinIO or AWS S3, this gets the attachment URL on the server:
@post.header_image&.service_url&.split("?")&.first
A bit late, but you can get the public URL also like this (assuming a Post model with a single attached header_image as in the example above):
@post.header_image.service.send(:object_for, @post.header_image.key).public_url
Update 2020-04-06
You need to make sure that the document is saved with a public ACL (e.g. by setting the default to public).
rails_blob_url is also usable. Requests will be served by Rails; however, those requests will probably be quite slow, since a private URL needs to be generated on each request.
(FYI: outside the controller you can also generate that URL like this: Rails.application.routes.url_helpers.rails_blob_url(@post.header_image, only_path: true))
I am trying to generate a pre-signed url on my Rails server to send to the browser, so that the browser can upload to S3.
It seems like aws-sdk-s3 is the gem to use going forward. But unfortunately, I haven't come across documentation for the gem that would provide clarity. There seem to be a few different ways of doing so, and would appreciate any guidance on the difference in the following methods -
Using Aws::S3::Presigner.new (https://github.com/aws/aws-sdk-ruby/blob/master/aws-sdk-core/lib/aws-sdk-core/s3/presigner.rb) but it doesn't seem to take in an object parameter or auth credentials.
Using Aws::S3::Resource.new, but it seems like aws-sdk-resources is not going to be maintained. (https://aws.amazon.com/blogs/developer/upgrading-from-version-2-to-version-3-of-the-aws-sdk-for-ruby-2/)
Using Aws::S3::Object.new and then calling the put method on that object.
Using AWS::SigV4 directly.
I am wondering how they differ, and the implications of choosing one over the other? Any recommendations are much appreciated, especially with aws-sdk-s3.
Thank you!
So, thanks to the tips by @strognjz above, here is what worked for me using 'aws-sdk-s3'.
require 'aws-sdk-s3'
require 'securerandom'

# credentials below are for the IAM user I am using
s3 = Aws::S3::Client.new(
  region: 'us-west-2', # or any other region
  access_key_id: AWS_ACCESS_KEY_ID,
  secret_access_key: AWS_SECRET_ACCESS_KEY
)
signer = Aws::S3::Presigner.new(client: s3)
url = signer.presigned_url(
  :put_object,
  bucket: S3_BUCKET_NAME,
  key: "${filename}-#{SecureRandom.uuid}"
)
This will work using the aws-sdk-s3 gem
aws_client = Aws::S3::Client.new(
  region: 'us-west-2', # or any other region
  access_key_id: AWS_ACCESS_KEY_ID,
  secret_access_key: AWS_SECRET_ACCESS_KEY
)
s3 = Aws::S3::Resource.new(client: aws_client)
bucket = s3.bucket('bucket-name')
obj = bucket.object("${filename}-#{SecureRandom.uuid}")
url = obj.presigned_url(:put)
Additional HTTP verbs:
obj.presigned_url(:put)
obj.presigned_url(:head)
obj.presigned_url(:delete)
Hi, I'm using the code below to get the size of a bucket. I researched all over, but the only way I found was to loop through each file. While looping, some buckets turn out to have been created in a different region, and I end up with the following error:
AWS::S3::PermanentRedirect: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. from /home//.rvm/gems/ruby-1.9.2-p180/gems/aws-s3-0.6.2/lib/aws/s3/error.rb:38:in `raise'
The endpoint is us-west-1.
I need help fixing this issue, and I would also like to know how to switch my code dynamically to the region each bucket belongs to. I would also appreciate suggestions on adding exception handling in case of failure. My code is below.
Please feel free to comment.
def get_bucket
  s3 = AWS::S3::Base.establish_connection!(:access_key_id => @config[:ACCESS_KEY_ID], :secret_access_key => @config[:SECRET_ACCESS_KEY])
  if !s3.nil?
    AWS::S3::Service.buckets.each do |bucket|
      puts bucket.inspect
      if !bucket.nil?
        size = 0
        # I'm hard-coding the bucket names below so the code does not fail
        if ![
          'cf-templates-m01ixtvp0jr0-us-west-1',
          'cf-templates-m01ixtvp0jr0-us-west-2',
          'elasticbeanstalk-us-west-1-767904627276',
          'elasticbeanstalk-us-west-1-akiai7bucgnrthi66w6a',
          'medidata-rave-cdn'
        ].include? bucket.name
          bucket_size = AWS::S3::Bucket.find(bucket.name)
          if !bucket_size.nil?
            bucket_size.each do |obj|
              if !obj.nil?
                size += obj.size.to_i
              end
            end
          end
        end
        load_bucket(bucket.name,bucket.creation_date,size,@config[:ACCOUNT_NAME])
      end
    end
  end
end
The problem is that buckets can exist in different regions, and while you can list all buckets from the same connection (unlike other AWS entities that are locked to the location they were created in), other operations on buckets require you to log in to the specific "endpoint" (region) to which they are constrained.
My solution is to check where the bucket is located and then re-login to that region:
s3 = AWS::S3.new(@awscreds)
if s3.buckets[bucket].location_constraint != @awscreds[:region] then
  # need to re-login, otherwise the S3 upload will fail
  s3 = AWS::S3.new(@awscreds.merge(region: s3.buckets[bucket].location_constraint))
end
I don't understand how you're building the URL to access your bucket.
If it's in US-Standard, you can say http://s3.amazonaws.com/BUCKETNAME/path/to/file. If it's anywhere else, that doesn't work, and you use http://BUCKETNAME.s3.amazonaws.com/path/to/file instead. Not coincidentally, bucket names are limited to domain-allowed characters (lowercase and numbers only).
This article may be of help: http://docs.aws.amazon.com/general/latest/gr/rande.html
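The two URL forms described in this answer can be sketched as follows. The function names `path_style_url` and `virtual_hosted_url` and the sample bucket and key are hypothetical, chosen only to illustrate the two layouts; this does not reflect how any SDK builds its requests.

```ruby
# Path-style: the bucket is the first path segment (the US-Standard form).
def path_style_url(bucket, key)
  "http://s3.amazonaws.com/#{bucket}/#{key}"
end

# Virtual-hosted-style: the bucket is part of the host name, which is why
# the bucket name has to be made of domain-allowed characters.
def virtual_hosted_url(bucket, key)
  "http://#{bucket}.s3.amazonaws.com/#{key}"
end

puts path_style_url("mybucket1", "path/to/file")
puts virtual_hosted_url("mybucket1", "path/to/file")
```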