Delete objects from Amazon S3 via Rails with the AWS SDK v2 - ruby-on-rails

I'm using the s3_direct_upload gem to store images and videos on Amazon s3. When the image or video is changed or deleted, I want to nuke the old image or video on s3 and save everyone money and space.
This solution uses the V1 Aws SDK and is no longer valid:
http://blog.littleblimp.com/post/53942611764/direct-uploads-to-s3-with-rails-paperclip-and
This solution deletes files that were initially uploaded in a batch, but does nothing for the final files post-processing:
github - waynehoover/s3_direct_upload
Here is the Aws v2 SDK doc, which seems clear enough:
http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#delete_object-instance_method
Yet this solution:
class Image < ActiveRecord::Base
  validates :name, :s3_image_file_url, :s3_image_file_path, :s3_image_file_key, presence: true

  before_destroy :delete_s3_files

  private

  def delete_s3_files
    s3 = Aws::S3::Client.new
    response = s3.delete_object(
      bucket: Rails.configuration.aws[:bucket],
      key: self.s3_image_file_key
    )
  end
end
...returns only:
=> #<struct delete_marker=nil, version_id=nil>
(And the file is still available on s3 at the original url.)
Thoughts? Hasn't everyone had to do this?

I've written code using the v2 SDK to delete objects from S3. Here is a sample from my codebase:
def delete_avatar_from_s3
  Aws::S3::Client.new.delete_object(
    bucket: "user-avatars",
    key: "#{@id}"
  )
  return true
rescue => e
  Rails.logger.error "Error deleting avatar of user #{@id}. Failure with S3 call. Details: #{e}; #{e.backtrace}"
  return false
end
It looks similar to yours, so I don't think that this code is the issue. Have you confirmed your bucket & key names and ensured that your method is actually being called?
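If it helps, a quick way to confirm both points is to check the object with the resource API before and after the destroy; image below stands for an instance of the Image model from the question, and the bucket and key come from the question's own config:
obj = Aws::S3::Resource.new.bucket(Rails.configuration.aws[:bucket]).object(image.s3_image_file_key)
obj.exists?   # => should be true if the stored key matches a real object in that bucket
image.destroy # triggers the before_destroy callback
obj.exists?   # => should be false once delete_object actually removed it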

AWS SDK for Ruby V2
To delete one or more files, pass an objects array containing their keys:
def delete_from_s3
  S3_BUCKET.delete_objects({
    delete: {
      objects: [
        { key: key }
      ]
    }
  })
end
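Note that S3_BUCKET here is assumed to be an Aws::S3::Bucket resource (which exposes delete_objects). A minimal setup might live in an initializer like this, with the region and bucket name as placeholders:
# config/initializers/s3.rb
S3_BUCKET = Aws::S3::Resource.new(region: ENV['AWS_REGION']).bucket(ENV['S3_BUCKET_NAME'])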

Related

How to verify AWS S3 put response in Rails

I am using the aws-sdk-s3 gem in my Rails project.
Create an S3 resource
s3 = Aws::S3::Resource.new(access_key_id: #####,
                           secret_access_key: #####,
                           region: 'us-east-1')
Create an S3 object
path = 'sample'
key = 'test.csv'
obj = s3.bucket("#{bucket_name}").object("#{path}" + key)
Store CSV in S3
obj.put(body: csv_response, content_type: 'text/csv')
How to verify that put method stored the csv in S3 without any issues?
Is there any status code available for put method in S3 to verify?
Two ways to go about it:
Store the result. It should be a PutObjectOutput object; you can check the official method documentation for the put request method.
The second way is to make an exists? call right after your put request has completed. Something like this:
s3 = Aws::S3::Resource.new(region: 'ap-southeast-1') # change to the region you use
obj = s3.bucket('bucket-name').object("path/to/object/in/bucket")
if obj.exists?
  # Object was uploaded successfully!
else
  # No it wasn't!
end
Hope that helps!
One way I've seen other people do it is to calculate an MD5 hash of the original file before upload and then match it against the ETag value in the response from obj.put.
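For what it's worth, a minimal sketch of that approach, assuming a single-part PUT without SSE-KMS (where the ETag equals the MD5 of the body) and reusing obj from above:
require 'digest'

body = File.read('test.csv')
resp = obj.put(body: body, content_type: 'text/csv') # resp is a PutObjectOutput

local_md5 = Digest::MD5.hexdigest(body)
if resp.etag.delete('"') == local_md5 # the ETag comes back wrapped in quotes
  # upload verified
end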

How to retrieve attachment url with Rails Active Storage with S3

Rails version 5.2
I have a scenario where I need to access the public URL of a Rails Active Storage attachment (with Amazon S3 storage) to make a zip file in a Sidekiq background job.
I am having difficulty getting the actual file URL. I have tried rails_blob_url, but it gives me the following:
http://localhost:3000/rails/active_storage/blobs/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBZUk9IiwiZXhwIjpudWxsLCJwdXIiOiJibG9iX2lkIn19--9598613be650942d1ee4382a44dad679a80d2d3b/sample.pdf
How do I access the real file URL through Sidekiq?
storage.yml
test:
  service: Disk
  root: <%= Rails.root.join("tmp/storage") %>

local:
  service: Disk
  root: <%= Rails.root.join("storage") %>

development:
  service: S3
  access_key_id: 'xxxxx'
  secret_access_key: 'xxxxx'
  region: 'xxxxx'
  bucket: 'xxxxx'
development.rb
config.active_storage.service = :development
I can access these fine in the web interface, but not within Sidekiq.
Use ActiveStorage::Blob#service_url. For example, assuming a Post model with a single attached header_image:
@post.header_image.service_url
Update: Rails 6.1
Since Rails 6.1, ActiveStorage::Blob#service_url is deprecated in favor of ActiveStorage::Blob#url.
So, now
@post.header_image.url
is the way to go.
Sources:
Link to the corresponding PR.
Link to source.
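Since the question mentions a Sidekiq background job, here is a rough sketch of using that URL inside a worker (the ZipAttachmentsJob name and the Post/header_image model are only illustrative):
class ZipAttachmentsJob
  include Sidekiq::Worker

  def perform(post_id)
    post = Post.find(post_id)
    url = post.header_image.url # use service_url on Rails < 6.1
    # download the file from url and add it to the zip archive here
  end
end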
If you need all your files to be public, then you must make your uploads public:
In file config/storage.yml
amazon:
  service: S3
  access_key_id: zzz
  secret_access_key: zzz
  region: zzz
  bucket: zzz
  upload:
    acl: "public-read"
In the code
attachment = ActiveStorage::Attachment.find(90)
attachment.blob.service_url # returns large URI
attachment.blob.service_url.sub(/\?.*/, '') # remove query params
It will return something like:
"https://foo.s3.amazonaws.com/bar/buz/2yoQMbt4NvY3gXb5x1YcHpRa"
It is public readable because of the config above.
My use case was to upload images to S3 which would have public access for ALL images in the bucket so a job could pick them up later, regardless of request origin or URL expiry. This is how I did it. (Rails 5.2.2)
First, the default for a new S3 bucket is to keep everything private, so there are two steps to defeat that.
Add a wildcard bucket policy. In AWS S3 >> your bucket >> Permissions >> Bucket Policy
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
In your bucket >> Permissions >> Public Access Settings, be sure Block public and cross-account access if bucket has public policies is set to false
Now you can access anything in your S3 bucket with just the blob.key in the url. No more need for tokens with expiry.
Second, to generate that URL you can either use the solution by @Christian_Butzke: @post.header_image.service.send(:object_for, @post.header_image.key).public_url
However, know that object_for is a private method on the service, and calling it with public_send would give you an error. So another alternative is to use the service_url per @George_Claghorn and just strip any params with url&.split("?")&.first. As noted, this may fail on localhost with a host-missing error.
Here is my solution for an uploadable "logo" stored on S3 and made public by default:
# /models/company.rb
has_one_attached :logo

def public_logo_url
  if self.logo&.attachment
    if Rails.env.development?
      self.logo_url = Rails.application.routes.url_helpers.rails_blob_url(self.logo, only_path: true)
    else
      self.logo_url = self.logo&.service_url&.split("?")&.first
    end
  end
  # set a default lazily
  self.logo_url ||= ActionController::Base.helpers.asset_path("default_company_icon.png")
end
Enjoy ^_^
I had a few problems getting this working. Thought I'd document them for posterity.
In Rails 6.0, use @post.header_image.service_url
In Rails >= 6.1, use @post.header_image.url, as @GeorgeClaghorn recommends.
I got this error:
error: uninitialized constant Analyzable
It's a weird bug in rails 6.0, which is fixed by placing this in config/application.rb
config.autoloader = :classic
I then see this error:
URI::InvalidURIError (bad URI(is not URI?): nil) Active Storage service_url
Fix it by simply adding this to your application_controller.rb
include ActiveStorage::SetCurrent
Now something like @post.image.blob.service_url will work as you expect =)
Using the service_url method combined with stripping the params to get a public URL was a good idea, thanks @genkilabs and @Aivils_Štoss!
There is, however, a potential scaling issue if you use this method on a large number of files, e.g. if you are showing a list of records that have files attached. For each call to service_url you will see something like this in your logs:
DEBUG -- : [8df9220c-e8c9-45b7-a1ee-b746e623ca1b] S3 Storage (1.4ms) Generated URL for file at key: ...
You can't eager load these calls either, so you can potentially have a large number of calls to S3 Storage to generate those URLs for each record you are showing.
I worked around it by creating a Presenter like this:
class FilePresenter < SimpleDelegator
  def initialize(obj)
    super
  end

  def public_url
    return dev_url if Rails.env.development? || Rails.env.test? || asset_host.nil?
    "#{asset_host}/#{key}"
  end

  private

  def dev_url
    Rails.application.routes.url_helpers.rails_blob_url(self, only_path: true)
  end

  def asset_host
    @asset_host ||= ENV['ASSET_HOST']
  end
end
Then I set an ENV variable ASSET_HOST with this:
https://<your_app_bucket>.s3.<your_region>.amazonaws.com
Then when I display the image or just the file link, I do this:
<%= link_to(image_tag(company.display_logo),
            FilePresenter.new(company.logo).public_url,
            target: "_blank", rel: "noopener") %>

<a href="<%= FilePresenter.new(my_record.file).public_url %>"
   target="_blank" rel="noopener"><%= my_record.file.filename %></a>
Note, you still need to use display_logo for images so that it will access the variant if you are using them.
Also, this is all based on setting my AWS bucket public as per @genkilabs' step 2 above, and adding the upload: acl: "public-read" setting to my config/storage.yml as per @Aivils_Štoss's suggestion.
If anyone sees any issues or pitfalls with this approach, please let me know! This seemed to work great for me in allowing me to display a public URL but not needing to hit the S3 Storage for each record to generate that URL.
Also see public access in Rails Active Storage, which was introduced in Rails 6.1.
Specify public: true in your app's config/storage.yml. Public services will always return a permanent URL.
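For example, a config/storage.yml entry with public access enabled might look like this (the credential lookups, region, and bucket name are placeholders):
amazon:
  service: S3
  access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
  region: us-east-1
  bucket: your-bucket
  public: true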
Just write this if you are using MinIO or AWS S3 to get the attachment URL on the server:
@post.header_image&.service_url&.split("?")&.first
A bit late, but you can get the public URL also like this (assuming a Post model with a single attached header_image as in the example above):
@post.header_image.service.send(:object_for, @post.header_image.key).public_url
Update 2020-04-06
You need to make sure that the document is saved with a public ACL (e.g. by setting the default to public).
rails_blob_url is also usable. Requests will be served by Rails; however, those requests will probably be quite slow, since a private URL needs to be generated on each request.
(FYI: outside the controller you can generate that URL like this: Rails.application.routes.url_helpers.rails_blob_url(@post, only_path: true))

Uploading a file to AWS S3 with ACL set to public_read

In my Rails app I save customer RMA shipping labels to an S3 bucket on creation. I just updated to V2 of the aws-sdk gem, and now my code for setting the ACL doesn't work.
Code that worked in V1.X:
# Saves label to S3 bucket
s3 = AWS::S3.new
obj = s3.buckets[ENV['S3_BUCKET_NAME']].objects["#{shippinglabel_filename}"]
obj.write(open(label.label('pdf').postage_label.label_pdf_url, 'rb'), :acl => :public_read)
.write seems to have been deprecated, so I'm using .put now. Everything is working, except when I try to set the ACL.
New code for V2.0:
# Saves label to S3 bucket
s3 = Aws::S3::Resource.new
obj = s3.bucket(ENV['S3_BUCKET_NAME']).object("#{shippinglabel_filename}")
obj.put(Base64.decode64(label_base64), { :acl => :public_read })
I get an Aws::S3::Errors::InvalidArgument error, pointed at the ACL.
This code works for me:
photo_obj = bucket.object object_name
photo_obj.upload_file path, {acl: 'public-read'}
So you need to use the string 'public-read' for the ACL. I found this by looking at an example in object.rb.
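Presumably the same string fix applies to the original put call as well; note that in the v2 SDK put takes keyword options only, so the body also needs to be named. A sketch reusing the question's variables:
obj = s3.bucket(ENV['S3_BUCKET_NAME']).object("#{shippinglabel_filename}")
obj.put(body: Base64.decode64(label_base64), acl: 'public-read')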

how to assign paperclip to file on aws using aws sdk

I have been able to have third party clients upload files directly to AWS s3 and then process those files with paperclip with the following line in the model:
my_object.file_attachment = URI.parse(URI.escape("my_bucket.s3.amazonaws.com/whatever.ext"))
That line downloads the file, processes it and then saves it appropriately. The problem is, in order for that line to work, I have to provide anonymous read privileges for the upload location. So my question is: how do I avoid that? My thought is to use the aws-sdk to download the file - so I have been trying stuff like:
file = Tempfile.new('temp', :encoding => 'ascii-8bit')
bucket.objects[aws_key].read do |chunk|
  file.write chunk
end
my_object.file_attachment = file
and variations on that theme, but nothing is working so far. Any insights would be most helpful.
Solution I am not very happy with
You can generate a temporary privileged URL using the AWS SDK:
s3 = AWS::S3.new
bucket = s3.buckets['bucket_name']
my_object.file_attachment = bucket.objects['relative/path/of/uploaded/file.ext'].url_for(:read)
As @laertiades says in his amended question, one solution is to create a temporary, pre-signed URL using the AWS SDK.
AWS SDK version 1
In AWS SDK version 1, that looks like this:
s3 = AWS::S3.new
bucket = s3.buckets['bucket_name']
my_object.file_attachment = bucket.objects['relative/path/of/uploaded/file.ext'].url_for(:read)
AWS documentation: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/S3Object.html#url_for-instance_method
AWS SDK version 2
In AWS SDK version 2, it looks like this with the optional expires_in parameter (credit to this answer on another question):
presigner = Aws::S3::Presigner.new
my_object.file_attachment = presigner.presigned_url(
  :get_object, # the get_object method means read-only
  bucket: 'bucket-name',
  key: "relative/path/of/uploaded/file.ext",
  expires_in: 10.minutes.to_i # time should be in seconds
).to_s
AWS documentation: http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Presigner.html

S3 bucket: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint"

Hi, I'm using the code below to get the size of a bucket. I researched all over, but the only way seems to be to loop through each file. While looping through, some buckets appear to have been created in a different region and I end up with the following error:
AWS::S3::PermanentRedirect: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. from /home//.rvm/gems/ruby-1.9.2-p180/gems/aws-s3-0.6.2/lib/aws/s3/error.rb:38:in `raise'
The endpoint is us-west-1.
I need help fixing the above issue, and also: how do I dynamically switch my code to the region where my bucket belongs? I'd also appreciate suggestions on adding exception handling in case of failure. Below is my code.
Please feel free to comment.
def get_bucket
  s3 = AWS::S3::Base.establish_connection!(:access_key_id => @config[:ACCESS_KEY_ID], :secret_access_key => @config[:SECRET_ACCESS_KEY])
  if !s3.nil?
    AWS::S3::Service.buckets.each do |bucket|
      puts bucket.inspect
      if !bucket.nil?
        size = 0
        # I'm hard-coding the bucket names below so the code doesn't fail
        if ![
          'cf-templates-m01ixtvp0jr0-us-west-1',
          'cf-templates-m01ixtvp0jr0-us-west-2',
          'elasticbeanstalk-us-west-1-767904627276',
          'elasticbeanstalk-us-west-1-akiai7bucgnrthi66w6a',
          'medidata-rave-cdn'
        ].include? bucket.name
          bucket_size = AWS::S3::Bucket.find(bucket.name)
          if !bucket_size.nil?
            bucket_size.each do |obj|
              if !obj.nil?
                size += obj.size.to_i
              end
            end
          end
        end
        load_bucket(bucket.name, bucket.creation_date, size, @config[:ACCOUNT_NAME])
      end
    end
  end
end
The problem is that buckets can exist in different regions, and while you can list all buckets from the same connection (unlike other AWS entities that are locked to the location they were created in), other operations on buckets require you to log in to the specific "endpoint" (region) to which they are constrained.
My solution is to check where the bucket is located and then re-login to that region:
s3 = AWS::S3.new(@awscreds)
if s3.buckets[bucket].location_constraint != @awscreds[:region]
  # need to re-login, otherwise the S3 upload will fail
  s3 = AWS::S3.new(@awscreds.merge(region: s3.buckets[bucket].location_constraint))
end
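The answer above uses the v1 SDK; with the v2 SDK, a rough equivalent (my assumption, not part of the original answer) is to ask S3 for the bucket's location and build a client for that region; bucket_name is a placeholder:
client = Aws::S3::Client.new(region: 'us-east-1')
location = client.get_bucket_location(bucket: bucket_name).location_constraint
bucket_region = location.to_s.empty? ? 'us-east-1' : location # S3 returns an empty constraint for US Standard
regional_client = Aws::S3::Client.new(region: bucket_region)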
I don't understand how you're building the URL to access your bucket.
If it's in US-Standard, you can say http://s3.amazonaws.com/BUCKETNAME/path/to/file. If it's anywhere else, that doesn't work (non-coincidentally, you're limited to domain-allowed characters (lowercase and numbers only) for bucket names) and you use http://BUCKETNAME.s3.amazonaws.com/path/to/file.
This article may be of help: http://docs.aws.amazon.com/general/latest/gr/rande.html
