How to get objects' keys in Amazon S3 in Rails?

I use S3 to store photos and send their paths through Rails to mobile clients. When I started sending the data for a specific item in the show method, I used this code with the aws-sdk gem to find the photos and get their paths:
s3 = Aws::S3::Resource.new(
  access_key_id: 'askdlkasdmkmakml',
  secret_access_key: 'aklsdmkmasldkmasmdlmasdl',
  region: 'ap-southeast-1'
)
images = []
s3.bucket('my-pics').objects.find_all do |object|
  images << object.key if object.key.include?(self.barcode_number)
end
I tested this code in plain Ruby (not Rails) and got this result:
photos/9000101044393/1.jpg
photos/9000101044393/2.jpg
but when I run the same code inside Rails I get all of this instead:
[#<Aws::S3::ObjectSummary:0x007fe348713170 #bucket_name="my-pics",
#key="photos/9000101044393/1.jpg", #data=#<struct
Aws::S3::Types::Object key="photos/9000101044393/1.jpg",
last_modified=2018-03-01 05:14:57 UTC,
etag="\"ee4540acc2a5bfc948507e0927e9dd1b\"", size=140119,
storage_class="STANDARD", owner=#<struct Aws::S3::Types::Owner
display_name="mohammed.eliass",
id="06bf2e3c37f83d96de16b13fc00efc20a097988edd1d4">>, #client=#
<Aws::S3::Client>>, #<Aws::S3::ObjectSummary:0x007fe348713008
#bucket_name="my-pics", #key="photos/9000101044393/2.jpg", #data=#
<struct Aws::S3::Types::Object key="photos/9000101044393/2.jpg",
last_modified=2018-03-01 05:14:57 UTC,
etag="\"b13dc7b1a516fee5ed15bccc57e\"", size=33132,
storage_class="STANDARD", owner=#<struct Aws::S3::Types::Owner
display_name="mohammed.eliass", id="06bf2e3c37f83d96de16b13fc0adasdaddskfl88edd1d449d85b7157d95bdf334">>,
#client=#<Aws::S3::Client>>, #<Aws::S3::ObjectSummary:0x007fe348712ef0
#bucket_name="my-pics", #key="photos/9000101044430/1.jpg", #data=#
<struct Aws::S3::Types::Object key="photos/9000101044430/1.jpg",
last_modified=2018-03-01 05:14:59 UTC,
etag="\"308ab75c98257b821469b4b93e9ca8f8\"", size=141066,
storage_class="STANDARD", owner=#<struct Aws::S3::Types::Owner
display_name="mohammed.eliass"]
So, any ideas on how I can get rid of all this and get only the keys?

From the docs:
bucket.objects.each do |obj|
  puts obj.key
end
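Or, since each entry is an ObjectSummary, you can collect just the keys into an array in one go, using the resource and bucket name from the question:
keys = s3.bucket('my-pics').objects.map(&:key)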
Untested, but if you just want the matching keys in your output:
def matching_s3_keys
  s3 = Aws::S3::Resource.new(
    access_key_id: 'askdlkasdmkmakml',
    secret_access_key: 'aklsdmkmasldkmasmdlmasdl',
    region: 'ap-southeast-1'
  )
  images = s3.bucket('my-pics').objects.select do |object|
    # This looks a bit weird, as I'm unsure whether you really
    # want to match an S3 key against another value in your app/db.
    # Just showing one way to filter your results here.
    object.key == self.barcode_number
  end
  images.map(&:key)
end
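If the keys always share a known prefix, as in photos/<barcode>/ above, it is usually cheaper to let S3 filter server-side instead of listing the whole bucket. A minimal sketch, assuming barcode_number maps directly onto the key prefix and credentials come from the default provider chain:
def matching_s3_keys
  s3 = Aws::S3::Resource.new(region: 'ap-southeast-1')
  # objects(prefix: ...) only lists keys under that prefix,
  # so no client-side filtering of the whole bucket is needed
  s3.bucket('my-pics').objects(prefix: "photos/#{barcode_number}/").map(&:key)
end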

Try this to retrieve the object keys (this is the older aws-sdk v1 API):
s3 = AWS::S3.new
s3.buckets['bucket-name'].objects.each do |o|
  puts o.key
end
https://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/S3Object.html
In your case:
images = []
s3.buckets['my-pics'].objects.each do |object|
  images << object.key if object.key.include?(self.barcode_number)
end
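With the v1 SDK you can also do the filtering server-side via with_prefix, assuming the photos/<barcode>/ layout shown in the question:
images = s3.buckets['my-pics'].objects
           .with_prefix("photos/#{self.barcode_number}/")
           .map(&:key)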

Related

Why is AWS uploading literal file paths, instead of uploading images?

TL;DR
How do you input file paths into the AWS S3 API Ruby client, and have them interpreted as images, not string literal file paths?
More Details
I'm using the Ruby AWS S3 client to upload images programmatically. I took this code from their example startup code and have barely modified it; see https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-upload-bucket-item.html
def object_uploaded?(s3_client, bucket_name, object_key)
  response = s3_client.put_object(
    body: "tmp/cosn_img.jpeg", # is always interpreted literally
    acl: "public-read",
    bucket: bucket_name,
    key: object_key
  )
  if response.etag
    return true
  else
    return false
  end
rescue StandardError => e
  puts "Error uploading object: #{e.message}"
  return false
end
# Full example call:
def run_me
  bucket_name = 'cosn-images'
  object_key = "#{order_number}-trello-pic_#{list_config[:ac_campaign_id]}.jpeg"
  region = 'us-west-2'
  s3_client = Aws::S3::Client.new(region: region)
  if object_uploaded?(s3_client, bucket_name, object_key)
    puts "Object '#{object_key}' uploaded to bucket '#{bucket_name}'."
  else
    puts "Object '#{object_key}' not uploaded to bucket '#{bucket_name}'."
  end
end
This runs and uploads to AWS, but what ends up in the bucket is just the file path string from body, not the actual file.
(Screenshot: the file path is shown when you click on the attachment link.)
As far as I can see from the Client documentation, this should work. https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Client.html#put_object-instance_method
Also, manually uploading this file through the frontend does work just fine, so it has to be an issue in my code.
How are you supposed to let AWS know that it should interpret that file path as a file path, and not just as a string literal?
You have two issues:
1. You have commas at the end of your variable assignments in object_uploaded? that are impacting how your variables are stored. Remove them.
2. You need to reference the file as a File object, not as a file path, like this:
image = File.open("#{Rails.root}/tmp/cosn_img.jpeg")
See full code below:
def object_uploaded?(image, s3_client, bucket_name, object_key)
  response = s3_client.put_object(
    body: image,
    acl: "public-read",
    bucket: bucket_name,
    key: object_key
  )
  puts response
  if response.etag
    return true
  else
    return false
  end
rescue StandardError => e
  puts "Error uploading object: #{e.message}"
  return false
end
def run_me
  image = File.open("#{Rails.root}/tmp/cosn_img.jpeg")
  bucket_name = 'cosn-images'
  object_key = "#{order_number}-trello-pic_#{list_config[:ac_campaign_id]}.jpeg"
  region = 'us-west-2'
  s3_client = Aws::S3::Client.new(region: region)
  if object_uploaded?(image, s3_client, bucket_name, object_key)
    puts "Object '#{object_key}' uploaded to bucket '#{bucket_name}'."
  else
    puts "Object '#{object_key}' not uploaded to bucket '#{bucket_name}'."
  end
end
Their docs are a bit weird and not straightforward, but it seems you need to pass in a file/IO object instead of the path.
The ruby docs here have an example like this:
s3_client.put_object(
  :bucket_name => 'mybucket',
  :key => 'some/key',
  :content_length => File.size('myfile.txt')
) do |buffer|
  File.open('myfile.txt') do |io|
    # `length` here is a read chunk size placeholder from the docs example
    buffer.write(io.read(length)) until io.eof?
  end
end
Or another option, from the AWS Ruby SDK docs under "Streaming a file from disk":
File.open('/source/file/path', 'rb') do |file|
  s3.put_object(bucket: 'bucket-name', key: 'object-key', body: file)
end
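As a side note, if the files can get large, the resource-level API also has Object#upload_file, which streams from disk and switches to multipart uploads automatically. A rough sketch reusing the bucket and region from above (the object key is just a placeholder):
require 'aws-sdk-s3'

s3 = Aws::S3::Resource.new(region: 'us-west-2')
obj = s3.bucket('cosn-images').object('some-object-key.jpeg')
# upload_file reads the file itself, so there is no way to accidentally
# upload the path string instead of the contents
obj.upload_file('/tmp/cosn_img.jpeg', acl: 'public-read')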

Convert and store to S3 with REST API / InkFilepicker

I have a Rails app on Heroku. From the server side (using the REST API of InkFilepicker), I would like to convert a file, save it to my S3 bucket and store the S3 URL on my model.
Concretely: Given an image (https://www.filepicker.io/api/file/hFHUCB3iTxyMzseuWOgG) I want to convert it (https://www.filepicker.io/api/file/hFHUCB3iTxyMzseuWOgG/convert?w=200&h=150&fit=clip) and store the converted image to my S3 bucket.
EDIT
Here is what I did in the end:
after_save :save_thumbnail_url_to_s3

def save_thumbnail_url_to_s3
  convert_options = {
    fit: 'clip',
    h: 500,
    w: 500
  }
  # `open` here comes from open-uri (see the answer below)
  file = open("#{self.url}/convert?#{convert_options.to_query}")
  # Writing file into S3 bucket
  amazon = AWS::S3.new(access_key_id: ENV['AWS_ACCESS_KEY_ID'], secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'])
  bucket = amazon.buckets[ENV['AWS_BUCKET']]
  object = bucket.objects[s3_media_path]
  written_file = object.write(file, acl: :public_read) # :authenticated_read
  self.update_column :thumbnail_url, written_file.public_url.to_s
end
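For reference, the same write against the newer aws-sdk-s3 (v3) API would look roughly like this; an untested sketch where the ENV names, url and s3_media_path come from the code above, and AWS_REGION is an assumed extra variable:
require 'open-uri'
require 'aws-sdk-s3'

def save_thumbnail_url_to_s3
  convert_options = { fit: 'clip', h: 500, w: 500 }
  s3 = Aws::S3::Resource.new(region: ENV['AWS_REGION'])
  object = s3.bucket(ENV['AWS_BUCKET']).object(s3_media_path)
  # Stream the converted image straight from InkFilepicker into S3
  URI.open("#{url}/convert?#{convert_options.to_query}") do |io|
    object.put(body: io, acl: 'public-read')
  end
  update_column :thumbnail_url, object.public_url
end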
If you are using the filepicker.io API, you can convert your file with the API and then use open-uri, as below, to create a file stream that can be sent to S3. The Tempfile it returns behaves like the File API in Ruby:
[3] pry(main)> require 'open-uri'
=> true
[4] pry(main)> file = open("https://www.filepicker.io/api/file/hFHUCB3iTxyMzseuWOgG/convert?...")
=> #
[5] pry(main)> file.class
=> Tempfile
You can simply use the aws-s3 gem: https://github.com/marcel/aws-s3
But be careful: Heroku is read-only oriented, so you will only be able to work with temp files.

Need to change the storage "directory" of files in an S3 Bucket (Carrierwave / Fog)

I am using Carrierwave with 3 separate models to upload photos to S3. I kept the default settings for the uploader, which was to store photos at the root of the S3 bucket. I then decided to store them in sub-directories named after the model, like avatars/, items/, etc., based on the model they were uploaded from...
Then, I noticed that files of the same name were being overwritten and when I deleted a model record, the photo wasn't being deleted.
I've since changed the store_dir from an uploader-specific setup like this:
def store_dir
  "items"
end
to a generic one which stores photos under the model ID (I use Mongo, FYI):
def store_dir
  "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
end
Here comes the problem. I am trying to move all the photos already in S3 into the proper "directory" within S3. From what I've read, S3 doesn't have directories per se. I'm having trouble with the rake task. Since I changed the store_dir, Carrierwave is looking for all the previously uploaded photos in the wrong directory.
namespace :pics do
  desc "Fix directory location of pictures on s3"
  task :item_update => :environment do
    connection = Fog::Storage.new({
      :provider => 'AWS',
      :aws_access_key_id => 'XXXX',
      :aws_secret_access_key => 'XXX'
    })
    directory = connection.directories.get("myapp-uploads-dev")
    Recipe.all.each do |l|
      if l.images.count > 0
        l.items.each do |i|
          if i.picture.path.to_s != ""
            new_full_path = i.picture.path.to_s
            filename = new_full_path.split('/')[-1].split('?')[0]
            thumb_filename = "thumb_#{filename}"
            original_file_path = "items/#{filename}"
            puts "attempting to retrieve: #{original_file_path}"
            original_thumb_file_path = "items/#{thumb_filename}"
            photo = directory.files.get(original_file_path) rescue nil
            if photo
              puts "we found: #{original_file_path}"
              photo.expires = 2.years.from_now.httpdate
              photo.key = new_full_path
              photo.save
              thumb_photo = directory.files.get(original_thumb_file_path) rescue nil
              if thumb_photo
                puts "we found: #{original_thumb_file_path}"
                thumb_photo.expires = 2.years.from_now.httpdate
                thumb_photo.key = "/uploads/item/picture/#{i.id}/#{thumb_filename}"
                thumb_photo.save
              end
            end
          end
        end
      end
    end
  end
end
So I'm looping through all the Recipes, looking for items with photos, determining the old Carrierwave path, and trying to update it with the new one based on the store_dir change. I thought that if I simply updated photo.key with the new path it would work, but it doesn't.
What am I doing wrong? Is there a better way to accomplish this?
Here's what I did to get this working...
namespace :pics do
  desc "Fix directory location of pictures"
  task :item_update => :environment do
    connection = Fog::Storage.new({
      :provider => 'AWS',
      :aws_access_key_id => 'XXX',
      :aws_secret_access_key => 'XXX'
    })
    bucket = "myapp-uploads-dev"
    puts "Using bucket: #{bucket}"
    Recipe.all.each do |l|
      if l.images.count > 0
        l.items.each do |i|
          if i.picture.path.to_s != ""
            new_full_path = i.picture.path.to_s
            filename = new_full_path.split('/')[-1].split('?')[0]
            thumb_filename = "thumb_#{filename}"
            original_file_path = "items/#{filename}"
            original_thumb_file_path = "items/#{thumb_filename}"
            puts "attempting to retrieve: #{original_file_path}"
            # copy original item
            begin
              connection.copy_object(bucket, original_file_path, bucket, new_full_path, 'x-amz-acl' => 'public-read')
              puts "we just copied: #{original_file_path}"
            rescue
              puts "couldn't find: #{original_file_path}"
            end
            # copy thumb
            begin
              connection.copy_object(bucket, original_thumb_file_path, bucket, "uploads/item/picture/#{i.id}/#{thumb_filename}", 'x-amz-acl' => 'public-read')
              puts "we just copied: #{original_thumb_file_path}"
            rescue
              puts "couldn't find thumb: #{original_thumb_file_path}"
            end
          end
        end
      end
    end
  end
end
Perhaps not the prettiest thing in the world, but it worked.
You need to be interacting with the S3 Objects directly to move them. You'll probably want to look at copy_object and delete_object in the Fog gem, which is what CarrierWave uses to interact with S3.
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/copy_object.rb
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/delete_object.rb
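Since S3 has no rename, a "move" with Fog is just a copy_object followed by a delete_object. A minimal sketch with illustrative bucket and keys:
connection = Fog::Storage.new(
  :provider => 'AWS',
  :aws_access_key_id => ENV['AWS_ACCESS_KEY_ID'],
  :aws_secret_access_key => ENV['AWS_SECRET_ACCESS_KEY']
)

bucket  = 'myapp-uploads-dev'
old_key = 'items/photo.jpg'
new_key = 'uploads/item/picture/123/photo.jpg'

# Copy to the new key (keeping it publicly readable), then remove the old one.
connection.copy_object(bucket, old_key, bucket, new_key, 'x-amz-acl' => 'public-read')
connection.delete_object(bucket, old_key)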

Fog - Get size of object in S3

How can I get the size of an S3 object in Fog without downloading it?
For example:
connection.get_object(dir.key, latest_backup.key).body.size
requires I download the object first.
How can I find out the size before downloading?
To find the size of the file, do this:
connection.head_object(bucket_name, object_name, options = {})
And look for Content-Length in the response that comes back. This will also give you other useful information, like the checksum in the ETag header.
Reference: http://rubydoc.info/gems/fog/Fog/Storage/AWS/Real#head_object-instance_method
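In practice that looks something like this (bucket and key are placeholders); head_object only issues a HEAD request, so nothing is downloaded:
response = connection.head_object('my-bucket', 'backups/latest.tar.gz')
size_in_bytes = response.headers['Content-Length'].to_i
checksum      = response.headers['ETag']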
remote_file.content_length does the trick for me (Fog 1.21.0), where remote_file is fetched from the remote directory, like so:
def connection
  Fog::Storage.new(fog_config)
end

def directory
  connection.directories.get(bucket)
end

def remote_file
  directory.files.last
end
module S3
  def self.directory
    connection.directories.new(key: ENV['AWS_BUCKET'])
  end

  def self.connection
    Fog::Storage.new(
      provider: 'AWS',
      aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
      region: ENV['AWS_REGION']
    )
  end

  def self.file(key)
    directory.files.head(key)
  end
end

file = S3.file(key)
file.content_length

Migrating paperclip S3 images to new url/path format

Is there a recommended technique for migrating a large set of paperclip S3 images to a new :url and :path format?
The reason for this is because after upgrading to rails 3.1, new versions of thumbs are not being shown after cropping (previously cached version is shown). This is because the filename no longer changes (since asset_timestamp was removed in rails 3.1). I'm using :fingerprint in the url/path format, but this is generated from the original, which doesn't change when cropping.
I was intending to insert :updated_at in the url/path format, and update attachment.updated_at during cropping, but after implementing that change all existing images would need to be moved to their new location. That's around half a million images to rename over S3.
At this point I'm considering copying them to their new location first, then deploying the code change, then moving any images which were missed (ie uploaded after the copy), but I'm hoping there's an easier way... any suggestions?
I had to change my Paperclip path in order to support image cropping, and I ended up creating a rake task to help out.
namespace :paperclip_migration do
  desc 'Migrate data'
  task :migrate_s3 => :environment do
    # Make sure that all of the models have been loaded so any attachments are registered
    puts 'Loading models...'
    Dir[Rails.root.join('app', 'models', '**/*')].each { |file| File.basename(file, '.rb').camelize.constantize }

    # Iterate through all of the registered attachments
    puts 'Migrating attachments...'
    attachment_registry.each_definition do |klass, name, options|
      puts "Migrating #{klass}: #{name}"
      klass.find_each(batch_size: 100) do |instance|
        attachment = instance.send(name)
        unless attachment.blank?
          attachment.styles.each do |style_name, style|
            old_path = interpolator.interpolate(old_path_option, attachment, style_name)
            new_path = interpolator.interpolate(new_path_option, attachment, style_name)
            # puts "#{style_name}:\n\told: #{old_path}\n\tnew: #{new_path}"
            s3_copy(s3_bucket, old_path, new_path)
          end
        end
      end
    end
    puts 'Completed migration.'
  end

  #############################################################################
  private

  # Paperclip Configuration
  def attachment_registry
    Paperclip::AttachmentRegistry
  end

  def s3_bucket
    ENV['S3_BUCKET']
  end

  def old_path_option
    ':class/:id_partition/:attachment/:hash.:extension'
  end

  def new_path_option
    ':class/:attachment/:id_partition/:style/:filename'
  end

  def interpolator
    Paperclip::Interpolations
  end

  # S3
  def s3
    AWS::S3.new(access_key_id: ENV['S3_KEY'], secret_access_key: ENV['S3_SECRET'])
  end

  def s3_copy(bucket, source, destination)
    source_object = s3.buckets[bucket].objects[source]
    destination_object = source_object.copy_to(destination, {metadata: source_object.metadata.to_h})
    destination_object.acl = source_object.acl
    puts "Copied #{source}"
  rescue Exception => e
    puts "*Unable to copy #{source} - #{e.message}"
  end
end
I didn't find a feasible way to migrate to a new URL format. I ended up overriding Paperclip::Attachment#generate_fingerprint so that it appends :updated_at.
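A minimal sketch of that override, assuming a Paperclip version where Attachment#generate_fingerprint(source) exists, dropped into an initializer:
# config/initializers/paperclip_fingerprint.rb
# Untested sketch: append the attachment's updated_at to the stock fingerprint
# so that cropping (which touches updated_at) produces a new :fingerprint and URL.
module Paperclip
  class Attachment
    alias_method :original_generate_fingerprint, :generate_fingerprint

    def generate_fingerprint(source)
      "#{original_generate_fingerprint(source)}#{updated_at}"
    end
  end
end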
