Fog - Get size of object in S3 - ruby-on-rails

How can I get the size of an s3 object in fog without downloading it?
For example:
connection.get_object(dir.key, latest_backup.key).body.size
requires I download the object first.
How can I find out the size before downloading?

To find the size of the file do this:
connection.head_object(bucket_name, object_name, options = {})
And look for this in the object that comes back:
Content-Length
This will also give you other good information like the checksum in ETag and other things like that.
Reference: http://rubydoc.info/gems/fog/Fog/Storage/AWS/Real#head_object-instance_method

remote_file.content_length does the trick for me (Fog 1.21.0)
where remote_file is gotten from the remote directory, like so:
def connection
Fog::Storage.new(fog_config)
end
def directory
connection.directories.get(bucket)
end
def remote_file
remote_files = directory.files.last
end

module S3
def self.directory
connection.directories.new(key: ENV['AWS_BUCKET'])
end
def self.connection
Fog::Storage.new(
provider: 'AWS',
aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
region: ENV['AWS_REGION']
)
end
def self.file(key)
directory.files.head(key)
end
end
file = S3.file(key)
file.content_length

Related

Why is AWS uploading literal file paths, instead of uploading images?

TL;DR
How do you input file paths into the AWS S3 API Ruby client, and have them interpreted as images, not string literal file paths?
More Details
I'm using the Ruby AWS S3 client to upload images programmatically. I have taken this code from their example startup code and barely modified it myself. See https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/s3-example-upload-bucket-item.html
def object_uploaded?(s3_client, bucket_name, object_key)
response = s3_client.put_object(
body: "tmp/cosn_img.jpeg", # is always interpreted literally
acl: "public-read",
bucket: bucket_name,
key: object_key
)
if response.etag
return true
else
return false
end
rescue StandardError => e
puts "Error uploading object: #{e.message}"
return false
end
# Full example call:
def run_me
bucket_name = 'cosn-images'
object_key = "#{order_number}-trello-pic_#{list_config[:ac_campaign_id]}.jpeg"
region = 'us-west-2'
s3_client = Aws::S3::Client.new(region: region)
if object_uploaded?(s3_client, bucket_name, object_key)
puts "Object '#{object_key}' uploaded to bucket '#{bucket_name}'."
else
puts "Object '#{object_key}' not uploaded to bucket '#{bucket_name}'."
end
end
This works and is able to upload to AWS, but it is uploading just the file path from the body, not the actual file itself.
file path shown when you click on attachment link
As far as I can see from the Client documentation, this should work. https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Client.html#put_object-instance_method
Client docs
Also, manually uploading this file through the frontend does work just fine, so it has to be an issue in my code.
How are you supposed to let AWS know that it should interpret that file path as a file path, and not just as a string literal?
You have two issues:
You have commas at the end of your variable assignments in object_uploaded? that are impacting the way that your variables are being stored. Remove these.
You need to reference the file as a File object type, not as a file path. Like this:
image = File.open("#{Rails.root}/tmp/cosn_img.jpeg")
See full code below:
def object_uploaded?(image, s3_client, bucket_name, object_key)
response = s3_client.put_object(
body: image,
acl: "public-read",
bucket: bucket_name,
key: object_key
)
puts response
if response.etag
return true
else
return false
end
rescue StandardError => e
puts "Error uploading object: #{e.message}"
return false
end
def run_me
image = File.open("#{Rails.root}/tmp/cosn_img.jpeg")
bucket_name = 'cosn-images'
object_key = "#{order_number}-trello-pic_#{list_config[:ac_campaign_id]}.jpeg"
region = 'us-west-2'
s3_client = Aws::S3::Client.new(region: region)
if object_uploaded?(image, s3_client, bucket_name, object_key)
puts "Object '#{object_key}' uploaded to bucket '#{bucket_name}'."
else
puts "Object '#{object_key}' not uploaded to bucket '#{bucket_name}'."
end
end
Their docs seem a bit weird and not straigtforward, but it seems that you might need to pass in a file/io object, instead of the path.
The ruby docs here have an example like this:
s3_client.put_object(
:bucket_name => 'mybucket',
:key => 'some/key'
:content_length => File.size('myfile.txt')
) do |buffer|
File.open('myfile.txt') do |io|
buffer.write(io.read(length)) until io.eof?
end
end
or another option in the aws ruby sdk docs, under "Streaming a file from disk":
File.open('/source/file/path', 'rb') do |file|
s3.put_object(bucket: 'bucket-name', key: 'object-key', body: file)
end

Carrierwave AWS Heroku : current_path and File.read can't find the file

I have (what I thought was) a perfectly working Heroku-Carrierwave-AWS.
I can upload images like a charm.
I now need to send the respective images, via a JSON request, to an app. This has been working on my test server, but for some reason I'm getting the following from my Heroku Logs from my rails call:
Started POST "/downloadUserPhotos" for ?? at 2014-05-03 03:27:38 +0000
Errno::ENOENT (No such file or directory # rb_sysopen - uploads/photo/mainphoto/1/largeimage.jpg):
app/controllers/stats_controller.rb:22:in `read'
app/controllers/stats_controller.rb:22:in `downloadPhotos'
I'm pretty sure this has something to do with the following Ruby/Rails code:
def downloadPhotos
#photos = Photo.find_by_user_id(current_user.id)
#mainphoto = Base64.strict_encode64(File.read(#photos.mainphoto.current_path))
end
When I use my console on Heroku and type the following:
#photos = Photo.find(1)
It works and I get the correct record shown. When I ask for current_path for mainphoto, I get:
irb(main):002:0> #photos.mainphoto.current_path
=> "uploads/photo/mainphoto/1/largeimage.jpg"
So, it knows it exists. And it's in the right place.
Can anyone enlighten me (or point me in the right direction) as to why I can't use File.read. And, more importantly, how I get it to now read the image file and encode it??
This has perplexed me somewhat.
I've tried to use #photos.mainphoto.url, but other than giving me the whole URL, it still doesn't find the file using File.read.
My carrier wave config is:
CarrierWave.configure do |config|
config.fog_credentials = {
# Configuration for Amazon S3
:provider => 'AWS',
:aws_access_key_id => ENV['AWS_ACCESS_KEY_ID'],
:aws_secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'],
:region => ENV['S3_REGION']
}
config.cache_dir = "#{Rails.root}/tmp/uploads"
config.fog_directory = ENV['S3_BUCKET_NAME']
end
And I have the following in my Uploader:
def store_dir
"uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.user.id}"
end
Thanks in advance.
In the line:
#mainphoto = Base64.strict_encode64(File.read(#photos.mainphoto.current_path))
Change current_path for url

Need to change the storage "directory" of files in an S3 Bucket (Carrierwave / Fog)

I am using Carrierwave with 3 separate models to upload photos to S3. I kept the default settings for the uploader, which was to store photos in a root S3 bucket. I then decided to store them in sub-directories according to model name like /avatars, items/, etc. based on the model they were uploaded from...
Then, I noticed that files of the same name were being overwritten and when I deleted a model record, the photo wasn't being deleted.
I've since changed the store_dir from an uploader-specific setup like this:
def store_dir
"items"
end
to a generic one which stores photo under the model ID (I use mongo FYI):
def store_dir
"uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
end
Here comes the problem. I am trying to move all the photos already into S3 into the proper "directory" within S3. From what I've ready, S3 doesn't have directories per se. I'm having trouble with the rake task. Since i changed the store_dir, Carrierwave is looking for all the photos previously uploaded in the wrong directory.
namespace :pics do
desc "Fix directory location of pictures on s3"
task :item_update => :environment do
connection = Fog::Storage.new({
:provider => 'AWS',
:aws_access_key_id => 'XXXX',
:aws_secret_access_key => 'XXX'
})
directory = connection.directories.get("myapp-uploads-dev")
Recipe.all.each do |l|
if l.images.count > 0
l.items.each do |i|
if i.picture.path.to_s != ""
new_full_path = i.picture.path.to_s
filename = new_full_path.split('/')[-1].split('?')[0]
thumb_filename = "thumb_#{filename}"
original_file_path = "items/#{filename}"
puts "attempting to retrieve: #{original_file_path}"
original_thumb_file_path = "items/#{thumb_filename}"
photo = directory.files.get(original_file_path) rescue nil
if photo
puts "we found: #{original_file_path}"
photo.expires = 2.years.from_now.httpdate
photo.key = new_full_path
photo.save
thumb_photo = directory.files.get(original_thumb_file_path) rescue nil
if thumb_photo
puts "we found: #{original_thumb_file_path}"
thumb_photo.expires = 2.years.from_now.httpdate
thumb_photo.key = "/uploads/item/picture/#{i.id}/#{thumb_filename}"
thumb_photo.save
end
end
end
end
end
end
end
end
So I'm looping through all the Recipes, looking for items with photos, determining the old Carrierwave path, trying to update it with the new one based on the store_dir change. I thought if I simply updated the photo.key with the new path, it would work, but it's not.
What am I doing wrong? Is there a better way to accomplish the ask here?
Here's what I did to get this working...
namespace :pics do
desc "Fix directory location of pictures"
task :item_update => :environment do
connection = Fog::Storage.new({
:provider => 'AWS',
:aws_access_key_id => 'XXX',
:aws_secret_access_key => 'XXX'
})
bucket = "myapp-uploads-dev"
puts "Using bucket: #{bucket}"
Recipe.all.each do |l|
if l.images.count > 0
l.items.each do |i|
if i.picture.path.to_s != ""
new_full_path = i.picture.path.to_s
filename = new_full_path.split('/')[-1].split('?')[0]
thumb_filename = "thumb_#{filename}"
original_file_path = "items/#{filename}"
original_thumb_file_path = "items/#{thumb_filename}"
puts "attempting to retrieve: #{original_file_path}"
# copy original item
begin
connection.copy_object(bucket, original_file_path, bucket, new_full_path, 'x-amz-acl' => 'public-read')
puts "we just copied: #{original_file_path}"
rescue
puts "couldn't find: #{original_file_path}"
end
# copy thumb
begin
connection.copy_object(bucket, original_thumb_file_path, bucket, "uploads/item/picture/#{i.id}/#{thumb_filename}", 'x-amz-acl' => 'public-read')
puts "we just copied: #{original_thumb_file_path}"
rescue
puts "couldn't find thumb: #{original_thumb_file_path}"
end
end
end
end
end
end
end
Perhaps not the prettiest thing in the world, but it worked.
You need to be interacting with the S3 Objects directly to move them. You'll probably want to look at copy_object and delete_object in the Fog gem, which is what CarrierWave uses to interact with S3.
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/copy_object.rb
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/delete_object.rb

Migrating paperclip S3 images to new url/path format

Is there a recommended technique for migrating a large set of paperclip S3 images to a new :url and :path format?
The reason for this is because after upgrading to rails 3.1, new versions of thumbs are not being shown after cropping (previously cached version is shown). This is because the filename no longer changes (since asset_timestamp was removed in rails 3.1). I'm using :fingerprint in the url/path format, but this is generated from the original, which doesn't change when cropping.
I was intending to insert :updated_at in the url/path format, and update attachment.updated_at during cropping, but after implementing that change all existing images would need to be moved to their new location. That's around half a million images to rename over S3.
At this point I'm considering copying them to their new location first, then deploying the code change, then moving any images which were missed (ie uploaded after the copy), but I'm hoping there's an easier way... any suggestions?
I had to change my paperclip path in order to support image cropping, I ended up creating a rake task to help out.
namespace :paperclip_migration do
desc 'Migrate data'
task :migrate_s3 => :environment do
# Make sure that all of the models have been loaded so any attachments are registered
puts 'Loading models...'
Dir[Rails.root.join('app', 'models', '**/*')].each { |file| File.basename(file, '.rb').camelize.constantize }
# Iterate through all of the registered attachments
puts 'Migrating attachments...'
attachment_registry.each_definition do |klass, name, options|
puts "Migrating #{klass}: #{name}"
klass.find_each(batch_size: 100) do |instance|
attachment = instance.send(name)
unless attachment.blank?
attachment.styles.each do |style_name, style|
old_path = interpolator.interpolate(old_path_option, attachment, style_name)
new_path = interpolator.interpolate(new_path_option, attachment, style_name)
# puts "#{style_name}:\n\told: #{old_path}\n\tnew: #{new_path}"
s3_copy(s3_bucket, old_path, new_path)
end
end
end
end
puts 'Completed migration.'
end
#############################################################################
private
# Paperclip Configuration
def attachment_registry
Paperclip::AttachmentRegistry
end
def s3_bucket
ENV['S3_BUCKET']
end
def old_path_option
':class/:id_partition/:attachment/:hash.:extension'
end
def new_path_option
':class/:attachment/:id_partition/:style/:filename'
end
def interpolator
Paperclip::Interpolations
end
# S3
def s3
AWS::S3.new(access_key_id: ENV['S3_KEY'], secret_access_key: ENV['S3_SECRET'])
end
def s3_copy(bucket, source, destination)
source_object = s3.buckets[bucket].objects[source]
destination_object = source_object.copy_to(destination, {metadata: source_object.metadata.to_h})
destination_object.acl = source_object.acl
puts "Copied #{source}"
rescue Exception => e
puts "*Unable to copy #{source} - #{e.message}"
end
end
Didn't find a feasible method for migrating to a new url format. I ended up overriding Paperclip::Attachment#generate_fingerprint so it appends :updated_at.

Carrierwave & Amazon S3 file downloading/uploading

I have a rails 3 app with an UploadsUploader and a Resource model on which this is mounted. I recently switched to using s3 storage and this has broken my ability to download files using the send_to method. I can enable downloading using the redirect_to method which is just forwarding the user to an authenticated s3 url. I need to authenticate file downloads and I want the url to be http://mydomainname.com/the_file_path or http://mydomainname.com/controller_action_name/id_of_resource so I am assuming I need to use send_to, but is there a way of doing that using the redirect_to method? My current code follows. Resources_controller.rb
def download
resource = Resource.find(params[:id])
if resource.shared_items.find_by_shared_with_id(current_user) or resource.user_id == current_user.id
filename = resource.upload_identifier
send_file "#{Rails.root}/my_bucket_name_here/uploads/#{filename}"
else
flash[:notice] = "You don't have permission to access this file."
redirect_to resources_path
end
end
carrierwave.rb initializer:
CarrierWave.configure do |config|
config.fog_credentials = {
:provider => 'AWS', # required
:aws_access_key_id => 'xxxx', # copied off the aws site
:aws_secret_access_key => 'xxxx', #
}
config.fog_directory = 'my_bucket_name_here' # required
config.fog_host = 'https://localhost:3000' # optional, defaults to nil
config.fog_public = false # optional, defaults to true
config.fog_attributes = {'Cache-Control'=>'max-age=315576000'} # optional, defaults to {}
end
upload_uploader.rb
class UploadUploader < CarrierWave::Uploader::Base
storage :fog
def store_dir
"uploads"
end
end
All of this throws the error:
Cannot read file
/home/tom/Documents/ruby/rails/circlshare/My_bucket_name_here/uploads/Picture0024.jpg
I have tried reading up about carrierwave, fog, send_to and all of that but everything I have tried hasn't been fruitful as yet. Uploading is working fine and I can see the files in s3 bucket. Using re_direct would be great as the file wouldn't pass through my server. Any help appreciated. Thanks.
Looks like you want to upload to S3, but have not-public URLs. Instead of downloading the file from S3 and using send_file, you can redirect the user to the S3 authenticated URL. This URL will expire and only be valid for a little while (for the user to download).
Check out this thread: http://groups.google.com/group/carrierwave/browse_thread/thread/2f727c77864ac923
Since you're already setting fog_public to false, do you get an authenticated (i.e. signed) url when calling resource.upload_url

Resources