How to specify a prefix when uploading to S3 using activestorage's direct upload? - ruby-on-rails

With a standard S3 configuration:
AWS_ACCESS_KEY_ID: [AWS ID]
AWS_BUCKET: [bucket name]
AWS_REGION: [region]
AWS_SECRET_ACCESS_KEY: [secret]
I can upload a file to S3 (using direct upload) with this Rails 5.2 code (only relevant code shown):
form.file_field :my_asset, direct_upload: true
This will effectively put my asset in the root of my S3 bucket, upon submitting the form.
How can I specify a prefix (e.g. "development/", so that I can mimic a folder on S3)?

2022 update: as of Rails 6.1 (check this commit), this is actually supported:
user.avatar.attach(key: "avatars/#{user.id}.jpg", io: io, content_type: "image/jpeg", filename: "avatar.jpg")
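Building on that, a prefix such as "development/" from the question can simply be baked into the key; the attribute name and the Rails.env-based prefix here are just illustrative assumptions:
# Hypothetical example: scope blob keys by environment to mimic folders on S3.
user.avatar.attach(
  key: "#{Rails.env}/avatars/#{user.id}.jpg",
  io: File.open("/tmp/avatar.jpg"),
  content_type: "image/jpeg",
  filename: "avatar.jpg"
)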

My current workaround on S3 (at least until ActiveStorage introduces the option to pass a path to the has_one_attached and has_many_attached macros) is to use the move_to method.
So I'm letting ActiveStorage save the image to S3 as it normally does right now (at the top of the bucket), then moving the file into a folder structure.
The move_to method basically copies the file into the folder structure you pass, then deletes the file that was put at the root of the bucket. That way your file ends up where you want it.
So, for instance, if we were storing driver details (name and drivers_license), save them as you're already doing so that the file sits at the top of the bucket.
Then implement the following (I put mine in a helper):
module DriversHelper
  def restructure_attachment(driver_object, new_structure)
    old_key = driver_object.image.key
    begin
      # Passing S3 configs
      config = YAML.load_file(Rails.root.join('config', 'storage.yml'))
      s3 = Aws::S3::Resource.new(
        region: config['amazon']['region'],
        credentials: Aws::Credentials.new(config['amazon']['access_key_id'], config['amazon']['secret_access_key'])
      )
      # Fetching the license's Aws::S3::Object
      old_obj = s3.bucket(config['amazon']['bucket']).object(old_key)
      # Moving the license into the new folder structure
      old_obj.move_to(bucket: config['amazon']['bucket'], key: new_structure)
      update_blob_key(driver_object, new_structure)
    rescue => ex
      driver_helper_logger.error("Error restructuring license belonging to driver with id #{driver_object.id}: #{ex.full_message}")
    end
  end

  private

  # The new structure becomes the new ActiveStorage Blob key
  def update_blob_key(driver_object, new_key)
    blob = driver_object.image_attachment.blob
    begin
      blob.key = new_key
      blob.save!
    rescue => ex
      driver_helper_logger.error("Error reassigning the new key to the blob object of the driver with id #{driver_object.id}: #{ex.full_message}")
    end
  end

  def driver_helper_logger
    @driver_helper_logger ||= Logger.new("#{Rails.root}/log/driver_helper.log")
  end
end
It's important to update the blob key so that references to the key don't return errors.
If the key is not updated, anything attempting to reference the image will look for it in its former location (at the top of the bucket) rather than in its new location.
I'm calling this function from my controller as soon as the file is saved (that is, in the create action) so that it looks seamless even though it isn't.
While this may not be the best way, it works for now.
FYI: Based on the example you gave, the new_structure variable would be new_structure = "development/#{driver_object.image.key}".
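For reference, a minimal sketch of how the helper might be invoked from a controller's create action (the controller, params, and redirect logic are just placeholders, not part of my actual app):
# app/controllers/drivers_controller.rb (hypothetical)
class DriversController < ApplicationController
  include DriversHelper

  def create
    @driver = Driver.new(driver_params)
    if @driver.save
      # Move the freshly uploaded image under an environment-scoped prefix
      restructure_attachment(@driver, "#{Rails.env}/#{@driver.image.key}")
      redirect_to @driver
    else
      render :new
    end
  end
end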
I hope this helps! :)

Thank you, Sonia, for your answer.
I tried your solution and it works great, but I ran into problems when overwriting attachments: I often got an IntegrityError while doing it. I think this, together with checksum handling, may be the reason the Rails core team doesn't want to add a pathname-passing feature; it would require changing the entire logic of the upload method.
The ActiveStorage::Attached#create_from_blob method can also accept an ActiveStorage::Blob object, so I tried a different approach:
Create a Blob manually, with a key that represents the desired file structure, and upload the attachment to it.
Attach the created Blob with the regular ActiveStorage method.
In my case, the solution looked something like this:
def attach(file) # method for attaching in the model
  blob_key = destination_pathname(file)
  blob = ActiveStorage::Blob.find_by(key: blob_key.to_s)
  unless blob
    blob = ActiveStorage::Blob.new.tap do |blob|
      blob.filename = blob_key.basename.to_s
      blob.key = blob_key
      blob.upload file
      blob.save!
    end
  end
  # Attach method from ActiveStorage
  self.file.attach blob
end
By passing a full pathname as the Blob's key, I got the desired file structure on the server.
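The destination_pathname helper isn't shown above; a minimal sketch of what it could look like, assuming file is an ActionDispatch::Http::UploadedFile and you want an environment-scoped prefix, might be:
require "pathname"

# Hypothetical helper: builds an environment-scoped key such as
# "development/uploads/report.pdf". Returning a Pathname lets the
# caller above use #basename for the blob's filename.
def destination_pathname(file)
  Pathname.new(File.join(Rails.env, "uploads", file.original_filename))
end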

Sorry, that’s not currently possible. I’d suggest creating a bucket for Active Storage to use exclusively.

The above solution will still raise an IntegrityError; you need to use File.open(file). Thanks for the idea, though.
class History < ApplicationRecord
  has_one_attached :gs_history_file

  def attach(file) # method for attaching in the model
    blob_key = destination_pathname(file)
    blob = ActiveStorage::Blob.find_by(key: blob_key.to_s)
    unless blob
      blob = ActiveStorage::Blob.new.tap do |blob|
        blob.filename = blob_key.to_s
        blob.key = blob_key
        # blob.byte_size = 123123
        # blob.checksum = Time.new.strftime("%Y%m%d-") + Faker::Alphanumeric.alpha(6)
        blob.upload File.open(file)
        blob.save!
      end
    end
    # Attach method from ActiveStorage
    self.gs_history_file.attach blob
  end

  def destination_pathname(file)
    "testing/filename-#{Time.now}.xlsx"
  end
end
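Usage might then look roughly like this (the path is purely hypothetical, since file is passed straight to File.open above):
history = History.create!
history.attach("/tmp/report.xlsx") # uploads the file under the "testing/" prefix and attaches it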

Related

Rails - Resave All Models for S3 Migration

rails 6.1.3.2
aws-sdk-s3 gem
I currently have a Rails app in production that uses ActiveStorage to attach image data to a wrapper Image model. It currently uses the local strategy to save images to disk, and I am migrating it to S3. I am not using Paperclip or anything similar.
I succeeded in setting it up. Currently it is set to use local storage primarily, with S3 as a mirror, so that I can write to both places during the migration. However, the documentation says that it will only save new images to S3 upon create and update of a record. I would like to "re-save" all models in production to force the migration to happen. Does anyone know how to do this?
Looks like it was already answered!
If you happen to be stuck with access only to the Rails console, like I was, this solution worked perfectly. If you copy and paste this code into the console, it will start producing output of the S3 uploads. After 5k of those, I was done. An immense thank you to Tayden for the solution.
all_services = [ActiveStorage::Blob.service.primary, *ActiveStorage::Blob.service.mirrors]

# Iterate through each blob
ActiveStorage::Blob.all.each do |blob|
  # Select services where the file exists
  services = all_services.select { |service| service.exist? blob.key }

  # Skip the blob if the file doesn't exist anywhere
  next unless services.present?

  # Select services where the file doesn't exist
  mirrors = all_services - services

  # Open the local file (if one exists)
  disk_service = services.find { |service| service.is_a? ActiveStorage::Service::DiskService }
  local_file = File.open(disk_service.path_for(blob.key)) if disk_service

  # Upload the local file to the mirrors (if one exists)
  mirrors.each do |mirror|
    mirror.upload blob.key, local_file, checksum: blob.checksum
  end if local_file.present?

  # If no local file exists, download a remote file and upload it to the mirrors (thanks #Rystraum)
  services.first.open blob.key, checksum: blob.checksum do |temp_file|
    mirrors.each do |mirror|
      mirror.upload blob.key, temp_file, checksum: blob.checksum
    end
  end unless local_file.present?
end
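For context, this script assumes Active Storage is already pointed at a mirror service, along these lines (the :mirror service name and the :local/:amazon entries are assumptions about the storage.yml setup):
# config/environments/production.rb
# Assumes config/storage.yml defines a :mirror service whose primary is :local
# and whose mirrors include :amazon.
config.active_storage.service = :mirror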

Paperclip or Google Cloud Storage issue when renaming paths

I have a Rails app with Paperclip and I use Google Cloud Storage. So far so good.
To avoid having both development and production using the same storage, I decided to change the default Paperclip path to one based on the environment, so that every environment has its own directory. Then I moved the old images from the default Paperclip path to the new ones.
The problem is that now old images give a 404, whereas any new image I upload works properly. Is there any way to fix that?
Here are the previous settings:
module MyApp
  class Application < Rails::Application
    config.paperclip_defaults = {
      storage: :fog,
      fog_public: true,
      fog_directory: 'myapp-01',
      fog_credentials: {
        google_storage_access_key_id: ENV['GOOGLE_STORAGE_ID'],
        google_storage_secret_access_key: ENV['GOOGLE_STORAGE_SECRET'],
        provider: 'Google'
      }
    }
  end
end
I override the default using the following settings:
path: ":rails_env/:class/:attachment/:id_partition/:style/:filename",
url: "/:rails_root/:class/:attachment/:id_partition/:style/:filename"
My guess is that it's not sufficient to update Paperclip config with the new path and move all images to the new directory. You need also to update the old records...
In case you're wondering, the old records point to root/images/?123456789.
Your guess is right: changing the config is not enough; you also need to move the files. This is better left to a rake task or background job. I have some code for S3, but it should give you an idea of how to implement it for Google:
def old_key(image, file_name_field)
  # Previous `:path`: '/:class/:attachment/:id/:style/:filename'
  klass = self.class.to_s.pluralize.downcase
  attachment = image.pluralize
  "#{klass}/#{attachment}/#{id}/original/#{send(file_name_field)}"
end

def re_path(image)
  file_name_field = "#{image}_file_name"
  return if send(file_name_field).blank?

  old_object = bucket.object(old_key(image, file_name_field))
  return unless old_object.exists?

  Rails.logger.warn "Re-saving image attachment #{self.class}/#{id}"
  send "#{image}=", URI.parse(old_object.public_url)
  save
end
I'm basically building the old path using my own interpolation, finding the object in S3 (hence the key/object lingo), and re-downloading every image from S3. Be careful with this, since you might incur extra cost for downloading rather than just moving, if that's something Google allows.
Then I just called this method on every image for every object:
Object.find_each { |o| o.re_path(:logo); o.re_path(:background) }
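Wrapped in a rake task, as suggested above, it might look roughly like this (the task name and MyModel are hypothetical):
# lib/tasks/paperclip_repath.rake (hypothetical file)
namespace :paperclip do
  desc "Re-save Paperclip attachments under the new environment-scoped path"
  task repath: :environment do
    MyModel.find_each do |record|
      record.re_path(:logo)
      record.re_path(:background)
    end
  end
end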

Rails 4, Fog, Amazon s3 - retrieving all the images as an array from a specific folder in a bucket.

I am using Amazon S3, Rails 4, and the Fog gem. I have an Amazon bucket called uipstudy with 100 folders, each containing about 20 images. I use the following to get all the images in a specific folder (in my application_helper.rb, which is included in application_controller.rb).
def get_files(image_folder)
  connection = Fog::Storage.new(
    provider: 'AWS',
    aws_access_key_id: '######',
    aws_secret_access_key: '#######'
  )
  connection.directories.get('uipimages', prefix: image_folder).files.map do |file|
    file.key
  end
end
In my controller I have this; in this example I am looking in the folder "1" in the uipstudy bucket:
# Amazon solution:
@images = get_files('1')
@images.each do |image|
  image = "https://s3.amazonaws.com/uipstudy/#{image}"
  @image_array << image
end
The problem is that it's returning the files inside the folder labelled "1", but also those in folders 10, 11, 12, 13, and so on. I assumed the prefix was absolute, but apparently not. Is there a way to enforce that the prefix matches exactly the folder specified?
I think you should be able to make a small change in your script to get the behavior you want. Simply append a forward slash to the prefix so that it clearly shows you want things that are like a directory instead of any/all things that begin with a particular character.
So, that would get you something like:
directory = connection.directories.get('uipimages', prefix: image_folder + '/')
directory.files.map do |file|
  file.key
end
(I just split it into two commands to make it easier to read.)
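Folding that back into the original helper would look roughly like this (credentials are read from ENV here instead of the placeholders in the question):
def get_files(image_folder)
  connection = Fog::Storage.new(
    provider: 'AWS',
    aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
  )
  # The trailing slash makes the prefix behave like a directory boundary
  connection.directories.get('uipimages', prefix: "#{image_folder}/").files.map(&:key)
end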
Below is my solution using the aws-sdk gem.
# initialize S3 client
s3 = AWS::S3.new
bucket = s3.buckets[ENV['AWS_BUCKET']]

# regex for .ipa files in the _inbox folder
regex = %r{_inbox/(?:[^/]+/)*[^/]+\.ipa}i

# get and process the .ipa files
bucket.objects.select { |o| o.key.match(regex) }.each do |ipa|
  # ... (processing omitted in the original)
end

after_create file saving callback resulting in intermittent error

In my user model, I have an after_create callback that looks like this:
def set_default_profile_image
  file = Tempfile.new([self.initials, ".jpg"])
  file.binmode
  file.write(Avatarly.generate_avatar(self.full_name, format: "jpg", size: 300))
  begin
    self.profile_image = File.open(file.path)
  ensure
    file.close
    file.unlink
  end
  self.save
end
(self.initials is simply a utility method that returns the user's initials, so that e.g. my profile image would be "HB.jpg".)
If I call the method directly on an existing user, it works maybe 80% of the time. The other times, it gives me an error message so long I can't reproduce it here (I can't even scroll back far enough in tmux to see the start of it). The error message (or what I can see of it, anyway) comprises a list of MIME types, followed by this bit:
content type discovered from file command: application/x-empty. See documentation to allow this combination.
If I create a new user, the callback results in the same error message 100% of the time.
My method uses the Avatarly gem to generate placeholder avatars; the gem yields them in blob form, hence the creation of a Tempfile to write to.
I can't understand why the above error would occur.
Make sure that full_name has a valid return value, and try moving your save call into the begin section. You may be racing against the save and the tempfile being removed/unlinked.
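That is, roughly (a sketch of the suggestion above, keeping the rest of the callback as in the question):
def set_default_profile_image
  file = Tempfile.new([self.initials, ".jpg"])
  file.binmode
  file.write(Avatarly.generate_avatar(self.full_name, format: "jpg", size: 300))
  begin
    self.profile_image = File.open(file.path)
    self.save # save while the tempfile still exists
  ensure
    file.close
    file.unlink
  end
end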
What do you expect happens when you do this?
self.profile_image = File.open(file.path)
Without a block, this is the same as:
self.profile_image = File.new(file.path)
They both return a file object. Is profile_image in the database? I'm pretty sure it is going to be mad that you sent a File object to be persisted. If you want the data from that file in the database, do something like:
self.profile_image = File.open(file.path).read
If you want to save the tempfile's path:
self.profile_image = File.path(file.path)
If you are using the path remember that you are saving a tempfile, and the file will not last very long!
I found the solution in an issue on Paperclip's GitHub. I don't really understand the causes very well, but it seems to be a filesystem issue, where the Tempfile is not yet persisted to disk by the time it gets read into the model.
The solution is to do absolutely anything to the Tempfile before assigning it; file.read works just fine.
def set_default_profile_image
  file = Tempfile.new([self.initials, ".jpg"])
  file.binmode
  file.write(Avatarly.generate_avatar(self.full_name, format: "jpg", size: 300))
  file.read # <-- this fixes the issue
  begin
    self.profile_image = File.open(file.path)
  ensure
    file.close
    file.unlink
  end
  self.save
end
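If the underlying problem really is buffered data that hasn't reached the disk yet, explicitly flushing the Tempfile before reopening it might be a more direct alternative to the read call; this is an untested sketch, not something taken from the linked issue, and it replaces just the write line above:
file.write(Avatarly.generate_avatar(self.full_name, format: "jpg", size: 300))
file.flush  # force buffered bytes out to disk before File.open reads the path
file.rewind # reset the file position (not strictly needed when reopening by path)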

Rubyzip: Export zip file directly to S3 without writing tmpfile to disk?

I have this code, which writes a zip file to disk, reads it back, uploads it to S3, then deletes the file:
compressed_file = some_temp_path

Zip::ZipOutputStream.open(compressed_file) do |zos|
  some_file_list.each do |file|
    zos.put_next_entry(file.some_title)
    zos.print IO.read(file.path)
  end
end # Write zip file

s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)
bucket.put("#{BUCKET_PATH}/archive.zip", IO.read(compressed_file), {}, 'authenticated-read')

File.delete(compressed_file)
This code already works, but I want to stop creating the zip file on disk to save a few steps. Is there a way to export the zip data directly to S3 without having to first create a tempfile, read it back, and then delete it?
I think I just found the answer to my question.
It's Zip::ZipOutputStream.write_buffer. I'll check this out and update this answer when I get it working.
Update
It does work. My code is like this now:
compressed_filestream = Zip::ZipOutputStream.write_buffer do |zos|
  some_file_list.each do |file|
    zos.put_next_entry(file.some_title)
    zos.print IO.read(file.path)
  end
end # Outputs the zipfile as a StringIO

s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)

compressed_filestream.rewind
bucket.put("#{BUCKET_PATH}/archive.zip", compressed_filestream.read, {}, 'authenticated-read')
write_buffer returns a StringIO, and you need to rewind the stream before reading it. Now I don't need to create and delete the tempfile.
I'm just wondering now whether write_buffer is more memory-intensive or heavier than open, or is it the other way around?
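As an aside, on the modern aws-sdk-s3 gem the final upload of the buffered stream could be sketched roughly like this (the region/bucket env vars and the authenticated-read ACL are assumptions carried over from the code above):
require "aws-sdk-s3"

# Upload the in-memory zip with aws-sdk-s3 instead of the older S3 API used above.
s3 = Aws::S3::Resource.new(region: ENV["AWS_REGION"])
object = s3.bucket(ENV["AWS_BUCKET"]).object("#{BUCKET_PATH}/archive.zip")

compressed_filestream.rewind
object.put(body: compressed_filestream.read, acl: "authenticated-read")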
