I'm working on a rake task for a Rails 4.0 application in order to copy some old image URLs from outside my website or S3 into S3. I can verify that my credentials are correct and loaded with printenv. For some reason though, I get a 403 error every time I run the task. My system time is synced with a reliable time server and the avatars directory I'm attempting to save to in S3 is set to public. Any ideas on what could be causing this? Full rake task code below:
namespace :arthackday do
desc "Grabs old profile pics from participants and saves them to S3."
task :update_participant_photos => :environment do
require 'open-uri'
require 'aws/s3'
s3_base_url = "https://s3-us-west-1.amazonaws.com/arthackday/participants/avatars/"
Participant.all.each do |participant|
if participant.photo_url?
file_name = "#{participant.name.gsub(/ /,"-")}.jpg"
File.open("#{Rails.root}/public/img/temp/#{file_name}", 'wb') do |fo|
fo.write open(participant.photo_url).read
end
s3 = AWS::S3.new(:access_key_id => ENV['AWS_ACCESS_KEY_ID'], :secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'])
s3.buckets[ENV['S3_BUCKET_NAME']].objects["participants/avatars/#{file_name}"].write(:file => "#{Rails.root}/public/img/temp/#{file_name}")
participant.photo_url = s3_base_url + file_name
participant.save!
puts "New URL for #{participant.name}'s photo: #{participant.photo_url}"
else
puts "#{participant.name} does not have a photo_url"
end
end
end
end
Related
So here is the thing: currently our files, when user downloads them, have names like 897123uiojdkashdu182uiej.pdf. I need to change that to file-name.pdf.
And logically I go and change paperclip.rb config from this:
Paperclip::Attachment.default_options.update({
path: '/:hash.:extension',
hash_secret: Rails.application.secrets.secret_key_base
})
to this:
Paperclip::Attachment.default_options.update({
path: "/attachment/#{SecureRandom.urlsafe_base64(64)}/:filename",
hash_secret: Rails.application.secrets.secret_key_base
})
which works just fine, filenames are great. However, old files are now unaccessable due to the change in the path. So I came up with the following decision
First I made a rake task which will store the old paths in the database:
namespace :paperclip do
desc "Set old urls for attachments"
task :update_old_urls => :environment do
Asset.find_each do |asset|
if asset.attachment
attachment_url = asset.attachment.try!(:url)
file_url = "https:#{attachment_url}"
puts "Set old url asset attachment #{asset.id} - #{file_url}"
asset.update(old_url: file_url)
else
puts "No attachment found in asset #{asset.id}"
end
end
end
end
Now the asset.old_url stores the current url of the file. Then I go and change the config, making the file unaccessable.
Now it's time for the new rake task:
require 'uri'
require 'open-uri'
namespace :paperclip do
desc "Recreate attachments and save them to new destination"
task :move_attachments => :environment do
Asset.find_each do |asset|
unless asset.old_url.blank?
url = asset.old_url
filename = File.basename(asset.attachment.path)
file = File.new("#{Rails.root}/tmp/#{filename}", "wb")
file.write(open(url).read)
if File.exists? file
puts "Re-saving asset attachment #{asset.id} - #{filename}"
asset.attachment = file
asset.save
# if there are multiple styles, you want to recreate them :
asset.attachment.reprocess!
file.close
else
puts "Missing file attachment #{asset.id} - #{filename}"
end
File.delete(file)
end
end
end
end
But my plan didn't work at all, I didn't get access to the files, and the asset.url still isn't equal to asset.old_url.
Would appreciate help very much!
With S3, you can set the "filename upon saving" as a header. Specifically, the user will get to an url https://foo.bar.com/mangled/path/some/weird/hash/whatever?options and when the browser will offer to save, you can control the filename (not the url).
The trick to that relies on the browser reading the Content-Disposition header from the response, if it reads Content-Disposition: attachment; filename="filename.jpg" it will save (or ask the user to save as) filename.jpg, independently on the original URL.
You can force S3 to add this header by adding one more parameter to the URL or by setting a metadata on the file.
The former can be done by passing it to the url method:
has_attached_file :attachment,
s3_url_options: ->(instance) {
{response_content_disposition: "attachment; filename=\"#{instance.filename}\""}
}
Check https://github.com/thoughtbot/paperclip/blob/v6.1.0/lib/paperclip/storage/s3.rb#L221-L225 for the relevant source code.
The latter can be done in bulk via paperclip (and you should also configure it to do it on new uploads). It will also take a long time!!
Asset.find_each do |asset|
next unless asset.attachment
s3_object = asset.attachment.s3_object
s3_object.copy_to(
s3_object,
metadata_directive: 'REPLACE',
content_disposition: "attachment; filename=\"#{asset.filename}\")"
)
end
# for new uploads
has_attached_file :attachment,
s3_headers: ->(att) {
{content_disposition: "attachment; filename=\"#{att.model.filename}\""}
}
I have (what I thought was) a perfectly working Heroku-Carrierwave-AWS.
I can upload images like a charm.
I now need to send the respective images, via a JSON request, to an app. This has been working on my test server, but for some reason I'm getting the following from my Heroku Logs from my rails call:
Started POST "/downloadUserPhotos" for ?? at 2014-05-03 03:27:38 +0000
Errno::ENOENT (No such file or directory # rb_sysopen - uploads/photo/mainphoto/1/largeimage.jpg):
app/controllers/stats_controller.rb:22:in `read'
app/controllers/stats_controller.rb:22:in `downloadPhotos'
I'm pretty sure this has something to do with the following Ruby/Rails code:
def downloadPhotos
#photos = Photo.find_by_user_id(current_user.id)
#mainphoto = Base64.strict_encode64(File.read(#photos.mainphoto.current_path))
end
When I use my console on Heroku and type the following:
#photos = Photo.find(1)
It works and I get the correct record shown. When I ask for current_path for mainphoto, I get:
irb(main):002:0> #photos.mainphoto.current_path
=> "uploads/photo/mainphoto/1/largeimage.jpg"
So, it knows it exists. And it's in the right place.
Can anyone enlighten me (or point me in the right direction) as to why I can't use File.read. And, more importantly, how I get it to now read the image file and encode it??
This has perplexed me somewhat.
I've tried to use #photos.mainphoto.url, but other than giving me the whole URL, it still doesn't find the file using File.read.
My carrier wave config is:
CarrierWave.configure do |config|
config.fog_credentials = {
# Configuration for Amazon S3
:provider => 'AWS',
:aws_access_key_id => ENV['AWS_ACCESS_KEY_ID'],
:aws_secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'],
:region => ENV['S3_REGION']
}
config.cache_dir = "#{Rails.root}/tmp/uploads"
config.fog_directory = ENV['S3_BUCKET_NAME']
end
And I have the following in my Uploader:
def store_dir
"uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.user.id}"
end
Thanks in advance.
In the line:
#mainphoto = Base64.strict_encode64(File.read(#photos.mainphoto.current_path))
Change current_path for url
I am using Carrierwave with 3 separate models to upload photos to S3. I kept the default settings for the uploader, which was to store photos in a root S3 bucket. I then decided to store them in sub-directories according to model name like /avatars, items/, etc. based on the model they were uploaded from...
Then, I noticed that files of the same name were being overwritten and when I deleted a model record, the photo wasn't being deleted.
I've since changed the store_dir from an uploader-specific setup like this:
def store_dir
"items"
end
to a generic one which stores photo under the model ID (I use mongo FYI):
def store_dir
"uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
end
Here comes the problem. I am trying to move all the photos already into S3 into the proper "directory" within S3. From what I've ready, S3 doesn't have directories per se. I'm having trouble with the rake task. Since i changed the store_dir, Carrierwave is looking for all the photos previously uploaded in the wrong directory.
namespace :pics do
desc "Fix directory location of pictures on s3"
task :item_update => :environment do
connection = Fog::Storage.new({
:provider => 'AWS',
:aws_access_key_id => 'XXXX',
:aws_secret_access_key => 'XXX'
})
directory = connection.directories.get("myapp-uploads-dev")
Recipe.all.each do |l|
if l.images.count > 0
l.items.each do |i|
if i.picture.path.to_s != ""
new_full_path = i.picture.path.to_s
filename = new_full_path.split('/')[-1].split('?')[0]
thumb_filename = "thumb_#{filename}"
original_file_path = "items/#{filename}"
puts "attempting to retrieve: #{original_file_path}"
original_thumb_file_path = "items/#{thumb_filename}"
photo = directory.files.get(original_file_path) rescue nil
if photo
puts "we found: #{original_file_path}"
photo.expires = 2.years.from_now.httpdate
photo.key = new_full_path
photo.save
thumb_photo = directory.files.get(original_thumb_file_path) rescue nil
if thumb_photo
puts "we found: #{original_thumb_file_path}"
thumb_photo.expires = 2.years.from_now.httpdate
thumb_photo.key = "/uploads/item/picture/#{i.id}/#{thumb_filename}"
thumb_photo.save
end
end
end
end
end
end
end
end
So I'm looping through all the Recipes, looking for items with photos, determining the old Carrierwave path, trying to update it with the new one based on the store_dir change. I thought if I simply updated the photo.key with the new path, it would work, but it's not.
What am I doing wrong? Is there a better way to accomplish the ask here?
Here's what I did to get this working...
namespace :pics do
desc "Fix directory location of pictures"
task :item_update => :environment do
connection = Fog::Storage.new({
:provider => 'AWS',
:aws_access_key_id => 'XXX',
:aws_secret_access_key => 'XXX'
})
bucket = "myapp-uploads-dev"
puts "Using bucket: #{bucket}"
Recipe.all.each do |l|
if l.images.count > 0
l.items.each do |i|
if i.picture.path.to_s != ""
new_full_path = i.picture.path.to_s
filename = new_full_path.split('/')[-1].split('?')[0]
thumb_filename = "thumb_#{filename}"
original_file_path = "items/#{filename}"
original_thumb_file_path = "items/#{thumb_filename}"
puts "attempting to retrieve: #{original_file_path}"
# copy original item
begin
connection.copy_object(bucket, original_file_path, bucket, new_full_path, 'x-amz-acl' => 'public-read')
puts "we just copied: #{original_file_path}"
rescue
puts "couldn't find: #{original_file_path}"
end
# copy thumb
begin
connection.copy_object(bucket, original_thumb_file_path, bucket, "uploads/item/picture/#{i.id}/#{thumb_filename}", 'x-amz-acl' => 'public-read')
puts "we just copied: #{original_thumb_file_path}"
rescue
puts "couldn't find thumb: #{original_thumb_file_path}"
end
end
end
end
end
end
end
Perhaps not the prettiest thing in the world, but it worked.
You need to be interacting with the S3 Objects directly to move them. You'll probably want to look at copy_object and delete_object in the Fog gem, which is what CarrierWave uses to interact with S3.
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/copy_object.rb
https://github.com/fog/fog/blob/8ca8a059b2f5dd2abc232dd2d2104fe6d8c41919/lib/fog/aws/requests/storage/delete_object.rb
Is there a recommended technique for migrating a large set of paperclip S3 images to a new :url and :path format?
The reason for this is because after upgrading to rails 3.1, new versions of thumbs are not being shown after cropping (previously cached version is shown). This is because the filename no longer changes (since asset_timestamp was removed in rails 3.1). I'm using :fingerprint in the url/path format, but this is generated from the original, which doesn't change when cropping.
I was intending to insert :updated_at in the url/path format, and update attachment.updated_at during cropping, but after implementing that change all existing images would need to be moved to their new location. That's around half a million images to rename over S3.
At this point I'm considering copying them to their new location first, then deploying the code change, then moving any images which were missed (ie uploaded after the copy), but I'm hoping there's an easier way... any suggestions?
I had to change my paperclip path in order to support image cropping, I ended up creating a rake task to help out.
namespace :paperclip_migration do
desc 'Migrate data'
task :migrate_s3 => :environment do
# Make sure that all of the models have been loaded so any attachments are registered
puts 'Loading models...'
Dir[Rails.root.join('app', 'models', '**/*')].each { |file| File.basename(file, '.rb').camelize.constantize }
# Iterate through all of the registered attachments
puts 'Migrating attachments...'
attachment_registry.each_definition do |klass, name, options|
puts "Migrating #{klass}: #{name}"
klass.find_each(batch_size: 100) do |instance|
attachment = instance.send(name)
unless attachment.blank?
attachment.styles.each do |style_name, style|
old_path = interpolator.interpolate(old_path_option, attachment, style_name)
new_path = interpolator.interpolate(new_path_option, attachment, style_name)
# puts "#{style_name}:\n\told: #{old_path}\n\tnew: #{new_path}"
s3_copy(s3_bucket, old_path, new_path)
end
end
end
end
puts 'Completed migration.'
end
#############################################################################
private
# Paperclip Configuration
def attachment_registry
Paperclip::AttachmentRegistry
end
def s3_bucket
ENV['S3_BUCKET']
end
def old_path_option
':class/:id_partition/:attachment/:hash.:extension'
end
def new_path_option
':class/:attachment/:id_partition/:style/:filename'
end
def interpolator
Paperclip::Interpolations
end
# S3
def s3
AWS::S3.new(access_key_id: ENV['S3_KEY'], secret_access_key: ENV['S3_SECRET'])
end
def s3_copy(bucket, source, destination)
source_object = s3.buckets[bucket].objects[source]
destination_object = source_object.copy_to(destination, {metadata: source_object.metadata.to_h})
destination_object.acl = source_object.acl
puts "Copied #{source}"
rescue Exception => e
puts "*Unable to copy #{source} - #{e.message}"
end
end
Didn't find a feasible method for migrating to a new url format. I ended up overriding Paperclip::Attachment#generate_fingerprint so it appends :updated_at.
I am running a Rails app using Paperclip to take care of file attachments and image resizing, etc. The app is currently hosted on EngineYard cloud, and all attachments are stored in their EBS. Thinking about using S3 to handle all Paperclip attachments.
Does anyone know of a good and safe way for this migration? many thanks!
You could work up a rake task that iterates over your attachments and pushes each to S3. I used this one awhile back with attachment_fu -- wouldn't be too different. This uses the aws-s3 gem.
Basically the process is:
1. Select files from the database that need to be moved
2. Push them to S3
3. Update database to reflect that the file is no longer stored locally (this way you can do them in batches and don't need to worry about pushing the same file twice).
#attachments = Attachment.stored_locally
#attachments.each do |attachment|
base_path = RAILS_ROOT + '/public/assets/'
attachment_folder = ((attachment.respond_to?(:parent_id) && attachment.parent_id) || attachment.id).to_s
full_filename = File.join(base_path, ("%08d" % attachment_folder).scan(/..../), attachment.filename)
require 'aws/s3'
AWS::S3::Base.establish_connection!(
:access_key_id => S3_CONFIG[:access_key_id],
:secret_access_key => S3_CONFIG[:secret_access_key]
)
AWS::S3::S3Object.store(
'assets/' + attachment_folder + '/' + attachment.filename,
File.open(full_filename),
S3_CONFIG[:bucket_name],
:content_type => attachment.content_type,
:access => :private
)
if AWS::S3::Service.response.success?
# Update the database
attachment.update_attribute(:stored_on_s3, true)
# Remove the file on the local filesystem
FileUtils.rm full_filename
# Remove directory also if it is now empty
Dir.rmdir(File.dirname(full_filename)) if (Dir.entries(File.dirname(full_filename))-['.','..']).empty?
else
puts "There was a problem uploading " + full_filename
end
end
I found myself in the same situation and took bensie's code and made it work for myself - this is what I came up with:
require 'aws/s3'
# Ensure you do the following:
# export AMAZON_ACCESS_KEY_ID='your-access-key'
# export AMAZON_SECRET_ACCESS_KEY='your-secret-word-thingy'
AWS::S3::Base.establish_connection!
#failed = []
#attachments = Asset.all # Asset paperclip attachment is: has_attached_file :attachment....
#attachments.each do |asset|
begin
puts "Processing #{asset.id}"
base_path = RAILS_ROOT + '/public/'
attachment_folder = ((asset.respond_to?(:parent_id) && asset.parent_id) || asset.id).to_s
styles = asset.attachment.styles.keys
styles << :original
styles.each do |style|
full_filename = File.join(base_path, asset.attachment.url(style, false))
AWS::S3::S3Object.store(
'attachments/' + attachment_folder + '/' + style.to_s + "/" + asset.attachment_file_name,
File.open(full_filename),
"swellnet-assets",
:content_type => asset.attachment_content_type,
:access => (style == :original ? :private : :public_read)
)
if AWS::S3::Service.response.success?
puts "Stored #{asset.id}[#{style.to_s}] on S3..."
else
puts "There was a problem uploading " + full_filename
end
end
rescue
puts "Error with #{asset.id}"
#failed << asset.id
end
end
puts "Failed uploads: #{#failed.join(", ")}" unless #failed.empty?
Of course, if you have multiple models you will need to adjust as necessary...