Pulling, Storing & Destroying Files from an Amazon S3 Bucket w/ Rails - ruby-on-rails

I'm attempting to pull a single PDF file from my Amazon S3 bucket, run OCR on it, and then clear the bucket. This is the basic structure of the code:
class SyncronizeBucketJob
  def perform
    s3_objects = get_s3_files
    create_ocr_documents(s3_objects)
    remove_s3_objects(s3_objects)
  end

  def get_s3_files
    # get the list of S3 files here
  end

  def create_ocr_documents(s3_objects)
    # for each S3 object, create a new OcrDocument with the file from S3
  end

  def remove_s3_objects(s3_objects)
    # physically go and delete each S3 file
  end
end
After reading through the documentation many times, I've gotten to this:
def perform
  s3_objects = get_s3_files
  create_ocr_documents(s3_objects)
  remove_s3_objects(s3_objects)
  # bucket = bucket
  # object_key = object_key
end

def pull_s3_files
  bucket = s3_resource.bucket('')
  bucket.objects.select do |obj|
    # return true if obj is a PDF
  end
end

def create_ocr_documents
  Courts::Utils::PdfVersionConverter.new(input_path: tmp_file_path, output_path: tmp_output_file_path).call
end

def remove_s3_objects
  # physically go and delete each S3 file
  object = s3_resource.bucket(bucket).object(object_key)
  object.delete
end
But it's still not functioning. Being a novice with Rails, and even more so with AWS, I'm at a loss as to how to proceed.
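For what it's worth, here is one way the three steps could be wired together with the aws-sdk-s3 gem. This is only a sketch: the bucket name, region, and temp-file paths are assumptions you'd replace with your own, and the PdfVersionConverter call is copied from the snippet above.
require 'aws-sdk-s3'

class SyncronizeBucketJob
  BUCKET = 'my-bucket-name' # assumption: replace with your bucket

  def perform
    s3_objects = get_s3_files
    create_ocr_documents(s3_objects)
    remove_s3_objects(s3_objects)
  end

  private

  def s3_resource
    # credentials are resolved from the default AWS credential chain here
    @s3_resource ||= Aws::S3::Resource.new(region: ENV['AWS_REGION'])
  end

  # List only the PDF objects in the bucket.
  def get_s3_files
    s3_resource.bucket(BUCKET).objects.select { |obj| obj.key.end_with?('.pdf') }
  end

  # Download each object to a temp file and run it through the converter.
  def create_ocr_documents(s3_objects)
    s3_objects.each do |obj|
      input_path  = File.join(Dir.tmpdir, File.basename(obj.key))
      output_path = File.join(Dir.tmpdir, "ocr-#{File.basename(obj.key)}")
      obj.get(response_target: input_path) # writes the S3 object to disk
      Courts::Utils::PdfVersionConverter.new(input_path: input_path, output_path: output_path).call
      # create your OcrDocument record from output_path here
    end
  end

  # Delete each processed object from the bucket.
  def remove_s3_objects(s3_objects)
    s3_objects.each(&:delete)
  end
end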

Related

Looking for the best way to download a data file, create txt file from data and send to AWS S3?

The title explains it, but to elaborate: I'm using Net::SFTP to log on and get the two data files from the secure FTP server. I then want to create a file from the downloaded data, unless that's unnecessary and I can just skip File.open and send the data to AWS S3 directly.
Here's the code so far. I'm a junior Rails developer and this has not been refactored; it's also not quite working yet. I'm all set up for Amazon::AWS::s3 but need the methods to send the files.
class MissouriJob < ApplicationJob
  queue_as :default

  def perform(*args)
    # Do something later
  end

  def entries(sftp)
    downloads = []
    date_time = Time.new.inspect
    entries = sftp.dir.entries('/Distribution/dor/modlpool_daily/')
    entries.each_with_index do |entry, index|
      # File.open("public/missouri_data/#{entry.name}_#{date_time}.txt", "w")
      downloads << sftp.download!("/Distribution/dor/modlpool_daily/#{entry.name}")
      file = File.open("#{entry.name}_#{date_time}", "w+") { |f| f.write("#{downloads[index]}") }
      byebug
    end
  end

  def logon
    sftp = Net::SFTP.start('moftp.mo.gov', Rails.application.credentials[:missouri_username], password: Rails.application.credentials[:missouri_password])
    entries(sftp)
  end
end
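Since sftp.download! with no local path returns the remote file's contents as a String, one option is to skip File.open entirely and push each download straight to S3. The sketch below assumes the aws-sdk-s3 gem with credentials in the default AWS credential chain; the method name, bucket name, and key prefix are placeholders, not part of the original code.
require 'aws-sdk-s3'

# Sketch only: push each SFTP download straight to S3 without writing a local file.
def upload_entries_to_s3(sftp)
  s3 = Aws::S3::Resource.new(region: ENV['AWS_REGION'])
  bucket = s3.bucket('my-bucket-name')
  date_time = Time.now.strftime('%Y-%m-%d_%H-%M-%S')

  sftp.dir.entries('/Distribution/dor/modlpool_daily/').each do |entry|
    # download! with no local path returns the remote file's contents as a String
    data = sftp.download!("/Distribution/dor/modlpool_daily/#{entry.name}")
    key  = "missouri/#{entry.name}_#{date_time}.txt"
    bucket.object(key).put(body: data) # no intermediate File.open needed
  end
end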

How can I customize the path of a Rails 5.2 ActiveStorage attachment in Amazon S3?

When adding attachments such as
has_one_attached :resume_attachment
saved files end up in the top level of the S3 bucket. How can I add them to subdirectories? For example, my old Paperclip configuration could categorize files into directories by model name.
You cannot. At the time of writing, the only option available for the has_one_attached and has_many_attached macros is :dependent.
https://github.com/rails/rails/blob/master/activestorage/lib/active_storage/attached/macros.rb#L30
See also (perhaps the reason for the downvotes, although it is about "direct" upload): How to specify a prefix when uploading to S3 using ActiveStorage's direct upload?
The response is from the main maintainer of Active Storage.
Use a before_validation hook to set the desired S3 key and the desired filename used for the Content-Disposition property of the object on S3.
The key and filename properties on the attachment make their way through the Active Storage S3 service and are converted into the S3 object key and Content-Disposition object properties.
class MyCoolItem < ApplicationRecord
  has_one_attached :preview_image
  has_one_attached :download_asset

  before_validation :set_correct_attachment_filenames

  def preview_image_path
    # This key has to be unique across all assets. Fingerprint it yourself.
    "/previews/#{item_id}/your/unique/path/on/s3.jpg"
  end

  def download_asset_path
    # This key has to be unique across all assets. Fingerprint it yourself.
    "/downloads/#{item_id}/your/unique/path/on/s3.jpg"
  end

  def download_asset_filename
    "my-friendly-filename-#{item_id}.jpg"
  end

  def set_correct_attachment_filenames
    # Set the location on S3 for new uploads:
    preview_image.key = preview_image_path if preview_image.new_record?
    download_asset.key = download_asset_path if download_asset.new_record?
    # Set the content disposition header of the object on S3:
    download_asset.filename = download_asset_filename if download_asset.new_record?
  end
end

Rails: generate a screenshot but how to save it to / retrieve it from S3 via carrierwave/fog

I can generate a screenshot via the 'Gastly' gem.
There is no model involved; on a request to the screens controller, a screenshot is taken and saved to the public folder.
ScreensController:
def index
  id = current_user.id
  first_name = current_user.first_name
  target = "http://localhost:3000/users/#{id}?start_date=#{params[:start_date]}"
  @screenshot = Gastly.screenshot(target)
  @screenshot.selector = '.simple-calendar'
  @image = @screenshot.capture
  date_mark = params[:start_date].split('-')[0..1].join('-')
  if @image.save("public/images/image-#{id}-#{first_name}-#{date_mark}.png")
    redirect_to user_path(current_user)
  end
end
I tried to store the file to S3 using CarrierWave and Fog.
I generated ScreenshotUploader with rails g uploader Screenshot.
In screenshot_uploader.rb I uncommented include CarrierWave::MiniMagick and set storage :fog; as for def store_dir, I just put "uploads/" instead of the default "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}", since there's no model.
Then I modified the ScreensController above:
if @image.save("public/images/image-#{id}-#{first_name}-#{date_mark}.png")
  uploader = ScreenshotUploader.new
  uploader.store!(@image)
  redirect_to user_path(current_user)
end
Result: the screenshot is only saved to the public folder, and nothing appears in S3.
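One likely culprit is that CarrierWave's store! expects a file-like object, while @image here is the Gastly/MiniMagick image. A hedged sketch that slots into the index action above, opening the saved PNG and handing it to the uploader (it assumes fog_credentials and fog_directory are already configured in config/initializers/carrierwave.rb):
# Sketch only: store the saved screenshot on S3 via a standalone CarrierWave uploader.
path = "public/images/image-#{id}-#{first_name}-#{date_mark}.png"

if @image.save(path)
  uploader = ScreenshotUploader.new
  # store! wants a File/IO object, not the Gastly image itself
  File.open(path) { |file| uploader.store!(file) }
  Rails.logger.info "Stored at: #{uploader.url}" # remote URL after a successful store
  redirect_to user_path(current_user)
end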

How to store the path to an S3 bucket uploaded file?

I have two models: Organization and Drawing. They have a 1:1 relationship. Drawing contains two attributes: organization_id and file_path.
At signup of a new organization, I have my Rails application automatically copy a standard file for that organization to my S3 bucket. file_path in the Drawing model should hold the path where the file is stored.
To the organization's controller I have added a reference to the method upload_file, which I have included in the Drawing model.
Controller:
def create
  @organization = Organization.new(organizationnew_params)
  if @organization.save
    Drawing.upload_file(@organization.id)
    redirect_to root_url
  end
end
Model:
def self.upload_file(id)
  s3 = Aws::S3::Resource.new(
    credentials: Aws::Credentials.new(ENV['S3_ACCESS_KEY'], ENV['S3_SECRET_KEY']),
    region: ENV['AWS_REGION']
  )
  xmlfile = 'app/assets/other/graph.xml'
  key = "uploads/#{id}/xml-#{id}.xml"
  obj = s3.bucket('mybucketname').object(key)
  obj.upload_file(xmlfile)
  Conceptmap.create!(organization_id: id, xml_file: obj.public_url)
end
Question: My question concerns the correctness of xml_file: obj.public_url on the last line. Here I want to save the bucket path of the uploaded file, but I wonder whether this is a secure way to do it (or should this path be a hash)? How does this work with a CarrierWave uploader; is it automatically hashed there? I don't want just anyone to be able to browse to the file and open it.
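Note that public_url is only reachable if the object itself is public. One hedged alternative, adapting the upload_file method above: store just the key, keep the object private, and generate a short-lived presigned URL whenever the file is needed. The helper name, bucket name, and expiry below are illustrative assumptions.
# Sketch only: persist the S3 key instead of a public URL.
Conceptmap.create!(organization_id: id, xml_file: key)

def self.presigned_url_for(key)
  s3 = Aws::S3::Resource.new(
    credentials: Aws::Credentials.new(ENV['S3_ACCESS_KEY'], ENV['S3_SECRET_KEY']),
    region: ENV['AWS_REGION']
  )
  # The URL expires after an hour, so the object itself can stay private.
  s3.bucket('mybucketname').object(key).presigned_url(:get, expires_in: 3600)
end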

How to save a raw_data photo using paperclip

I'm using jpegcam to allow a user to take a webcam photo to set as their profile photo. This library ends up posting the raw data to the server, which I receive in my Rails controller like so:
def ajax_photo_upload
  # Rails.logger.info request.raw_post
  @user = User.find(current_user.id)
  @user.picture = File.new(request.raw_post)
This does not work and paperclip/rails fails when you try to save request.raw_post.
Errno::ENOENT (No such file or directory - ????JFIF???
I've seen solutions that make a temporary file, but I'd be curious to know if there is a way to get Paperclip to save request.raw_post automatically without having to make a tempfile. Any elegant ideas or solutions out there?
UGLY SOLUTION (Requires a temp file)
class ApiV1::UsersController < ApiV1::APIController
  def create
    File.open(upload_path, 'w:ASCII-8BIT') do |f|
      f.write request.raw_post
    end
    current_user.photo = File.open(upload_path)
  end

  private

  def upload_path # is used in upload and create
    file_name = 'temp.jpg'
    File.join(::Rails.root.to_s, 'public', 'temp', file_name)
  end
end
This is ugly as it requires a temporary file to be saved on the server. Any tips on how to make this happen without the temporary file needing to be saved? Can StringIO be used?
The problem with my previous solution was that the temp file was already closed and therefore could not be used by Paperclip anymore. The solution below works for me. It's IMO the cleanest way and (as per documentation) ensures your tempfiles are deleted after use.
Add the following method to your User model:
def set_picture(data)
  temp_file = Tempfile.new(['temp', '.jpg'], :encoding => 'ascii-8bit')
  begin
    temp_file.write(data)
    self.picture = temp_file # assumes has_attached_file :picture
  ensure
    temp_file.close
    temp_file.unlink
  end
end
Controller:
current_user.set_picture(request.raw_post)
current_user.save
Don't forget to add require 'tempfile' at the top of your User model file.
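As for the StringIO question above: Paperclip's IO adapters can generally accept a StringIO, as long as it responds to original_filename (and ideally content_type). The sketch below is under that assumption and isn't a verified drop-in for every Paperclip version; the method and filename are illustrative.
# Sketch only: assign the raw POST body to Paperclip via a StringIO, no file on disk.
def set_picture_from_raw(data)
  filename = "webcam-#{id}.jpg"
  io = StringIO.new(data)
  # Give the IO object the metadata Paperclip normally reads from an uploaded file.
  io.define_singleton_method(:original_filename) { filename }
  io.define_singleton_method(:content_type) { 'image/jpeg' }
  self.picture = io # assumes has_attached_file :picture
end
In the controller this would be used the same way as set_picture: current_user.set_picture_from_raw(request.raw_post), then current_user.save.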
