Retrieving just file name from AWS S3 in Ruby - ruby-on-rails

I'm making Angular-Rails web app now. I successfully retrieve files from certain path in AWS S3.
Let's say I call below function
@files = bucket.objects.with_prefix('pdf/folder/')
@files.each(:limit => 20) do |file|
  puts file.key
end
file.key prints pdf/folder/file1.pdf, pdf/folder/file2.pdf, etc.
I do not want the whole path, just the file names, like file1.pdf, file2.pdf, etc.
Is regex the only way, or is there an API call for this in AWS S3? I was reading the docs and could not find a related API function.

The call you want is probably File#basename:
puts File.basename(file.key)
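Since S3 keys use "/" as a separator, just like POSIX paths, File.basename extracts the last segment without any network call. A quick illustration:

```ruby
# No S3 request is involved; this is pure string handling.
key = 'pdf/folder/file1.pdf'
puts File.basename(key)          # => file1.pdf
# Drop the extension too, if you only need the bare name:
puts File.basename(key, '.pdf')  # => file1
```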


Rails: Background task for regular feed

I have a new requirement for an existing app and wanted to ask other developers how to go about developing it. I want to export a CSV every 2 days and provide my sales team with feeds, which they bulk-import to a designated storage location, say Google Drive. They can then check the file, do uploads, etc.
I want to run a Heroku Scheduler job every 2 days that exports data from the app and saves the file, in a specific format with a specific header, to that storage.
I know I can write a class method that generates the format, use the strategy pattern to get a specific header, and expose it through respond_to with a csv format so a user can access the file through a URL. But how can I write a rake task that creates the file and uploads it to the specific location I specify?
Would really appreciate any direction.
In Rails, rake tasks are usually stored in lib/tasks and have a .rake extension. For the CSV part, you can use Ruby's built-in CSV library. After generating a CSV you can save it locally or upload it to any service you want (e.g. S3, Google Drive). For example, with S3:
# lib/tasks/csv_tasks.rake
require 'csv'

namespace :csv do
  desc 'Generates new feed'
  # depend on :environment so models (Employee) are loaded
  task feed: :environment do
    client = Aws::S3::Client.new(
      region: 'eu-west-1',
      access_key_id: 'access_key_id',
      secret_access_key: 'secret_access_key'
    )

    feed_csv = CSV.generate do |csv|
      # headers
      csv << [:first_name, :last_name]
      # CSV rows
      Employee.find_each do |employee|
        csv << [employee.first_name, employee.last_name]
      end
    end

    client.put_object(
      body: feed_csv,
      bucket: 'my_bucket',
      key: 'feed.csv'
    )
  end
end
Then have Heroku Scheduler run the defined task: rake csv:feed
You might also consider having a model for your files in order to save their paths and then display them easily in your application.
I advise you to store your S3 and other credentials in the secrets.yml file or the encrypted credentials file (Rails 5.2+). To use the AWS SDK for Ruby, add this to your Gemfile:
gem 'aws-sdk', '~> 3'
And require it in the rake task, if needed. For more info about working with S3 and Ruby, see the SDK documentation.
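To illustrate the credentials advice above, here is a minimal sketch of building the client from the Rails 5.2+ encrypted credentials instead of hard-coded strings; the aws/access_key_id key names below are an assumed layout, not a fixed convention:

```ruby
# config/credentials.yml.enc (edit with `rails credentials:edit`),
# assumed layout:
#
#   aws:
#     access_key_id: AKIA...
#     secret_access_key: ...
#
# The S3 client can then be built without secrets in the source:
client = Aws::S3::Client.new(
  region: 'eu-west-1',
  access_key_id: Rails.application.credentials.dig(:aws, :access_key_id),
  secret_access_key: Rails.application.credentials.dig(:aws, :secret_access_key)
)
```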

Retrieving files from AWS S3 in Ruby

I uploaded multiple PDF files under the path user/pdf/ in AWS S3, so the path for each file looks like user/pdf/file1.pdf, user/pdf/file2.pdf, etc.
In my website (Angular front-end and Rails backend), I'm trying to do four things:
1) Retrieve files under a certain path (user/pdf/).
2) Make a view that lists the names of the files retrieved from that path.
3) Let users click the name of a file and have it open via the S3 endpoint.
4) Delete a file by clicking a button.
I was looking into the AWS S3 docs, but could not find the related API calls. Would love to get some help on performing the above actions.
You should review the Ruby S3 SDK docs.
Listing objects from a bucket:
# enumerate ALL objects in the bucket (even if the bucket contains
# more than 1k objects)
bucket.objects.each do |obj|
  puts obj.key
end

# enumerate at most 20 objects with the given prefix
bucket.objects.with_prefix('photos/').each(:limit => 20) do |photo|
  puts photo.key
end
Getting an object:
# makes no request, returns an AWS::S3::S3Object
obj = bucket.objects['key']
Deleting an object:
bucket.objects.delete('abc')
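For point 3 of the question (letting users open a file through an S3 URL), the same v1 SDK can generate a time-limited signed URL. A minimal sketch, assuming the objects are private and a 10-minute expiry is acceptable:

```ruby
# Generate a temporary, signed URL for a private object. No request
# is made to S3; the URL is computed locally from your credentials.
obj = bucket.objects['user/pdf/file1.pdf']
url = obj.url_for(:read, :expires => 10 * 60)  # valid for 10 minutes
puts url
```

Link to this URL from your view and the browser will fetch the PDF straight from S3 until the URL expires.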

Carrierwave + Fog + S3 remove file without going through a model

I am building an application that has a chat component to it. The application allows users to upload files to the chat. The chat is all JavaScript, but I wanted to use Carrierwave for the uploads because I am using it elsewhere in the application. I am handling the uploads through AJAX so that I can get into Rails land and let Carrierwave take over.
I have been able to get the chat to successfully upload the files to the correct location in my S3 bucket. The thing I can't figure out is how to delete the files. Here is the code that uploads the files; this is the method called from the route that the AJAX call hits.
def upload
  file = File.open(params[:file_0].tempfile)
  uploader = ChatUploader.new
  uploader.store!(file)
end
There is little to no documentation for Carrierwave on uploading files without going through a model, and basically no documentation on removing files without going through a model. I assume it is possible though; I just need to know what to call. So I guess my question is: how do I delete the files?
UPDATE (11/23)
I got the code to save and delete files from S3 using these methods:
# code to save the file
def upload
  file = File.open(params[:file_0].tempfile)
  uploader = ChatUploader.new
  uploader.store!(file)
  uploader.store_path
end

# code to remove files
def remove_file
  file = params[:file]
  uploader = ChatUploader.new
  uploader.retrieve_from_store!(file)
  uploader.remove!
end
My only issue now is that the filename of the uploaded file is not correct. All files are saved with a "RackMultipart" prefix followed by some numbers that look like a date, time, and identifier (example: RackMultipart20141123-17740-1tq4j1g). I need to use the original filename, plus maybe a timestamp for uniqueness.
I believe it has something to do with these two lines:
file = File.open(params[:file_0].tempfile)
and
uploader.store!(file)
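A likely fix, since CarrierWave derives the stored filename from the uploaded object's #original_filename, and a plain File opened from the tempfile only exposes the RackMultipart temp name: pass the Rack uploaded-file object itself to store!. A sketch under that assumption:

```ruby
def upload
  uploader = ChatUploader.new
  # params[:file_0] is the Rack/Rails uploaded-file object; it responds
  # to #original_filename, so CarrierWave keeps the client-side name
  # instead of the RackMultipart tempfile name.
  uploader.store!(params[:file_0])
  uploader.store_path
end
```

For the uniqueness part, CarrierWave's documented approach is to override #filename in the uploader class (e.g. prefixing a timestamp) rather than renaming in the controller.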

Exporting static JSON data to Amazon S3 and retrieving it through AJAX

On my Rails 3 application, I would like to export static JSON data to an Amazon S3 bucket, which can be later retrieved and parsed by an AJAX call from said application.
The JSON will be generated from the app's database.
My design requirements probably only need something like a rake task to initiate the export to S3. Every time the rake task runs, it'll overwrite the files. Preferably the file name will correspond to the ID of the record the JSON data is generated from.
Does anyone have any experience with this and can point me in the right direction?
This can be accomplished with the aws-sdk gem.
Your task could be broken into two basic steps: 1) generate a temporary local file with your json data, 2) upload to S3. A very basic, procedural example of this:
require 'aws-sdk'

# generate local file
record = Record.find(1)
file_name = "my-json-data-#{record.id}"
local_file_path = "/tmp/#{file_name}"

File.open(local_file_path, 'w') do |file|
  file.write(record.to_json)
end

# upload to S3
s3 = AWS::S3.new(
  :access_key_id => 'YOUR_ACCESS_KEY_ID',
  :secret_access_key => 'YOUR_SECRET_ACCESS_KEY')
bucket = s3.buckets['my-s3-bucket-key']
object = bucket.objects[file_name]
object.write(Pathname.new(local_file_path))
Check out the S3Object docs for more info.
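Wired into a rake task, as the question asks, the two steps could look like the sketch below. The namespace and task names are assumptions, and note that in the v1 SDK S3Object#write also accepts a string directly, so the temporary local file is optional:

```ruby
# lib/tasks/export_json.rake (namespace/task names are assumptions)
namespace :export do
  desc 'Export each record as static JSON to S3'
  task json: :environment do
    s3 = AWS::S3.new(
      :access_key_id => 'YOUR_ACCESS_KEY_ID',
      :secret_access_key => 'YOUR_SECRET_ACCESS_KEY')
    bucket = s3.buckets['my-s3-bucket-key']

    Record.find_each do |record|
      # S3Object#write accepts a string, so no temp file is needed;
      # re-running the task overwrites each object in place.
      bucket.objects["my-json-data-#{record.id}"].write(record.to_json)
    end
  end
end
```

Run it with rake export:json, or schedule it (e.g. Heroku Scheduler) to regenerate the files periodically.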

Rails: allow download of files stored on S3 without showing the actual S3 URL to user

I have a Rails application hosted on Heroku. The app generates and stores PDF files on Amazon S3. Users can download these files for viewing in their browser or to save on their computer.
The problem I am having is that although downloading these files via the S3 URL is possible (like "https://s3.amazonaws.com/my-bucket/F4D8CESSDF.pdf"), it is obviously NOT a good way to do it. It is not desirable to expose so much information about the backend to the user, not to mention the security issues that arise.
Is it possible to have my app somehow retrieve the file data from S3 in a controller, then create a download stream for the user, so that the Amazon URL is not exposed?
You can create your S3 objects as private and generate temporary public URLs for them with the url_for method (aws-s3 gem). This way you don't stream files through your app servers, which is more scalable. It also allows session-based authorization (e.g. Devise in your app), tracking of download events, etc.
To do this, change direct links to S3-hosted files into links to a controller action that creates a temporary URL and redirects to it, like this:
class HostedFilesController < ApplicationController
  def show
    s3_name = params[:id] # sanitize name here, restrict access to only some paths, etc
    AWS::S3::Base.establish_connection!( ... )
    url = AWS::S3::S3Object.url_for(s3_name, YOUR_BUCKET, :expires_in => 2.minutes)
    redirect_to url
  end
end
Hiding the Amazon domain in download URLs is usually done with DNS aliasing. You need to create a CNAME record aliasing your subdomain, e.g. downloads.mydomain, to s3.amazonaws.com. Then you can pass the :server option to AWS::S3::Base.establish_connection!(:server => "downloads.mydomain", ...) and the S3 gem will use it when generating links.
Yes, this is possible: just fetch the remote file with Rails and either store it temporarily on your server or send it directly from the buffer. The problem with this is, of course, that you need to fetch the whole file before you can serve it to the user. See this thread for a discussion; their solution is something like this:
# environment.rb
require 'open-uri'

# controller
def index
  data = open(params[:file])
  # open returns an IO; send_data needs the bytes, hence .read
  send_data data.read, :filename => params[:name], ...
end
This issue is also somewhat related.
First, you need to create a CNAME in your domain, as explained here.
Second, you need to create a bucket with the same name you put in the CNAME.
Finally, add this configuration to config/initializers/carrierwave.rb:
CarrierWave.configure do |config|
  ...
  config.asset_host = 'http://bucket_name.your_domain.com'
  config.fog_directory = 'bucket_name.your_domain.com'
  ...
end
