Recursive directory search in Rails - ruby-on-rails

So I have a function here that should take the path to an archive.zip as an argument and recursively dive into every subdirectory until it finds a file with the extension .html.
def path_to_project
"#{recursive_find(File.dirname(self.folder.path))}"
end
path_to_project is used to apply this recursive_find process on the fly as a string, since it's used repeatedly in the process as a whole.
def recursive_find(proj_path)
Dir.glob("#{proj_path}/*") do |object_path|
if object_path.split(".").last == "html"
@found_it = File.dirname(object_path)
else
recursive_find(object_path)
end
end
@found_it
end
Anyways, I have two questions for the smart folks of stackoverflow-
1 - Is my use of the @found_it instance variable correct? Perhaps I should use attr_accessor :found_it instead? Obviously named something else that isn't stupid, maybe :html_file.
Perhaps -
unless @found_it
# do the whole recursive thing
end
return @found_it
# I don't actually have to return the variable right?
2 - Could my recursive method be better? I realize this is pretty open-ended, so by all means, flame away, you angry dwellers. I gladly accept your harsh criticism and wholeheartedly appreciate your good advice :)

If you don't need to use recursion you could just do
Dir["#{proj_path}/**/*.html"]
That should give you a list of all the files that have an .html extension.
As for your questions: whether your use of @found_it is right depends on the bigger picture. Is this function defined in a class or a module? The name of the variable itself could be more meaningful, like @html_file, or it could reflect the role of the file, like @result_page.
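A minimal sketch of the glob approach, assuming the goal is the directory of the first .html file anywhere under a root directory. The helper name first_html_dir is hypothetical, and .min is there only to make the choice deterministic, since the order of glob results can vary with the Ruby version and platform.

```ruby
# Hypothetical helper: returns the directory containing an .html file
# somewhere under root, or nil when no such file exists.
def first_html_dir(root)
  # "**" matches any depth of subdirectories, so no manual recursion is needed.
  html_file = Dir.glob(File.join(root, "**", "*.html")).min
  html_file && File.dirname(html_file)
end
```

Because the method returns its result directly, no instance variable is needed to carry the value out of the search.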

Ruby's Find module is a very easy-to-use and scalable solution. It descends into a directory hierarchy, yielding each element as it is encountered. You can tell it to recurse into particular directories, to ignore files and directories based on attributes, and it is very fast.
This is the example from the documentation:
require 'find'
total_size = 0
Find.find(ENV["HOME"]) do |path|
if FileTest.directory?(path)
if File.basename(path)[0] == ?.
Find.prune # Don't look any further into this directory.
else
next
end
else
total_size += FileTest.size(path)
end
end
I use this to do several scans of directories containing thousands of files. It's easily as fast as using a glob.
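Applied to the original question, the same module can stop at the first .html file it meets. This is only a sketch: the method name find_html_dir and the decision to skip hidden directories are assumptions, not something the question asked for.

```ruby
require "find"

# Hypothetical helper: walks the tree under root and returns the directory
# of the first .html file encountered, or nil if there is none.
def find_html_dir(root)
  Find.find(root) do |path|
    if File.directory?(path)
      # Skip hidden directories (for example .git) without descending into them.
      Find.prune if path != root && File.basename(path).start_with?(".")
      next
    end
    # Returning from inside the block ends the traversal early.
    return File.dirname(path) if File.extname(path).casecmp?(".html")
  end
  nil
end
```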

Related

Monkey patching a core class with business logic with Rails

I have monkey-patched ActiveRecord's find with some business logic, for example:
# lib/core_extensions/active_record/finder_methods/finder.rb
module ActiveRecord
module FinderMethods
def find(*args)
return super if block_given?
#... business logic code => my_error_control = true
raise "My Error" if my_error_control
retorn = find_with_ids(*args)
end
end
end
retorn
I have not seen many examples like this, which raises a question for me:
Where should finder.rb be?
In this example, the file is in lib/core_extensions/..., but since it contains business logic, I think finder.rb should live in app/core_extensions/. Is that right?
Edit, after Sergio's answer:
Are things like this bad practice?
# lib/core_extensions/nil_class/image_attributes.rb
# support for product image attributes
class NilClass
def main_image(size,evita_video)
"/images/paperclip_missing/original/missing.png"
end
end
Where should finder.rb be?
Ultimately, it doesn't matter. It only matters that this code gets loaded. This mix of patching base libraries and adding business logic there looks like something that MUST be documented thoroughly (in the project's wiki or something like that). And if it is documented, then it doesn't matter. The code is where the documentation says it is.
That being out of the way, here's a design suggestion:
when a user seeks a Family with Family.find(params[family_id],session[:company_id]), this find will compare the company of the resulting family, family.company, with the parameter
Why not do something like this:
family = current_company.families.find(params[:family_id])
where current_company can be defined as @current_company ||= Company.find(session[:company_id])
Here, if this company doesn't have this family, you'll get an exception (ActiveRecord::RecordNotFound).
Same effect*, only without any patching. Much more futureproof. You can even add a couple of rubocop rules to ensure that you never write a naked Family.find.
* it's not like you add that patch and rest of your code magically acquires super-powers. No. You still have to change all the finders, to pass that company id.
It's the first time I've seen such a case :). I'd put it in app/core_extensions and check whether live reloading works correctly with it. If not, I'd move it to lib/. (It's just a heuristic.)
Edit:
Instead of extending NilClass I'd rather use regular NullObjects. It's really less surprising and easier to understand.
https://robots.thoughtbot.com/rails-refactoring-example-introduce-null-object
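A rough sketch of what a null object could look like in place of the NilClass patch. The class names ProductImage, MissingProductImage and Product are hypothetical, made up for illustration; only the main_image signature and the missing.png path come from the question.

```ruby
# Hypothetical real image wrapper.
class ProductImage
  def initialize(url)
    @url = url
  end

  def main_image(_size, _evita_video)
    @url
  end
end

# Null object: answers the same message as ProductImage, so callers
# never have to check for nil and NilClass stays untouched.
class MissingProductImage
  MISSING_PATH = "/images/paperclip_missing/original/missing.png"

  def main_image(_size, _evita_video)
    MISSING_PATH
  end
end

# Hypothetical owner that substitutes the null object when no image is set.
class Product
  def initialize(image = nil)
    @image = image
  end

  def image
    @image || MissingProductImage.new
  end
end
```

With this shape, Product.new.image.main_image(:thumb, false) returns the placeholder path, and the fallback lives in one named class instead of on every nil in the program.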

Ruby Class from .rb file

I would like to read Ruby files from the filesystem and get the actual Ruby class:
Dir["app/controllers/admin/*.rb"].select{ |f|
require File.expand_path(f)
#how to turn 'f' into an actual class
}
The problem I have is that both Kernel.load and require just return a boolean. Is there a way to get the actual class? I know that I can use the file path to determine the name, but I would rather not deal with namespaces. How can I do that?
First, I'm going to tell you up front that this is probably a bad idea. Files in Ruby have no relationship to classes whatsoever. A file can define one class, no classes, or many classes, and it can even define classes dynamically based on arbitrary conditions. Additionally, class definitions might be spread across multiple files, and classes can be altered dynamically at runtime. For this reason, determining reliably whether a class is defined in a file is a difficult task, to say the least.
That said, here's one way you might approach the problem. Note that this solution is very hacky, won't work in all cases, and it can load the same file more than once if you're not careful:
module ClassLoader
def self.load_classes(file)
context = Module.new
context.class_eval(File.read(file), file)
context.constants.map{|constant| [constant, context.const_get(constant)]}.to_h
end
end
Usage:
./test_file.rb:
if rand < 0.5
class A
end
else
class B
end
end
class C
end
Your code:
ClassLoader.load_classes('./test_file.rb') #=> {:A=>#<Module:0x9a3c128>::A, :C=>#<Module:0x9a3c128>::C}
Alternately, if you're using Rails, class names can often be inferred from the file name. This is somewhat more dependable, since it relies on the same conventions that Rails does for autoloading constants:
Dir["app/controllers/admin/*.rb"].map{ |f|
# strip the .rb extension and add the Admin:: namespace implied by the directory
"Admin::#{File.basename(f, '.rb').camelize}".constantize
}
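To make the naming convention concrete, here is a plain-Ruby sketch of the path-to-constant mapping without ActiveSupport. The helper constant_name_for and its root parameter are hypothetical, and the sketch ignores acronym inflections and other special cases that Rails handles.

```ruby
# Hypothetical helper: maps a file path to the constant name that the
# Rails naming convention implies for it.
def constant_name_for(path, root: "app/controllers")
  relative = path.delete_prefix("#{root}/").delete_suffix(".rb")
  # Each directory becomes a namespace; snake_case becomes CamelCase.
  relative.split("/").map { |segment|
    segment.split("_").map(&:capitalize).join
  }.join("::")
end

constant_name_for("app/controllers/admin/users_controller.rb")
# => "Admin::UsersController"
```

Once the file has been loaded, Object.const_get accepts the namespaced string and returns the class itself.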

Creating thread-safe non-deleting unique filenames in ruby/rails

I'm building a bulk-file-uploader. Multiple files are uploaded in individual requests, and my UI provides progress and success/fail. Then, once all files are complete, a final request processes/finalizes them. For this to work, I need to create many temporary files that live longer than a single request. Of course I also need to guarantee filenames are unique across app instances.
Normally I would use Tempfile for easy unique filenames, but in this case it won't work because the files need to stick around until another request comes in to further process them. Tempfile auto-unlinks files when they're closed and garbage collected.
An earlier question here suggests using Dir::Tmpname.make_tmpname but this seems to be undocumented and I don't see how it is thread/multiprocess safe. Is it guaranteed to be so?
In C I would open the file with O_EXCL, which fails if the file exists. I could then keep trying until I successfully get a handle on a file with a truly unique name. But Ruby's File.open doesn't seem to have an "exclusive" option of any kind. If the file I'm opening already exists, I have to either append to it, open for writing at the end, or empty it.
Is there a "right" way to do this in ruby?
I have worked out a method that I think is safe, but it seems overly complex:
# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"
# make a tempfile (this is guaranteed to find a unique creatable name)
data_file = Tempfile.new(["upload", ".data"], UPLOAD_BASE)
# but the file will be deleted automatically, which we don't want, so now link it in a stable location
count = 1
loop do
begin
# File.link will raise an exception if the destination path exists
File.link(data_file.path, File.join(UPLOAD_BASE, "#{filename}-#{count}.data"))
# so here we know we created a file successfully and nobody else will take it
break
rescue Errno::EEXIST
count += 1
end
end
# now unlink the original tempfiles (they're still writeable until they're closed)
data_file.unlink
# ... write to data_file and close it ...
NOTE: This won't work on Windows. Not a problem for me, but reader beware.
In my testing this works reliably. But again, is there a more straightforward way?
I would use SecureRandom.
Maybe something like:
p SecureRandom.uuid #=> "2d931510-d99f-494a-8c67-87feb05e1594"
or
p SecureRandom.hex #=> "eb693ec8252cd630102fd0d0fb7c3485"
You can specify the length, and count on an almost impossibly small chance of collision.
I actually found the answer after some digging. Of course the obvious approach is to see what Tempfile itself does. I just assumed it was native code, but it is not. The source for 1.8.7 can be found here for instance.
As you can see, Tempfile uses an apparently undocumented file mode of File::EXCL. So my code can be simplified substantially:
# make a unique filename
time = Time.now
filename = "#{time.to_i}-#{sprintf('%06d', time.usec)}"
data_file = nil
count = 1
loop do
begin
data_file = File.open(File.join(UPLOAD_BASE, "#{filename}-#{count}.data"), File::RDWR|File::CREAT|File::EXCL)
break
rescue Errno::EEXIST
count += 1
end
end
# ... write to data_file and close it ...
UPDATE And now I see that this is covered in a prior thread:
How do open a file for writing only if it doesn't already exist in ruby
So maybe this whole question should be marked a duplicate.
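For reference, the two answers combine naturally: a random name from SecureRandom makes collisions unlikely, and File::EXCL turns any collision that does happen into a retry. The following is a sketch under those assumptions; the method name create_unique_file and its keyword arguments are hypothetical.

```ruby
require "securerandom"

# Hypothetical helper: creates a new, uniquely named file in dir and returns
# the open File. The file is never auto-deleted; the caller closes it and
# removes it once the final request has processed it.
def create_unique_file(dir, prefix: "upload", ext: ".data", attempts: 10)
  attempts.times do
    path = File.join(dir, "#{prefix}-#{SecureRandom.hex(8)}#{ext}")
    begin
      # CREAT | EXCL makes the open fail with EEXIST if the path already
      # exists, so another thread or process cannot get the same file.
      return File.open(path, File::WRONLY | File::CREAT | File::EXCL, 0o600)
    rescue Errno::EEXIST
      next # name already taken, try another random name
    end
  end
  raise "could not create a unique file in #{dir} after #{attempts} attempts"
end
```

The existence check and the creation happen in a single open call, which is what makes this safe across app instances that share the directory.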

Extending/modifying library

I am using RMagick and I don't like one thing:
When I do:
Magick::ImageList.new(path)
path always has to be a local file. So I end up repeating this in many places in my code:
if URI(path).host.nil?
Magick::ImageList.new(path)
else
url_image = open(path)
image = Magick::ImageList.new
image.from_blob(url_image.read)
end
How should I organize that code to avoid repeating it every time I want to create a new Magick::ImageList object? I am using Rails, by the way.
I would suggest wrapping the library with your own class, adding the functionality there. This has the added benefit of keeping the logic all in one place, and letting you customize the functionality to fit your domain better.
Perhaps something along these lines:
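require 'open-uri'
class MySuperRadImageList
def self.open(path)
image_list = if URI(path).host.nil?
Magick::ImageList.new(path)
else
# URI.open comes from open-uri; a bare open(path) here would call self.open again
Magick::ImageList.new.from_blob(URI.open(path).read)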
end
self.new(image_list)
end
def initialize(image_list)
# ...
end
end
I would recommend refactoring the above code, but wanted to show you a concrete example of what I was suggesting (especially that line in the else clause o.O ).

How do you store custom constants in Rails 4?

I made some regular expressions for email, bitmessage etc. and put them as constants in
#config/initializers/regexps.rb
REGEXP_EMAIL = /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
REGEXP_BITMESSAGE = /\ABM-[a-zA-Z1-9&&[^OIl]]{32,34}\z/
and use it like
if @user.contact =~ REGEXP_EMAIL
elsif @user.contact =~ REGEXP_BITMESSAGE
Is that good practice? What's the best way to store them?
It makes sense; that's one of the possible approaches. The only downside is that the constants will pollute the global namespace.
The approach that I normally prefer is to define them inside the application namespace.
Assuming your application is called Fooapp, then you already have a Fooapp module defined by Rails (see config/application).
I normally create a fooapp.rb file inside lib like the following
module Fooapp
end
and I drop the constants inside. Also make sure to require it at the bottom of your application.rb file
require 'fooapp'
Lazy-loading of the file will not work in this case, because the Fooapp module is already defined.
When the number of constants becomes large enough, you can move them into a separate file, for example lib/fooapp/constants.rb. This last step is just a trivial improvement to group all the constants in one place (I tend to use constants a lot to replace magic numbers or for optimization, although the frozen string literal improvements in Ruby 2.1 will probably let me remove several constants).
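As an illustration, lib/fooapp.rb with the two constants from the question might look like the sketch below. It assumes the application module really is named Fooapp and that the "@" characters belong in the email pattern; in a real app this reopens the module Rails already defined.

```ruby
# lib/fooapp.rb (sketch): application-wide constants live under the
# application's own namespace instead of the global one.
module Fooapp
  REGEXP_EMAIL = /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
  REGEXP_BITMESSAGE = /\ABM-[a-zA-Z1-9&&[^OIl]]{32,34}\z/
end
```

Callers then refer to Fooapp::REGEXP_EMAIL and Fooapp::REGEXP_BITMESSAGE, so nothing leaks into the top-level namespace.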
One more thing. In your case, if the regexp is specific to one model, you can store it inside the model itself and create a model method
class User
REGEXP_EMAIL = /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
REGEXP_BITMESSAGE = /\ABM-[123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ]{32,34}\z/
def contact_is_email?
contact =~ REGEXP_EMAIL
end
end
