Metaprogramming String#scan and globals? - ruby-on-rails

My goal is to replace methods in the String class with other methods that do additional work (this is for a research project). This works for many methods by writing code in the String class similar to
alias_method :center_OLD, :center
def center(*args)
r = self.send(*([:center_OLD] + args))
#do some work here
#return something
end
For some methods, I need to handle a Proc as well, which is no problem. However, for the scan method, invoking it has the side effect of setting special global variables from the regular expression match. As documented, these variables are local to the thread and the method.
Unfortunately, some Rails code makes calls to scan which makes use of the $& variable. That variable gets set inside my version of the scan method, but because it's local, it doesn't make it back to the original caller which uses the variable.
Does anyone know a way to work around this? Please let me know if the problem needs clarification.
If it helps at all, all the uses I've seen so far of the $& variable are inside a Proc passed to the scan function, so I can get the binding for that Proc. However, the user doesn't seem to be able to change $& at all, so I don't know how that will help much.
Current Code
class String
alias_method :scan_OLD, :scan
def scan(*args, &b)
begin
sargs = [:scan_OLD] + args
if b.class == Proc
r = self.send(*sargs, &b)
else
r = self.send(*sargs)
end
r
rescue => error
puts error.backtrace.join("\n")
end
end
end
Of course I'll do more things before returning r, but even this is problematic, so for simplicity we'll stick with this. As a test case, consider:
"hello world".scan(/l./) { |x| puts x }
This works fine both with and without my version of scan. With the "vanilla" String class this produces the same thing as
"hello world".scan(/l./) { puts $&; }
Namely, it prints "ll" and "ld" and returns "hello world". With the modified string class it prints two blank lines (since $& was nil) and then returns "hello world". I'll be happy if we can get that working!

You cannot set $&, because it is derived from $~, the last MatchData.
However, $~ can be set and that actually does what you want.
The trick is to set it in the block binding.
The code is inspired by the old Ruby implementation of Pathname.
(The new code is in C and does not need to care about Ruby frame-local variables)
class String
alias_method :scan_OLD, :scan
def scan(*args, &block)
sargs = [:scan_OLD] + args
if block
self.send(*sargs) do |*bargs|
Thread.current[:string_scan_matchdata] = $~
eval("$~ = Thread.current[:string_scan_matchdata]", block.binding)
yield(*bargs)
end
else
self.send(*sargs)
end
end
end
The saving of the thread-local (well, actually fiber-local) variable seems unnecessary since it is only used to pass the value and the thread never reads any other value than the last one set. It probably is there to restore the original value (most likely nil, because the variable did not exist).
One way to avoid thread-locals altogether is to create a setter for $~ as a lambda (although this does create a new lambda on each call):
self.send(*sargs) do |*bargs|
eval("lambda { |m| $~ = m }", block.binding).call($~)
yield(*bargs)
end
With any of these, your example works!
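To see why assigning $~ is enough, here is a minimal standalone sketch (the method name match_in_callee is made up for illustration). It suggests that $& simply reflects whatever MatchData is currently in $~, and that a match performed inside a method does not leak into the caller's frame:

```ruby
# A match performed inside a method only sets that method's own $~.
def match_in_callee(str)
  str =~ /o/
  $&            # value of $& inside the callee
end

$~ = nil                         # start with no match in this frame
inner = match_in_callee("hello world")
outer_after_call = $&            # the callee's match is frame-local

m = /l./.match("hello world")    # keep a MatchData object around
$~ = nil
$~ = m                           # $& cannot be assigned, but $~ can
outer_after_assign = $&          # now derived from m
```

Here inner holds the callee's match while outer_after_call stays nil, which is the same effect the wrapped scan runs into.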

I wrote simple code simulating the problem:
"hello world".scan(/l./) { |x| puts x }
"hello world".scan(/l./) { puts $&; }
class String
alias_method :origin_scan, :scan
def scan *args, &b
args.unshift :origin_scan
@mutex ||= Mutex.new
begin
self.send *args do |a|
break if !block_given?
@mutex.synchronize do
p $&
case b.arity
when 0
b.call
when 1
b.call a
end
end
end
rescue => error
p error, error.backtrace.join("\n")
end
end
end
"hello world".scan(/l./) { |x| puts x }
"hello world".scan(/l./) { puts $& }
I found the following: the value of $& changes across the call to b.call. Just before the call (the p $& line), $& holds a valid value, but inside the block it is no longer valid. My guess is that this happens because the match variables are saved and restored when the frame changes, so the block invoked through call probably cannot see the local state of scan.
I see two options: the first is to avoid relying on these global variables in the redefined methods; the second is to dig deeper into the Ruby sources.

Related

method.to_proc doesn't return from enclosed function

I was trying to DRY up a Rails controller by extracting a method that includes a guard clause to return prematurely from the controller method in the event of an error. I thought this may be possible using a to_proc, like this pure Ruby snippet:
def foo(string)
processed = method(:breaker).to_proc.call(string)
puts "This step should not be executed in the event of an error"
processed
end
def breaker(string)
begin
string.upcase!
rescue
puts "Well you messed that up, didn't you?"
return
end
string
end
My thinking was that having called to_proc on the breaker method, calling the early return statement in the rescue clause should escape the execution of foo. However, it didn't work:
2.4.0 :033 > foo('bar')
This step should not be executed in the event of an error
=> "BAR"
2.4.0 :034 > foo(2)
Well you messed that up, didn't you?
This step should not be executed in the event of an error
=> nil
Can anyone please
Explain why this doesn't work
Suggest a way of achieving this effect?
Thanks in advance.
EDIT: as people are wondering why the hell I would want to do this, the context is that I'm trying to DRY up the create and update methods in a Rails controller. (I'm trying to be aggressive about it as both methods are about 60 LoC. Yuck.) Both methods feature a block like this:
some_var = nil
if (some complicated condition)
# do some stuff
some_var = computed_value
elsif (some marginally less complicated condition)
@error_msg = 'This message is the same in both actions.'
render partial: "show_user_the_error" and return
end
# rest of controller actions ...
Hence, I wanted to extract this as a block, including the premature return from the controller action. I thought this might be achievable using a Proc, and when that didn't work I wanted to understand why (which I now do thanks to Marek Lipa).
What about
def foo(string)
processed = breaker(string)
puts "This step should not be executed in the event of an error"
processed
rescue ArgumentError
end
def breaker(string)
begin
string.upcase!
rescue
puts "Well you messed that up, didn't you?"
raise ArgumentError.new("could not call upcase! on #{string.inspect}")
end
string
end
After all this is arguably a pretty good use case for an exception.
It seems part of the confusion is that a Proc, or a lambda for that matter, is distinctly different from a closure (block).
Even if you could convert Method#to_proc to a standard Proc e.g. Proc.new this would simply result in a LocalJumpError because the return would be invalid in this context.
You can use next to break out of a standard Proc but the result would be identical to the lambda that you have now.
The reason Method#to_proc returns a lambda is because a lambda is far more representative of a method call than a standard Proc
For Example:
def foo(string)
string
end
bar = ->(string) { string } #lambda
baz = Proc.new {|string| string }
foo
#=> ArgumentError: wrong number of arguments (given 0, expected 1)
bar.()
#=> ArgumentError: wrong number of arguments (given 0, expected 1)
baz.()
#=> nil
Since you are converting a method to a proc object, I am not sure why you would also want the behavior to change, as this could cause ambiguity and confusion. Please note that for this reason you cannot go in the other direction either, e.g. lambda(&baz) does not result in a lambda.
Now that we have explained all of this and why it shouldn't really be done, it is time to remember that nothing is impossible in Ruby, so this would technically work:
def foo(string)
# place assignment in the guard clause
# because the empty return will result in `nil` a falsey value
return unless processed = method(:breaker).to_proc.call(string)
puts "This step should not be executed in the event of an error"
processed
end
def breaker(string)
begin
string.upcase!
rescue
puts "Well you messed that up, didn't you?"
return
end
string
end
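The lambda semantics described above can be checked with a small sketch (the method names early_exit and outer are hypothetical). Because Method#to_proc yields a lambda, a return inside it only leaves the lambda, and the calling method carries on:

```ruby
def early_exit(_value)
  return :stopped          # returns from early_exit only
end

def outer
  callable = method(:early_exit).to_proc
  result = callable.call(1)
  # Execution continues here because the return above ended only the lambda.
  [callable.lambda?, result, :outer_continued]
end

outer_result = outer
```

This is why the guard clause has to live in the caller, as in the return unless version above.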

Don't change string value on insert

I have a Model user with the following method:
def number_with_hyphen
number&.insert(8, "-")
end
When I run it several times in my tests I get the following output:
users(:default).number_with_hyphen
"340909-1234"
(byebug) users(:default).number_with_hyphen
"340909--1234"
(byebug) users(:default).number_with_hyphen
"340909---1234"
(byebug) users(:default).number_with_hyphen
"340909----1234"
It changes the number? Here are the docs: https://apidock.com/ruby/v1_9_3_392/String/insert
When I restructure my method to:
def number_with_hyphen
"#{number}".insert(8, "-") if number
end
It works as expected. The output stays the same!
How would you structure the code, and how would you perform the insert?
Which method should I use instead? Thanks
If you're using the insert method, which in the documentation explicitly states "modifies str", then you will need to avoid doing this twice, rendering it idempotent, or use another method that doesn't mangle data.
One way is a simple regular expression to extract the components you're interested in, ignoring any dash already present:
def number_with_hyphen
if (m = number.match(/\A(\d{8})\-?(\d+)\z/))
[ m[1], m[2] ].join('-')
else
number
end
end
That ends up being really safe. If modified to accept an argument, you can test this:
number = '123456781234'
number_with_hyphen(number)
# => "12345678-1234"
number
# => "123456781234"
number_with_hyphen(number_with_hyphen(number))
# => "12345678-1234"
number_with_hyphen('1234')
# => "1234"
Calling it twice doesn't mangle anything, and any non-conforming data is sent through as-is.
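If the only goal is to keep insert from touching the stored value, working on a copy is another option. A minimal sketch, assuming a hypothetical standalone helper with_hyphen rather than the model method:

```ruby
# String#insert mutates its receiver, so work on a copy made with dup.
def with_hyphen(number, index = 8)
  return nil if number.nil?
  return number if number.length < index   # too short to split; pass through
  number.dup.insert(index, "-")
end

original = "123456781234"
first  = with_hyphen(original)
second = with_hyphen(original)   # repeated calls give the same result
```

Unlike the regular expression approach, this does not detect a hyphen that is already present; it only guarantees that original is left alone.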
Do a clone of the string:
"#{number}".clone.insert(8, '-')

What does the & do in front of an argument in Ruby?

I'm doing some Ruby Koan exercises. Since I'm quite a newbie, some of the code doesn't make sense to me. For example, the & in front of an argument:
def method_with_explicit_block(&block)
block.call(10)
end
def test_methods_can_take_an_explicit_block_argument
assert_equal 20, method_with_explicit_block { |n| n * 2 }
add_one = lambda { |n| n + 1 }
assert_equal 11, method_with_explicit_block(&add_one)
end
Why is there a & before block and add_one? Is it to make them global variables, or to refer them to the previous variable?
Thank you!
In front of a parameter in method definition, the unary prefix ampersand & sigil means: package the block passed to this method as a proper Proc object.
In front of an argument in method call, the unary prefix ampersand & operator means: convert the object passed as an argument to a Proc by sending it the message to_proc (unless it already is a Proc) and "unroll" it into a block, i.e. treat the Proc as if it had been passed directly as a block instead.
Example in case of procs
multiples_of_3 = Proc.new do |n|
n%3 == 0
end
(1..100).to_a.select(&multiples_of_3)
The "&" here is used to convert proc into block.
Another Example
It’s how you can pass a reference to the block (instead of a local variable) to a method. Ruby allows you to pass any object to a method as if it were a block. The method will try to use the passed in object if it’s already a block but if it’s not a block it will call to_proc on it in an attempt to convert it to a block.
Also note that the block part (without the ampersand) is just a name for the reference, you can use whatever name you like if it makes more sense to you.
def my_method(&block)
puts block
block.call
end
my_method { puts "Hello!" }
#<Proc:0x0000010124e5a8@tmp/example.rb:6>
Hello!
As you can see in the example above, the block variable inside my_method is a reference to the block, and it can be executed with the call method. Calling call on the block has the same effect as using yield; some people prefer block.call over yield for readability.
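Both directions of & can be seen side by side in a short sketch (apply_to_ten and Doubler are names invented for this illustration):

```ruby
# &block in a parameter list captures the passed block as a Proc.
def apply_to_ten(&block)
  block.call(10)
end

# Any object that responds to to_proc can be passed with & at the call site.
class Doubler
  def to_proc
    ->(n) { n * 2 }
  end
end

from_block  = apply_to_ten { |n| n + 1 }
from_object = apply_to_ten(&Doubler.new)
from_symbol = %w[a b].map(&:upcase)   # Symbol#to_proc turns :upcase into a block
```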

Ruby syntax: break out from 'each.. do..' block

I am developing a Ruby on Rails app. My question is more about Ruby syntax.
I have a model class with a class method self.check:
class Cars < ActiveRecord::Base
...
def self.check(name)
self.all.each do |car|
#if result is true, break out from the each block, and return the car how to...
result = SOME_CONDITION_MEET?(car) #not related with database
end
puts "outside the each block."
end
end
I would like to break out of the each block as soon as the result is true (that is, as soon as car.name matches the name parameter) AND return the car that caused the true result. How do I break out in Ruby code?
You can break with the break keyword. For example
[1,2,3].each do |i|
puts i
break
end
will output 1. Or if you want to directly return the value, use return.
Since you updated the question, here is the code:
class Car < ActiveRecord::Base
# …
def self.check(name)
self.all.each do |car|
return car if some_condition_met?(car)
end
puts "outside the each block."
end
end
Though you can also use Array#detect or Array#any? for that purpose.
I provided bad sample code. I am not directly finding or checking something
in the database. I just need a way to break out of the "each" block once
some condition is met and return the 'car' that caused the true
result.
Then what you need is:
def check(cars, car_name)
cars.detect { |car| car.name == car_name }
end
If you wanted just to know if there was any car with that name then you'd use Enumerable#any?. As a rule of thumb, use Enumerable#each only to do side effects, not perform logic.
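A small sketch of the options mentioned so far, using a stand-in Car struct rather than an ActiveRecord model (the names are assumptions made for the example):

```ruby
Car = Struct.new(:name)
cars = [Car.new("audi"), Car.new("bmw"), Car.new("fiat")]

# break with a value makes each return that value instead of the receiver.
via_break = cars.each { |car| break car if car.name == "bmw" }

# detect returns the first matching element, or nil when nothing matches.
via_detect = cars.detect { |car| car.name == "bmw" }
missing    = cars.detect { |car| car.name == "volvo" }

# any? only answers whether a match exists.
has_fiat = cars.any? { |car| car.name == "fiat" }
```

One thing to watch with the break form: when nothing matches, each returns the whole collection, whereas detect returns nil.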
you can use include? method.
def self.check(name)
cars.include? name
end
include? returns true if name is present in the cars array else it returns false.
You can use break, but what you are trying to do could be done much more easily, like this:
def self.check(name)
return false if self.find_by_name(name).nil?
return true
end
This uses the database. You are trying to use Ruby at a place the database can deal with it better.
You can also break conditionally:
break if (car.name == name)
I had to do this exact same thing and I was drawing a blank. So despite this being a very old question, here's my answer:
Note: This answer assumes you don't want to return the item as it exists within the array, but instead do some processing on the item and return the result of that. That's how I originally read the question; I realise now that was incorrect, though this approach can be easily modified for that effect (break item instead of break output).
Since returning from blocks is dodgy (nobody likes it, and I think the rules are about to change which makes it even more fraught) this is a much nicer option:
collection.inject(nil) do |_acc, item|
output = expensive_operation(item)
break output if output
end
Note that there are lots of variants; for example, if you don't want an incidental variable, and don't mind starting a second loop in some circumstances, you can invert it like this:
collection.inject(nil) do |acc, item|
break acc if acc
expensive_operation(item)
end
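A lazy enumerator is another way to get the same short-circuiting without break. A minimal sketch, where expensive_operation is a stand-in defined only for illustration and calls records which items were actually processed:

```ruby
calls = []
# Hypothetical stand-in: returns a truthy result only for items above 1.
expensive_operation = lambda do |item|
  calls << item
  item > 1 ? item * 10 : nil
end

collection = [1, 2, 3]
# lazy defers the map, so find stops pulling items at the first truthy result.
output = collection.lazy.map { |item| expensive_operation.call(item) }.find(&:itself)
```

After this runs, calls contains only the items up to and including the first hit, so the third item is never processed.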

In Ruby, how to write a method to display any object's instance variable names and its values

Given any object in Ruby (on Rails), how can I write a method so that it will display that object's instance variable names and its values, like this:
@x: 1
@y: 2
@link_to_point: #<Point:0x10031b298 @y=20, @x=38>
(Update: inspect will do except for large object it is difficult to break down the variables from the 200 lines of output, like in Rails, when you request.inspect or self.inspect in the ActionView object)
I also want to be able to print <br> to the end of each instance variable's value so as to print them out nicely on a webpage.
The difficulty now seems to be that not every instance variable has an accessor, so it can't be called with obj.send(var_name)
(the var_name has the "@" removed, so "@x" becomes "x")
Update: I suppose using recursion, it can print out a more advanced version:
#<Point:0x10031b462>
@x: 1
@y: 2
@link_to_point: #<Point:0x10031b298>
@x=38
@y=20
I would probably write it like this:
class Object
def all_variables(root=true)
vars = {}
self.instance_variables.each do |var|
ivar = self.instance_variable_get(var)
vars[var] = [ivar, ivar.all_variables(false)]
end
root ? [self, vars] : vars
end
end
def string_variables(vars, lb="\n", indent="\t", current_indent="")
out = "#{vars[0].inspect}#{lb}"
current_indent += indent
out += vars[1].map do |var, ivar|
ivstr = string_variables(ivar, lb, indent, current_indent)
"#{current_indent}#{var}: #{ivstr}"
end.join
return out
end
def inspect_variables(obj, lb="\n", indent="\t", current_indent="")
string_variables(obj.all_variables, lb, indent, current_indent)
end
The Object#all_variables method produces an array containing (0) the given object and (1) a hash mapping instance variable names to arrays containing (0) the instance variable and (1) a hash mapping…. Thus, it gives you a nice recursive structure. The string_variables function prints out that hash nicely; inspect_variables is just a convenience wrapper. Thus, print inspect_variables(foo) gives you a newline-separated option, and print inspect_variables(foo, "<br />\n") gives you the version with HTML line breaks. If you want to specify the indent, you can do that too: print inspect_variables(foo, "\n", "|---") produces a (useless) faux-tree format instead of tab-based indenting.
There ought to be a sensible way to write an each_variable function to which you provide a callback (which wouldn't have to allocate the intermediate storage); I'll edit this answer to include it if I think of something. Edit 1: I thought of something.
Here's another way to write it, which I think is slightly nicer:
class Object
def each_variable(name=nil, depth=0, parent=nil, &block)
yield name, self, depth, parent
self.instance_variables.each do |var|
self.instance_variable_get(var).each_variable(var, depth+1, self, &block)
end
end
end
def inspect_variables(obj, nl="\n", indent="\t", sep=': ')
out = ''
obj.each_variable do |name, var, depth, _parent|
out += [indent*depth, name, name ? sep : '', var.inspect, nl].join
end
return out
end
The Object#each_variable method takes a number of optional arguments, which are not designed to be specified by the user; instead, they are used by the recursion to maintain state. The given block is passed (a) the name of the instance variable, or nil if the variable is the root of the recursion; (b) the variable; (c) the depth to which the recursion has descended; and (d), the parent of the current variable, or nil if said variable is the root of the recursion. The recursion is depth-first. The inspect_variables function uses this to build up a string. The obj argument is the object to iterate through; nl is the line separator; indent is the indentation to be applied at each level; and sep separates the name and the value.
Edit 2: This doesn't really add anything to the answer to your question, but just to prove that we haven't lost anything in the reimplementation, here's a reimplementation of all_variables in terms of each_variable.
def all_variables(obj)
cur_depth = 0
root = [obj, {}]
tree = root
parents = []
prev = root
obj.each_variable do |name, var, depth, _parent|
next unless name
case depth <=> cur_depth
when -1 # We've gone back up
tree = parents.pop(cur_depth - depth)[0]
when +1 # We've gone down
parents << tree
tree = prev
else # We're at the same level
# Do nothing
end
cur_depth = depth
prev = tree[1][name] = [var, {}]
end
return root
end
I feel like it ought to be shorter, but that may not be possible; because we don't have the recursion now, we have to maintain the stack explicitly (in parents). But it is possible, so the each_variable method works just as well (and I think it's a little nicer).
I see... Antal must be giving the advanced version here...
the short version then probably is:
def p_each(obj)
obj.instance_variables.each do |v|
puts "#{v}: #{obj.instance_variable_get(v)}\n"
end
nil
end
or to return it as a string:
def sp_each(obj)
s = ""
obj.instance_variables.each do |v|
s += "#{v}: #{obj.instance_variable_get(v)}\n"
end
s
end
or shorter:
def sp_each(obj)
obj.instance_variables.map {|v| "#{v}: #{obj.instance_variable_get(v)}\n"}.join
end
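For the web page case from the question, the line ending can be made a parameter. A minimal sketch, assuming a made-up Point class with no accessors and a hypothetical helper named format_ivars:

```ruby
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

# Reads each instance variable directly, so no accessor methods are needed.
def format_ivars(obj, line_end = "\n")
  obj.instance_variables.map do |name|
    "#{name}: #{obj.instance_variable_get(name).inspect}#{line_end}"
  end.join
end

plain = format_ivars(Point.new(1, 2))
html  = format_ivars(Point.new(1, 2), "<br />\n")
```

Using inspect on each value keeps strings quoted and nil visible, which plain interpolation would hide.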
This is a quick adaptation of a simple JSON emitter I wrote for another question:
class Object
def inspect!(indent=0)
return inspect if instance_variables.empty?
"#<#{self.class}:0x#{object_id.to_s(16)}\n#{' ' * indent+=1}#{
instance_variables.map {|var|
"#{var}: #{instance_variable_get(var).inspect!(indent)}"
}.join("\n#{' ' * indent}")
}\n#{' ' * indent-=1}>"
end
end
class Array
def inspect!(indent=0)
return '[]' if empty?
"[\n#{' ' * indent+=1}#{
map {|el| el.inspect!(indent) }.join(",\n#{' ' * indent}")
}\n#{' ' * indent-=1}]"
end
end
class Hash
def inspect!(indent=0)
return '{}' if empty?
"{\n#{' ' * indent+=1}#{
map {|k, v|
"#{k.inspect!(indent)} => #{v.inspect!(indent)}"
}.join(",\n#{' ' * indent}")
}\n#{' ' * indent-=1}}"
end
end
That's all the magic, really. Now we only need some simple defaults for some types where a full-on inspect doesn't really make sense (nil, false, true, numbers, etc.):
module InspectBang
def inspect!(indent=0)
inspect
end
end
[Numeric, Symbol, NilClass, TrueClass, FalseClass, String].each do |klass|
klass.send :include, InspectBang
end
Like this?
# Get the instance variables of an object
require 'date'
d = Date.new
# instance_variables returns symbols, so interpolate instead of using +
d.instance_variables.each { |i| puts "#{i}<br />" }
Ruby Documentation on instance_variables.
The concept is commonly called "introspection" (to look into oneself).
