I want to write a function that allows users to match data based on a regexp, but I am concerned about sanitization of the user strings. I know that with SQL queries you can use bind variables to avoid SQL injection attacks, but I am not sure if there's such a mechanism for regexps. I see that there's Regexp.escape, but I want to allow valid regexps.
Here is the sample function:
def tagged?(text)
tags.each do |tag|
return true if text =~ /#{tag.name}/i
end
return false
end
Since I am just matching directly on tag.name is there a chance that someone could insert a Proc call or something to break out of the regexp and cause havoc?
Any advice on best practice would be appreciated.
Interpolated strings in a Regexp are not executed, but do generate annoying warnings:
/#{exit -3}/.match('test')
# => exits
foo = '#{exit -3}'
/#{foo}/.match('test')
# => warning: regexp has invalid interval
# => warning: regexp has `}' without escape
The two warnings seem to pertain to the opening #{ and the closing } respectively, and are independent.
As a strategy that's more efficient, you might want to sanitize the list of tags into a combined regexp you can run once. It is generally far less efficient to construct and test against N regular expressions than 1 with N parts.
Perhaps something along the lines of this:
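class Taggable
  def tags
    @tags
  end
  def tags=(value)
    @tags = value
    # Build one anchored, case-insensitive pattern out of all the tags.
    @tag_regexp = Regexp.new(
      [
        '^(?:',
        @tags.collect do |tag|
          # Escape a literal "#{" and an unescaped "}" to avoid regexp warnings.
          '(?:' + tag.sub(/\#\{/, '\\#\\{').sub(/([^\\])\}/, '\1\\}') + ')'
        end.join('|'),
        ')$'
      ].join, # Array#to_s no longer concatenates in Ruby 1.9+, so use join
      Regexp::IGNORECASE
    )
  end
  def tagged?(text)
    !!text.match(@tag_regexp)
  end
end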
This can be used like this:
e = Taggable.new
e.tags = %w[ #{exit-3} .*\.gif .*\.png .*\.jpe?g ]
puts e.tagged?('foo.gif').inspect
If the exit call was executed, the program would halt there, but it just interprets that as a literal string. To avoid warnings it is escaped with backslashes.
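As a rough sketch of the "one combined pattern" idea, Ruby's built-in Regexp.union can join already-compiled patterns into a single alternation. The helper name combined_tag_regexp and the sample tags are made up for illustration, and unlike the class above this version does not anchor the pattern with ^ and $:

```ruby
# Hypothetical helper: compile each tag once, then join them into one pattern.
def combined_tag_regexp(sources)
  compiled = sources.map { |source| Regexp.new(source, Regexp::IGNORECASE) }
  Regexp.union(compiled) # each part keeps its own case-insensitive flag
end

pattern = combined_tag_regexp(['.*\.gif', '.*\.jpe?g'])
puts pattern.match?('PHOTO.JPG')
puts pattern.match?('notes.txt')
```

Building the pattern once and reusing it avoids recompiling N regexps on every call.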
You should probably create an instance of the Regexp class instead.
def tagged?(text)
return tags.any? { |tag| text =~ Regexp.new(tag.name, Regexp::IGNORECASE) }
end
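One thing Regexp.new will not do for you is accept an invalid pattern: it raises RegexpError. A minimal sketch of one way to cope, assuming you are happy to fall back to a literal match when the user's pattern does not compile (the helper name compile_tag_pattern and the two-argument tagged? are hypothetical, not part of the question's model):

```ruby
# Hypothetical helper: compile a user-supplied pattern, falling back to a
# literal (escaped) match if the pattern is not a valid regexp.
def compile_tag_pattern(source)
  Regexp.new(source, Regexp::IGNORECASE)
rescue RegexpError
  Regexp.new(Regexp.escape(source), Regexp::IGNORECASE)
end

# Assumes tag names are passed in as plain strings.
def tagged?(text, tag_names)
  tag_names.any? { |name| compile_tag_pattern(name).match?(text) }
end

puts tagged?('photo.GIF', ['.*\.gif'])
puts tagged?('call a(b now', ['a(b']) # "a(b" is invalid as a regexp
```

Whether to fall back, skip the tag, or reject it at save time is a design choice; rejecting invalid patterns when the tag is created is probably the cleanest.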
I have a Model user with the following method:
def number_with_hyphen
number&.insert(8, "-")
end
When I run it several times in my tests I get the following output:
users(:default).number_with_hyphen
"340909-1234"
(byebug) users(:default).number_with_hyphen
"340909--1234"
(byebug) users(:default).number_with_hyphen
"340909---1234"
(byebug) users(:default).number_with_hyphen
"340909----1234"
It changes the number? Here are the docs: https://apidock.com/ruby/v1_9_3_392/String/insert
When I restructure my method to:
def number_with_hyphen
"#{number}".insert(8, "-") if number
end
It works as expected. The output stays the same!
How would you structure the code, and how would you perform the insert?
Which method should I use instead? Thanks.
The documentation for insert explicitly states that it "modifies str", so you need to either avoid calling it twice, make the method idempotent, or use another method that doesn't mangle the data.
One way is a simple regular expression to extract the components you're interested in, ignoring any dash already present:
def number_with_hyphen
if (m = number.match(/\A(\d{8})\-?(\d+)\z/))
[ m[1], m[2] ].join('-')
else
number
end
end
That ends up being really safe. If modified to accept an argument, you can test this:
number = '123456781234'
number_with_hyphen(number)
# => "12345678-1234"
number
# => "123456781234"
number_with_hyphen(number_with_hyphen(number))
# => "12345678-1234"
number_with_hyphen('1234')
# => "1234"
Calling it twice doesn't mangle anything, and any non-conforming data is sent through as-is.
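To see why the original method kept adding hyphens, it may help to compare a mutating call with one that works on a copy. This is only a sketch; the method names and the sample number are made up for illustration:

```ruby
# String#insert changes the receiver, so the stored value keeps growing.
def hyphenate_in_place(number)
  number.insert(8, '-')
end

# Working on a duplicate leaves the stored value untouched.
def hyphenate_copy(number)
  number.dup.insert(8, '-')
end

stored = String.new('123456781234')
puts hyphenate_copy(stored) # the copy gets the hyphen
puts stored                 # the original is unchanged
hyphenate_in_place(stored)
puts stored                 # now the original itself has been modified
```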
Duplicate the string before inserting, so the original is left alone:
number.dup.insert(8, '-')
I want to perform an action if a string is contained, non-case-sensitively, in another string.
So my if statement would look something like this:
@a = "search"
if @a ILIKE "fullSearch"
#do stuff
end
You can use the include? method. So in this case:
var = 'Search'
if var.include? 'ear'
#action
end
Remember that include? is case-sensitive, so include? 'sea' would return false here. You may want to downcase the string before calling include?:
var = 'Search'
if var.downcase.include? 'sea'
#action
end
Hope that helped.
There are many ways to get there. Here are three:
'Foo'.downcase.include?('f') # => true
'Foo'.downcase['f'] # => "f"
Those are documented in the String documentation which you need to become very familiar with if you're going to program in Ruby.
'Foo'[/f/i] # => "F"
This is a mix of String's [] slice shortcut and regular expressions. I'd recommend one of the first two because they're faster, but for thoroughness I added it because people like hitting things with the regex hammer. Regexp contains documentation for /f/i.
You'll notice that they return different things. Ruby considers anything other than false or nil as true, AKA "truthiness", so all three are returning a true value, and, as a result you could use them in conditional tests.
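If you need a strict true/false rather than a merely truthy value, a small wrapper is one option. The method name include_ignoring_case? is hypothetical; it simply downcases both sides before calling include?:

```ruby
# Hypothetical helper: case-insensitive substring test returning true/false.
def include_ignoring_case?(haystack, needle)
  haystack.downcase.include?(needle.downcase)
end

puts include_ignoring_case?('fullSearch', 'SEARCH')
puts include_ignoring_case?('fullSearch', 'xyz')
```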
You can use a regexp with the i option, which makes the match case-insensitive.
a = "fullSearch"
a =~ /search/i
=> 4
a =~ /search/
=> nil
Or you could downcase your string and check if it's present in the other
a = "fullSearch"
a.downcase.include?('search')
=> true
What I am doing is iterating over various versions of a snippet of code (e.g. Associations.rb in Rails).
What I want to do is just extract one snippet of the code, for example the has_many method:
def has_many(name, scope = nil, options = {}, &extension)
reflection = Builder::HasMany.build(self, name, scope, options, &extension)
Reflection.add_reflection self, name, reflection
end
At first I was thinking of just searching the entire file for the string def has_many and then saving everything between that string and end. The obvious issue is that different versions of this file can have multiple end keywords within the method.
For instance, whatever I come up with for the above snippet, should also work for this one too:
def has_many(association_id, options = {})
validate_options([ :foreign_key, :class_name, :exclusively_dependent, :dependent, :conditions, :order, :finder_sql ], options.keys)
association_name, association_class_name, association_class_primary_key_name =
associate_identification(association_id, options[:class_name], options[:foreign_key])
require_association_class(association_class_name)
if options[:dependent] and options[:exclusively_dependent]
raise ArgumentError, ':dependent and :exclusively_dependent are mutually exclusive options. You may specify one or the other.' # ' ruby-mode
elsif options[:dependent]
module_eval "before_destroy '#{association_name}.each { |o| o.destroy }'"
elsif options[:exclusively_dependent]
module_eval "before_destroy { |record| #{association_class_name}.delete_all(%(#{association_class_primary_key_name} = '\#{record.id}')) }"
end
define_method(association_name) do |*params|
force_reload = params.first unless params.empty?
association = instance_variable_get("@#{association_name}")
if association.nil?
association = HasManyAssociation.new(self,
association_name, association_class_name,
association_class_primary_key_name, options)
instance_variable_set("@#{association_name}", association)
end
association.reload if force_reload
association
end
# deprecated api
deprecated_collection_count_method(association_name)
deprecated_add_association_relation(association_name)
deprecated_remove_association_relation(association_name)
deprecated_has_collection_method(association_name)
deprecated_find_in_collection_method(association_name)
deprecated_find_all_in_collection_method(association_name)
deprecated_create_method(association_name)
deprecated_build_method(association_name)
end
Assuming that each value is stored as text in some column in my db.
How do I approach this, using Ruby's string methods or should I be approaching this another way?
Edit 1
Please note that this question relates specifically to string manipulation via using a Regex, without a parser.
As discussed, this should be done with a parser like Ripper.
However, to answer if it can be done with string methods, I will match the syntax with a regex, provided:
You can rely on indentation i.e. the string has the exact same characters before "def" and before "end".
There are no multiline strings in between that could simulate an "end" with the same indentation. That includes HEREDOCs, %{ } literals, etc.
Code
regex = /^
(\s*) # matches the indentation (we'll backreference later)
def\ +has_many\b # literal "def has_many" with a word boundary
(?:.*+\n)*? # match whole lines - as few as possible
\1 # matches the same indentation as the def line
end\b # literal "end"
/x
subject = %q|
  def has_many(name, scope = nil, options = {}, &extension)
    if association.nil?
      instance_variable_set("@#{association_name}", association)
    end
  end|
#Print matched text
puts subject.to_enum(:scan,regex).map {$&}
The regex relies on:
Capturing the whitespace (indentation) with the group (\s*),
followed by the literal def has_many.
It then consumes as few lines as it can with (?:.*+\n)*?.
Notice that .*+\n matches a whole line
and (?:..)*? repeats it 0 or more times. Also, the last ? makes the repetition lazy (as few as possible).
It will consume lines until it matches the following condition...
\1 is a backreference to the text captured by group 1, i.e. the exact same indentation as the first line.
Followed by end obviously.
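The steps above can be wrapped into a small helper. This is a sketch under the same indentation assumptions as the regex; the method name extract_method is made up, and it uses [ \t]* instead of \s* so the captured indentation cannot span blank lines:

```ruby
# Hypothetical helper: return the source of the first "def <name> ... end"
# whose closing "end" has the same indentation as the "def" line, or nil.
def extract_method(source, name)
  pattern = /^([ \t]*)def[ ]+#{Regexp.escape(name)}\b(?:.*\n)*?\1end\b/
  match = source.match(pattern)
  match && match[0]
end

source = "class A\n" \
         "  def has_many(name)\n" \
         "    if name\n" \
         "      name.to_s\n" \
         "    end\n" \
         "  end\n" \
         "\n" \
         "  def other\n" \
         "  end\n" \
         "end\n"

puts extract_method(source, 'has_many')
```

The inner end is skipped because it is indented more deeply than the def line, which is exactly the property the backreference checks.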
I am attempting to write my own solution to a Ruby exercise from Rubymonk where the purpose is to create three methods (add, subtract, and calculate) so when 'calculate' is called you can determine whether or not numbers are added or subtracted based on what is passed in. I am receiving the following error:
main:11: syntax error, unexpected '=', expecting ')' def calculate(*numbers, options={})
Can anyone tell me what the issue is with my code? Thanks for any and all help!
def add(*numbers)
numbers.inject(0) {|sum, number| sum + number}
end
def subtract(*numbers)
numbers.inject{|diff, number| diff - number}
end
def calculate(*numbers, options={})
result = add(numbers) if options.empty?
result = add(numbers) if options[:add]
result = subtract(numbers) if options[:subtract]
result
end
def calculate(*numbers, options={})
is not a valid method definition because *numbers takes the place of a variable number of arguments. You have two options as I see it:
def calculate(options={}, *numbers)
or
def calculate(*args)
numbers, options = args[0..-2], args[-1] || {}
if you want to keep the same argument order
An argument with a default value cannot follow the splat argument *numbers. Otherwise, how would Ruby know whether to treat the last argument as options or as the last number?
You can use (*numbers, options) (without a default value), but that would require that you always pass an options hash to the method (otherwise your last number will be set as the options variable instead).
Try this way:
def calculate(options={},*numbers)
Optional arguments after the splat argument (the * notation) do not work, since that creates an ambiguity.
Read more at:
http://www.skorks.com/2009/08/method-arguments-in-ruby/
You can't use both a splat and a parameter with a default value as the last argument; this is too ambiguous for the parser (how would it know that the last argument passed is meant to be the options?).
You can work around this in many ways; one idiom from Rails (ActiveSupport) is:
def calculate(*args)
options = args.extract_options!
# ...
end
where extract_options! is a monkey-patch to Array from ActiveSupport, defined as follows:
def extract_options!
last.is_a?(::Hash) ? pop : {}
end
As a side note:
An options hash is not really useful here; you could just pass a symbol as the first argument.
If you do use a hash, the logic could be simpler:
def calculate(*args)
options = args.extract_options!
method = options.fetch(:method, :add)
send method, *args
end
In add, you don't need inject(0): inject uses the first element of your array as the first "memo" value if you don't provide one.
You can pass a symbol to inject, which names the method called on your "memo" value, with the "next value" as its argument:
(1..10).inject(:+)
# this is the same as
(1..10).inject { |memo, value| memo + value }
# or, more exactly (next is a reserved word, so it cannot be a block parameter)
(1..10).inject { |memo, value| memo.send :+, value }
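Putting these pieces together without ActiveSupport, one possible version of the exercise looks like the sketch below. It keeps the question's :add and :subtract option keys; popping a trailing Hash by hand is an assumption standing in for extract_options!:

```ruby
def add(*numbers)
  numbers.inject(0, :+)
end

def subtract(*numbers)
  numbers.inject(:-)
end

def calculate(*args)
  # Treat a trailing Hash as the options; everything else is a number.
  options = args.last.is_a?(Hash) ? args.pop : {}
  operation = options[:subtract] ? :subtract : :add
  send(operation, *args) # splat the array back into individual arguments
end

puts calculate(4, 5)
puts calculate(10, 2, 3, subtract: true)
```

Note the splat in send(operation, *args): passing the array itself, as add(numbers) does in the question, would hand the method a single array argument instead of the individual numbers.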
Given any object in Ruby (on Rails), how can I write a method so that it will display that object's instance variable names and its values, like this:
@x: 1
@y: 2
@link_to_point: #<Point:0x10031b298 @y=20, @x=38>
(Update: inspect will do, except that for a large object it is difficult to pick out the variables from the 200 lines of output, as in Rails when you call request.inspect or self.inspect in the ActionView object.)
I also want to be able to print <br> to the end of each instance variable's value so as to print them out nicely on a webpage.
The difficulty now seems to be that not every instance variable has an accessor, so it can't be called with obj.send(var_name)
(the var_name has the "@" removed, so "@x" becomes "x")
Update: I suppose using recursion, it can print out a more advanced version:
#<Point:0x10031b462>
@x: 1
@y: 2
@link_to_point: #<Point:0x10031b298>
    @x=38
    @y=20
I would probably write it like this:
class Object
def all_variables(root=true)
vars = {}
self.instance_variables.each do |var|
ivar = self.instance_variable_get(var)
vars[var] = [ivar, ivar.all_variables(false)]
end
root ? [self, vars] : vars
end
end
def string_variables(vars, lb="\n", indent="\t", current_indent="")
out = "#{vars[0].inspect}#{lb}"
current_indent += indent
out += vars[1].map do |var, ivar|
ivstr = string_variables(ivar, lb, indent, current_indent)
"#{current_indent}#{var}: #{ivstr}"
end.join
return out
end
def inspect_variables(obj, lb="\n", indent="\t", current_indent="")
string_variables(obj.all_variables, lb, indent, current_indent)
end
The Object#all_variables method produces an array containing (0) the given object and (1) a hash mapping instance variable names to arrays containing (0) the instance variable and (1) a hash mapping…. Thus, it gives you a nice recursive structure. The string_variables function prints out that hash nicely; inspect_variables is just a convenience wrapper. Thus, print inspect_variables(foo) gives you a newline-separated option, and print inspect_variables(foo, "<br />\n") gives you the version with HTML line breaks. If you want to specify the indent, you can do that too: print inspect_variables(foo, "\n", "|---") produces a (useless) faux-tree format instead of tab-based indenting.
There ought to be a sensible way to write an each_variable function to which you provide a callback (which wouldn't have to allocate the intermediate storage); I'll edit this answer to include it if I think of something. Edit 1: I thought of something.
Here's another way to write it, which I think is slightly nicer:
class Object
def each_variable(name=nil, depth=0, parent=nil, &block)
yield name, self, depth, parent
self.instance_variables.each do |var|
self.instance_variable_get(var).each_variable(var, depth+1, self, &block)
end
end
end
def inspect_variables(obj, nl="\n", indent="\t", sep=': ')
out = ''
obj.each_variable do |name, var, depth, _parent|
out += [indent*depth, name, name ? sep : '', var.inspect, nl].join
end
return out
end
The Object#each_variable method takes a number of optional arguments, which are not designed to be specified by the user; instead, they are used by the recursion to maintain state. The given block is passed (a) the name of the instance variable, or nil if the variable is the root of the recursion; (b) the variable; (c) the depth to which the recursion has descended; and (d), the parent of the current variable, or nil if said variable is the root of the recursion. The recursion is depth-first. The inspect_variables function uses this to build up a string. The obj argument is the object to iterate through; nl is the line separator; indent is the indentation to be applied at each level; and sep separates the name and the value.
Edit 2: This doesn't really add anything to the answer to your question, but: just to prove that we haven't lost anything in the reimplementation, here's a reimplementation of all_variables in terms of each_variable.
def all_variables(obj)
cur_depth = 0
root = [obj, {}]
tree = root
parents = []
prev = root
obj.each_variable do |name, var, depth, _parent|
next unless name
case depth <=> cur_depth
when -1 # We've gone back up
tree = parents.pop(cur_depth - depth)[0]
when +1 # We've gone down
parents << tree
tree = prev
else # We're at the same level
# Do nothing
end
cur_depth = depth
prev = tree[1][name] = [var, {}]
end
return root
end
I feel like it ought to be shorter, but that may not be possible; because we don't have the recursion now, we have to maintain the stack explicitly (in parents). But it is possible, so the each_variable method works just as well (and I think it's a little nicer).
I see... Antal must be giving the advanced version here...
the short version then probably is:
def p_each(obj)
obj.instance_variables.each do |v|
puts "#{v}: #{obj.instance_variable_get(v)}\n"
end
nil
end
or to return it as a string:
def sp_each(obj)
s = ""
obj.instance_variables.each do |v|
s += "#{v}: #{obj.instance_variable_get(v)}\n"
end
s
end
or shorter:
def sp_each(obj)
obj.instance_variables.map {|v| "#{v}: #{obj.instance_variable_get(v)}\n"}.join
end
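For the web page case from the question, a variation of the short version might look like the sketch below. The Point class and the method name variables_as_html are made up for illustration; ERB::Util.html_escape from the standard library is used so that values such as #<Point ...> do not get swallowed as HTML tags:

```ruby
require 'erb'

# Hypothetical sample class, standing in for any object.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

# One "name: value<br>" line per instance variable, with the value escaped.
def variables_as_html(obj)
  obj.instance_variables.map { |name|
    value = obj.instance_variable_get(name).inspect
    "#{name}: #{ERB::Util.html_escape(value)}<br>"
  }.join("\n")
end

puts variables_as_html(Point.new(1, 2))
```

Because instance_variable_get reads the variable directly, this works whether or not the object defines accessors.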
This is a quick adaptation of a simple JSON emitter I wrote for another question:
class Object
def inspect!(indent=0)
return inspect if instance_variables.empty?
"#<#{self.class}:0x#{object_id.to_s(16)}\n#{' ' * indent+=1}#{
instance_variables.map {|var|
"#{var}: #{instance_variable_get(var).inspect!(indent)}"
}.join("\n#{' ' * indent}")
}\n#{' ' * indent-=1}>"
end
end
class Array
def inspect!(indent=0)
return '[]' if empty?
"[\n#{' ' * indent+=1}#{
map {|el| el.inspect!(indent) }.join(",\n#{' ' * indent}")
}\n#{' ' * indent-=1}]"
end
end
class Hash
def inspect!(indent=0)
return '{}' if empty?
"{\n#{' ' * indent+=1}#{
map {|k, v|
"#{k.inspect!(indent)} => #{v.inspect!(indent)}"
}.join(",\n#{' ' * indent}")
}\n#{' ' * indent-=1}}"
end
end
That's all the magic, really. Now we only need some simple defaults for some types where a full-on inspect doesn't really make sense (nil, false, true, numbers, etc.):
module InspectBang
def inspect!(indent=0)
inspect
end
end
[Numeric, Symbol, NilClass, TrueClass, FalseClass, String].each do |klass|
klass.send :include, InspectBang
end
Like this?
# Get the instance variables of an object
require 'date'
d = Date.new
# instance_variables returns symbols, so interpolate rather than use +
d.instance_variables.each { |i| puts "#{i}<br />" }
Ruby Documentation on instance_variables.
The concept is commonly called "introspection" (to look into oneself).