Regexp union in ruby escapes my original regex

I have several regexes that I want to combine into one big regex with Regexp.union. Here is one of them as an example:
^image\d*$
So I tried this:
regex = %w(^image\d*$)
=> ["^image\\d*$"]
re = Regexp.union(regex)
=> /\^image\\d\*\$/
It escapes my regex to /\^image\\d\*\$/, so even the basic case doesn't match:
"image0".match(re)
=> nil
How can I get around this?

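Pass Regexp objects, not strings. %w(...) creates an array of strings, and Regexp.union escapes string arguments so that they match literally. Use %r(...) or /.../ to write a regular expression literal.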
regex = %r(^image\d*$)
# => /^image\d*$/
Regexp.union(regex)
# => /^image\d*$/
array_of_regexs = [/a/, /b/, /c/]
# => [/a/, /b/, /c/]
Regexp.union(array_of_regexs)
# => /(?-mix:a)|(?-mix:b)|(?-mix:c)/
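If the patterns only exist as strings, one option is to compile each string with Regexp.new before passing it to Regexp.union. A minimal sketch, assuming the strings contain trusted regex source; the second pattern and the variable names are made up for illustration:

```ruby
# Hypothetical pattern strings, e.g. loaded from a config file.
sources = ['^image\d*$', '^video\d*$']

# Regexp.union escapes String arguments, so compile each one first.
compiled = sources.map { |src| Regexp.new(src) }
combined = Regexp.union(compiled)

puts combined.match?("image0")   # true
puts combined.match?("video42")  # true
puts combined.match?("audio1")   # false

# For comparison: passing the raw strings matches them literally.
literal = Regexp.union(sources)
puts literal.match?("image0")       # false
puts literal.match?('^image\d*$')   # true
```

Note the single quotes around the pattern strings: they keep the backslash in \d intact.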

Related

Match dynamic data with match method

I am trying to match a dynamic route with the match method, but it is not working.
["payment", "portal", "animation"].each do |arg|
  define_method("profile_#{arg}") do
    self.path.match("/users/\d+/#{arg}")
  end
end
So self.path.match("/users/\d+/#{arg}") doesn't work with interpolation, whereas the code below works. Is there any way to match the data dynamically?
self.path.match('/users/\d+/payment')
self.path.match('/users/\d+/portal')
self.path.match('/users/\d+/animation')

The \d+ expression doesn't survive inside double quotes: in a double-quoted string, \d is an unknown escape, so Ruby drops the backslash and the pattern becomes a literal d+. Single quotes and regexp literals keep the backslash. Here is a sample path for "portal" and two ways to match it (tested with Ruby 2.5.5).
path = "somedir/users/#{rand(0..999)}/portal"
["payment", "portal", "animation"].each do |arg|
  define_method("profile_#{arg}") do
    path.match("/users/"+(/\d+/).to_s+"/#{arg}")
    # or
    path.match('users/\d+/'+arg)
  end
end
puts profile_payment
puts profile_portal
puts profile_animation

You can interpolate within a regexp literal itself.
?> s = "payment"
=> "payment"
>> %r(/users/\d+/#{s})
=> /\/users\/\d+\/payment/
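To make the difference concrete, here is a small sketch that first shows what the double-quoted string really contains, then builds the pattern with a regexp literal. The helper name profile_pattern and the sample path are assumptions for illustration; Regexp.escape is added in case the interpolated value ever contains regex metacharacters.

```ruby
# In double quotes "\d" is an unknown escape, so Ruby keeps only the "d".
puts "/users/\d+/portal" == "/users/d+/portal"   # true

# Single quotes keep the backslash.
puts '/users/\d+/portal'.include?('\d')          # true

# Hypothetical helper: build the route pattern for a given section name.
def profile_pattern(section)
  %r{/users/\d+/#{Regexp.escape(section)}}
end

path = "somedir/users/42/portal"   # sample path, assumed for illustration
puts profile_pattern("portal").match?(path)    # true
puts profile_pattern("payment").match?(path)   # false
```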

Interacting programmatically with the equivalent of puts -ruby

I have a string that I need to save escaped and then interact with programmatically without any backslashes:
string = 'first=#{first_name}&last=#{last_name}'
p string.to_s
=> "first=\#{first_name}&last=\#{last_name}"
puts string.to_s
=> first=#{first_name}&last=#{last_name}
How do I get first=#{first_name}&last=#{last_name} to assign to a variable that I can scan, that does not have the "\" character?

These two are equivalent:
# double quotes
"first=\#{first_name}&last=\#{last_name}"
# single quotes
'first=#{first_name}&last=#{last_name}'
In neither case is the backslash actually part of the string. If you call string.include?('\\') it returns false.
However, if you were to write '\#{}' the backslash would be part of the string. That's because in single quotes #{} does not interpolate, so no escaping is needed and the backslash is kept as a literal character.
Some examples:
foo = 1
'#{foo}' # => "\#{foo}"
"#{foo}" # => "1"
'#{foo}' == "\#{foo}" # => true
"\#{foo}".include? '\\' # => false
'\#{foo}'.include? '\\' # => true
Note that neither "\" nor '\' is a valid string literal in Ruby, because the backslash escapes the closing quote. A single backslash is written "\\" or '\\'.
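As a quick check that the string from the question can be scanned as-is, here is a sketch; the variable names and the scan pattern are illustrative, not from the question:

```ruby
single = 'first=#{first_name}&last=#{last_name}'
double = "first=\#{first_name}&last=\#{last_name}"

puts single == double        # true
puts single.include?('\\')   # false: '\\' is one backslash character

# inspect (what p prints) adds escapes for display only.
puts single.inspect.include?('\\')   # true

# The placeholders can be scanned directly from the string.
names = single.scan(/\#\{(\w+)\}/).flatten
p names   # ["first_name", "last_name"]
```

The backslash only appears in the output of p because p calls inspect; the variable itself never contains it.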

How to find a keyword in a string

Users send in smses which must include a keyword. This keyword is then used to find a business.
The instruction is to use the keyword at the start of the sentence.
I know some users won't use the keyword at the beginning or will add tags (# @ -) or punctuation (keyword.) to the keyword.
What is an efficient way to look for this keyword and for the business?
My attempt:
scrubbed_msg = msg.gsub("\"", "").gsub("\'", "").gsub("#", "").gsub("-", "").gsub(",", "").gsub(".", "").gsub("@", "").split.join(" ")
tag = scrubbed_msg.split[0]
if @business = Business.where(tag: tag).first
  log_message(@business)
else
  scrubbed_msg.split.each do |w|
    if @business = Business.where(tag: w).first
      log_message(@business)
    end
  end
end

Instead of listing which characters you want to remove from the string, I suggest a whitelist approach that specifies which characters you want to keep, for example alphanumeric characters:
sms = "#keyword and the rest"
clean_sms = sms.scan(/[\p{Alnum}]+/)
# => ["keyword", "and", "the", "rest"]
Then, if I understood correctly what you are trying to do, you could find the business like this:
first_existing_tag = clean_sms.find do |tag|
  Business.exists?(tag: tag)
end
@business = Business.where(tag: first_existing_tag).first
log_message(@business)

You can use a Regexp to filter all unnecessary characters out of the String, then call #reduce on the Array you get from splitting the string to find the first record whose tag field matches a keyword. In the example the keywords are keyword, tag1 and tag2:
msg = "key.w,ord tag-1'\n\"tag2"
# => "key.w,ord tag-1'\n\"tag2"
scrubbed = msg.gsub(/[@#'"\-.,]/, "").split
# => ["keyword", "tag1", "tag2"]
@business = scrubbed.reduce(nil) do |found, tag|
  found || Business.where(tag: tag).first
end
# => Record tag: keyword
# => Record tag: tag1 if no record with keyword is found
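The lookup logic can be tried without a database. The sketch below replaces the Business model with a plain hash; BUSINESSES, find_business and the sample data are all hypothetical stand-ins:

```ruby
# Hypothetical stand-in for the Business table: tag => business name.
BUSINESSES = { "pizza" => "Luigi's Pizza", "taxi" => "City Cabs" }.freeze

# Return the first business whose tag appears in the message, or nil.
def find_business(message)
  # Keep only alphanumeric runs, so "#Pizza," becomes "pizza".
  words = message.downcase.scan(/[[:alnum:]]+/)
  tag = words.find { |word| BUSINESSES.key?(word) }
  tag && BUSINESSES[tag]
end

puts find_business("#Pizza, please!")       # Luigi's Pizza
puts find_business("I need a -taxi. now")   # City Cabs
p find_business("hello there")              # nil
```

With ActiveRecord, passing the whole word list in one call such as Business.where(tag: words) would avoid one query per word, though the result would then need to be reordered by word position.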

Extracting sublink in between two characters in Ruby

How would I extract a sub-link between two characters in a string?
For example, I'd like to extract the Video ID in a youtube URL:
http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u
I'd like the text between the "=" and the first "&" sign, which would be "UkzbRkPv4T4".

If you don't want to deal with regular expressions, you could rely on functionality from Ruby's Standard Library for parsing URLs:
require 'cgi'
require 'uri'
url = "http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u"
video_id = CGI.parse(URI.parse(url).query)['v'][0]

You just need a regular expression:
uri = 'http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u'
m = uri.match(/v=(?<id>\w+)&/)
if m
  puts m[:id]
end

Just to expand upon apneadiving's comment.
>> url = "http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u"
=> "http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u"
>> md = url.match(/v=(.*)&/)
=> #<MatchData "v=UkzbRkPv4T4&" 1:"UkzbRkPv4T4">
>> md[1]
=> "UkzbRkPv4T4"

require 'uri'
uri = URI("http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u")
uri.query
# => "v=UkzbRkPv4T4&feature=g-all-u"
URI.decode_www_form(uri.query)
# => [["v", "UkzbRkPv4T4"], ["feature", "g-all-u"]]
URI.decode_www_form(uri.query).map(&:last)
# => ["UkzbRkPv4T4", "g-all-u"]
URI.decode_www_form(uri.query).assoc("v").last
# => "UkzbRkPv4T4"
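Wrapped in a small helper, the URI approach also copes with URLs that have no v parameter or no query string at all. This is a sketch; the helper name video_id is an assumption:

```ruby
require 'uri'

# Hypothetical helper: return the "v" query parameter, or nil if absent.
def video_id(url)
  query = URI.parse(url).query
  return nil if query.nil?

  # decode_www_form returns [key, value] pairs; to_h makes them a Hash.
  URI.decode_www_form(query).to_h["v"]
end

puts video_id("http://www.youtube.com/watch?v=UkzbRkPv4T4&feature=g-all-u")   # UkzbRkPv4T4
p video_id("http://www.youtube.com/watch?feature=g-all-u")                    # nil
p video_id("http://www.youtube.com/")                                         # nil
```

Unlike the regex versions, this does not depend on v being followed by an & or on where it sits in the query string.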

Finding exact words in a string

I have a list of links to clothing websites that I am categorising by gender using keywords. Depending on what website they are for, they all have different URL structures, for example...
www.website1.com/shop/womens/tops/tshirt
www.website2.com/products/womens-tshirt
I cannot use the .include? method because regardless of whether it is .include?("mens") or .include?("womens"), it will return true. How can I have a method that will only return true for "womens" (and vice versa). I suspect it may have to be some sort of regex, but I am relatively inexperienced with these, and the different URL structures make it all the more tricky. Any help is much appreciated, thanks!

The canonical regex way of doing this is to search on word boundaries:
pry(main)> "foo/womens/bar".match(/\bwomens\b/)
=> #<MatchData "womens">
pry(main)> "foo/womens/bar".match(/\bmens\b/)
=> nil
pry(main)> "foo/mens/bar".match(/\bmens\b/)
=> #<MatchData "mens">
pry(main)> "foo/mens/bar".match(/\bwomens\b/)
=> nil
That said, either splitting, or searching with the leading "/", may be adequate.
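Put into a method, the word-boundary check might look like the sketch below. The method name gender_for and the website3 URL are made up for illustration. One caveat: an underscore counts as a word character, so \b does not see a boundary inside mens_shoes.

```ruby
# Hypothetical helper: classify a URL by gender keyword using word boundaries.
def gender_for(url)
  case url
  when /\bwomens\b/ then "F"
  when /\bmens\b/   then "M"
  end
end

puts gender_for("www.website1.com/shop/womens/tops/tshirt")   # F
puts gender_for("www.website2.com/products/womens-tshirt")    # F
puts gender_for("www.website2.com/products/mens-tshirt")      # M

# "_" is a word character, so there is no boundary after "mens" here.
p gender_for("www.website3.com/mens_shoes")                   # nil
```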

If you first check for women it should work:
# assumes str is not nil
def gender(str)
  if str.include?("women")
    "F"
  elsif str.include?("men")
    "M"
  else
    nil
  end
end
If this is not what you are looking for, please explain your problem in more detail.

You could split with / and check for string equality on the component(s) you want -- no need for a regex there
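A sketch of the splitting idea follows. Since the second URL style uses womens-tshirt, this version splits on both / and -, which is an assumption that goes slightly beyond splitting on / alone; the method name is also made up:

```ruby
# Hypothetical helper: split the URL into tokens and compare them exactly.
def gender_from_segments(url)
  tokens = url.split(%r{[/-]})
  return "F" if tokens.include?("womens")
  return "M" if tokens.include?("mens")

  nil
end

puts gender_from_segments("www.website1.com/shop/womens/tops/tshirt")   # F
puts gender_from_segments("www.website2.com/products/womens-tshirt")    # F
puts gender_from_segments("www.website2.com/products/mens-tshirt")      # M
p gender_from_segments("www.website1.com/shop/kids/tops")               # nil
```

Because the comparison is exact string equality, "mens" can never match inside "womens".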

keyword = "women"
url = "www.website1.com/shop/womens/tops/tshirt"
/\/#{keyword}/ =~ url
=> 21
keyword = "men"
url = "www.website1.com/shop/womens/tops/tshirt"
/\/#{keyword}/ =~ url
=> nil
keyword = "women"
url = "www.website2.com/products/womens-tshirt"
/\/#{keyword}/ =~ url
=> 25
keyword = "men"
url = "www.website2.com/products/womens-tshirt"
/\/#{keyword}/ =~ url
=> nil
Then apply !! to convert the result to a boolean:
!!nil # => false
!!25 # => true
