I'm trying to create a regex that matches only index URLs (with or without parameters) in Rails.
The following three match what I expect:
regex = /^http:\/\/localhost:3000\/v2\/manufacturers\/?(\S+)?$/
regex.match?('http://localhost:3000/v2/manufacturers?enabled=true')
#=> true
regex.match?('http://localhost:3000/v2/manufacturers/')
#=> true
regex.match?('http://localhost:3000/v2/manufacturers')
#=> true
I expect the regex not to match these:
regex.match?('http://localhost:3000/v2/manufacturers/1')
#=> true
regex.match?('http://localhost:3000/v2/manufacturers/123')
#=> true
regex.match?('http://localhost:3000/v2/manufacturers/1?enabled=true')
#=> true
Edit:
I'm so sorry but I forgot to mention that it should match:
regex.match?('http://localhost:3000/v2/manufacturers/1/models')
as it is a valid index url
You may use
/\Ahttp:\/\/localhost:3000\/v2\/manufacturers(?:\/?(?:\?\S+)?|\/1\/models\/?)?\z/
See the Rubular demo
Pattern details
\A - start of string
http:\/\/localhost:3000\/v2\/manufacturers - a http://localhost:3000/v2/manufacturers string
(?:\/?(?:\?\S+)?|\/1\/models\/?)? - an optional sequence of:
\/? - an optional / char
(?:\?\S+)? - an optional sequence of ? and 1+ non-whitespace
| - or
\/1\/models\/? - /1/models string and an optional / at the end
\z - end of string.
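As a hedged sketch, the same idea can be generalized so that any numeric id is accepted before /models, rather than only 1. The variable names and the sample id 42 are assumptions for illustration, not part of the answer above, and unlike the pattern above this variant also allows a query string after /models. %r{} is used only to avoid escaping the slashes.

```ruby
# Hypothetical generalization: \d+ replaces the hard-coded id 1.
index_url = %r{\Ahttp://localhost:3000/v2/manufacturers(?:/\d+/models)?/?(?:\?\S+)?\z}

base = 'http://localhost:3000/v2/manufacturers'

# URLs that should count as index URLs
index_urls = [
  base,
  "#{base}/",
  "#{base}?enabled=true",
  "#{base}/1/models",
  "#{base}/42/models?enabled=true"
]

# URLs that point at a single record and should be rejected
member_urls = ["#{base}/1", "#{base}/123", "#{base}/1?enabled=true"]

index_results  = index_urls.map  { |url| index_url.match?(url) }
member_results = member_urls.map { |url| index_url.match?(url) }

puts "index URLs matched:  #{index_results.inspect}"
puts "member URLs matched: #{member_results.inspect}"
```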
You could change the end of your regex from:
\/?(\S+)?$/
to:
\/?(?:\?\S+|\d+\/\S+)?$
That would create an optional noncapturing group (?:\?\S+|\d+\/\S+)?.
Match \?\S+ for your question mark and non-whitespace chars
or |
Match \d+\/\S+ for the added case of 1/models
Demo
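For reference, here is a minimal sketch of the full pattern with that tail swapped in, checked against the URLs from the question. The checks hash is an assumption added for illustration. Note that \d+\/\S+ accepts anything after the id and slash, not only models.

```ruby
# The question's prefix combined with the replacement tail from the answer.
regex = /^http:\/\/localhost:3000\/v2\/manufacturers\/?(?:\?\S+|\d+\/\S+)?$/

# URL => whether it is expected to match
checks = {
  'http://localhost:3000/v2/manufacturers?enabled=true'   => true,
  'http://localhost:3000/v2/manufacturers/'               => true,
  'http://localhost:3000/v2/manufacturers'                => true,
  'http://localhost:3000/v2/manufacturers/1/models'       => true,
  'http://localhost:3000/v2/manufacturers/1'              => false,
  'http://localhost:3000/v2/manufacturers/123'            => false,
  'http://localhost:3000/v2/manufacturers/1?enabled=true' => false
}

checks.each do |url, expected|
  status = regex.match?(url) == expected ? 'ok' : 'MISMATCH'
  puts "#{status}: #{url}"
end
```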
The ? character makes the preceding character optional.
This works for me: http:\/\/localhost:3000\/v2\/manufacturers?\/?
r = /http:\/\/localhost:3000\/v2\/manufacturers?\/?$/
r.match('http://localhost:3000/v2/manufacturers/1?enabled=true')
=> nil
r.match('http://localhost:3000/v2/manufacturers/1')
=> nil
I'm trying to write a regular expression in Ruby where I want to see if the string contains a certain word (e.g. "string"), followed by a url and link name in parenthesis.
Right now I'm doing:
string.include?("string") && string.scan(/\(([^\)]+)\)/).present?
My input in both conditionals is a string. In the first one, I'm checking if it contains the word "link" and then I will have the link and link_name in parenthesis, like this:
"Please go to link( url link_name)"
After validating that, I extract the HTML link.
Is there a way I can combine them using regular expressions?
The most important improvement you can make is to also test that the word and the parentheses have the correct relationship. If I understand correctly, "link(url link_name)" should be a match but "(url link_name)link" or "link stuff (url link_name)" should not. So match "link", the parentheses, and their contents, and capture the contents, all at once:
"stuff link(url link_name) more stuff".match(/link\((\S+?) (\S+?)\)/)&.captures
=> ["url", "link_name"]
(&. requires Ruby 2.3 or later; in older versions, use Rails' .try(:captures).)
Side note: string.scan(regex).present? is more concisely written as string =~ regex.
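A possible way to package this is a small helper with named captures, so the caller does not depend on capture positions. The method name extract_link and the capture names url and name are assumptions for illustration.

```ruby
# Returns a hash with the url and link name, or nil when the text
# does not contain the "link(url name)" form.
def extract_link(text)
  m = /link\((?<url>\S+?) (?<name>\S+?)\)/.match(text)
  m && { url: m[:url], name: m[:name] }
end

found   = extract_link("stuff link(url link_name) more stuff")
missing = extract_link("link stuff (url link_name)")

puts found[:url]      # url
puts found[:name]     # link_name
puts missing.inspect  # nil
```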
Checking If a Word Is Contained
If you want to find matches that contain a specific word somewhere in the string, you can accomplish this through a lookahead:
# This will match any string that contains your string "{your-string-here}"
(?=.*({your-string-here}).*).*
You could consider building a string version of your expression and passing the word you are looking for using a variable:
wordToFind = "link"
if stringToTest =~ /(?=.*(#{wordToFind}).*).*/
# stringToTest contains "link"
else
# stringToTest does not contain "link"
end
Checking for a Word AND Parentheses
If you also wanted to ensure that somewhere in your string you had a set of parentheses with some content in them and your previous lookahead for a word, you could use:
# This will match any strings that contain your word and contain a set of parentheses
(?=.*({your-string-here}).*).*\([^\)]+\).*
which might be used as:
wordToFind = "link"
if stringToTest =~ /(?=.*(#{wordToFind}).*).*\([^\)]+\).*/
# stringToTest contains "link" and some non-empty parentheses
else
# stringToTest does not contain "link" or non-empty parentheses
end
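One caveat with interpolating a variable into a pattern is that regex metacharacters in the word are interpreted as such. A minimal sketch, assuming the word should be treated literally, passes it through Regexp.escape first. The method name contains_word_and_parens? is hypothetical.

```ruby
# True when text contains the literal word anywhere and a non-empty
# pair of parentheses somewhere.
def contains_word_and_parens?(text, word)
  pattern = /(?=.*#{Regexp.escape(word)}).*\([^\)]+\)/
  pattern.match?(text)
end

puts contains_word_and_parens?("Please go to link(url link_name)", "link")  # true
puts contains_word_and_parens?("Please go to (url name)", "link")           # false
puts contains_word_and_parens?("axb (x)", "a.b")                            # false, the dot is literal
```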
def has_both?(str, word)
str.scan(/\b#{word}\b|(?<=\()[^\(\)]+(?=\))/).size == 2
end
has_both?("Wait for me, Wild Bill.", "Bill")
#=> false
has_both?("Wait (for me), Wild William.", "Bill")
#=> false
has_both?("Wait (for me), Wild Billy.", "Bill")
#=> false
has_both?("Wait (for me), Wild bill.", "Bill")
#=> false
has_both?("Wait (for me, Wild Bill.", "Bill")
#=> false
has_both?("Wait (for me), Wild Bill.", "Bill")
#=> true
has_both?("Wait ((for me), Wild Bill.", "Bill")
#=> true
has_both?("Wait ((for me)), Wild Bill.", "Bill")
#=> true
These are the calculations for
word = "Bill"
str = "Wait (for me), Wild Bill."
r = /
\b#{word}\b # match the value of the variable 'word' with word breaks fore and aft
| # or
(?<=\() # match a left paren in a positive lookbehind
[^\(\)]+ # match one or more characters other than parens
(?=\)) # match a right paren in a positive lookahead
/x # free-spacing regex definition mode
#=> /
\bBill\b # match the value of the variable 'word' with word breaks fore and aft
| # or
(?<=\() # match a left paren in a positive lookbehind
[^\(\)]+ # match one or more characters other than parens
(?=\)) # match a right paren in a positive lookahead
/x
arr = str.scan(r)
#=> ["for me", "Bill"]
arr.size == 2
#=> true
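Because has_both? only counts matches, a string with two occurrences of the word and no parentheses also yields a size of 2. A hedged alternative sketch checks the two conditions separately. The method name has_word_and_parens? is an assumption, and unlike has_both? it treats a word that sits inside the parentheses as satisfying both conditions.

```ruby
# Checks the word and the parenthesized group independently
# instead of counting scan results.
def has_word_and_parens?(str, word)
  word_re  = /\b#{Regexp.escape(word)}\b/
  paren_re = /\([^\(\)]+\)/
  str.match?(word_re) && str.match?(paren_re)
end

puts has_word_and_parens?("Wait (for me), Wild Bill.", "Bill")   # true
puts has_word_and_parens?("Wait (for me), Wild Billy.", "Bill")  # false
puts has_word_and_parens?("Bill met Bill.", "Bill")              # false
```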
I would go with something like this regex:
/link\s*\(([^\)\s]+)\s*([^\)]+)?\)/i
This will find any match starting with the word link, followed by any number of spaces, then a url followed by a link name, both in parentheses. In this regex, the link name is optional, but the url is not. The matching is case-insensitive, so it will match link and LINK exactly the same.
You can use the Regexp#match method to compare the regex to a string, and check the result for matches and captures, like so:
m = /link\s*\(([^\)\s]+)\s*([^\)]+)?\)/i.match("link (stackoverflow.com StackOverflow)")
if m # the match array is not nil
puts "Matched: #{m[0]}"
puts " -- url: #{m[1]}"
puts " -- link-name: #{m[2] || 'none'}"
else # the match array is nil, so no match was found
puts "No match found"
end
If you'd like to use different strings to identify the match, you can use a non-capturing group, where you change link to something like:
(?:link|site|website|url)
In this case, the (?: syntax says not to capture this part of the match. If you want to capture which term matched, simply change that from (?: to (, and adjust the capture indexes by 1 to account for the new capture value.
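To illustrate the index shift, here is a small sketch with the keyword group made capturing. The variable names and the sample string are assumptions for illustration.

```ruby
# With a capturing keyword group, the keyword becomes capture 1,
# the url capture 2, and the link name capture 3.
pattern = /(link|site|website|url)\s*\(([^\)\s]+)\s*([^\)]+)?\)/i

m = pattern.match("Website (stackoverflow.com StackOverflow)")
keyword, url, name = m.captures

puts "keyword: #{keyword}"  # keyword: Website
puts "url: #{url}"          # url: stackoverflow.com
puts "name: #{name}"        # name: StackOverflow
```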
Here's a short Ruby test program:
data = [
[ true, "link (http://google.com Google)", "http://google.com", "Google" ],
[ true, "LiNk(ftp://website.org)", "ftp://website.org", nil ],
[ true, "link (https://facebook.com/realstanlee/ Stan Lee) linkety link", "https://facebook.com/realstanlee/", "Stan Lee" ],
[ true, "x link (https://mail.yahoo.com Yahoo! Mail)", "https://mail.yahoo.com", "Yahoo! Mail" ],
[ false, "link lunk (http://www.com)", nil, nil ]
]
data.each do |test_case|
link = /link\s*\(([^\)\s]+)\s*([^\)]+)?\)/i.match(test_case[1])
url = link ? link[1] : nil
link_name = link ? link[2] : nil
success = test_case[0] == !link.nil? && test_case[2] == url && test_case[3] == link_name
puts "#{success ? 'Pass' : 'Fail'}: '#{test_case[1]}' #{link ? 'found' : 'not found'}"
if success && link
puts " -- url: '#{url}' link_name: '#{link_name || '(no link name)'}'"
end
end
This produces the following output:
Pass: 'link (http://google.com Google)' found
-- url: 'http://google.com' link_name: 'Google'
Pass: 'LiNk(ftp://website.org)' found
-- url: 'ftp://website.org' link_name: '(no link name)'
Pass: 'link (https://facebook.com/realstanlee/ Stan Lee) linkety link' found
-- url: 'https://facebook.com/realstanlee/' link_name: 'Stan Lee'
Pass: 'x link (https://mail.yahoo.com Yahoo! Mail)' found
-- url: 'https://mail.yahoo.com' link_name: 'Yahoo! Mail'
Pass: 'link lunk (http://www.com)' not found
If you want to allow any characters, not just spaces, between the word 'link' and the first paren, simply change the \s* to [^\(]* and you should be good to go.
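As a quick sketch of that change, the failing test case from the data set above is found once \s* becomes [^\(]*. The variable names strict and loose are assumptions for illustration.

```ruby
strict = /link\s*\(([^\)\s]+)\s*([^\)]+)?\)/i
loose  = /link[^\(]*\(([^\)\s]+)\s*([^\)]+)?\)/i

text = "link lunk (http://www.com)"

strict_match = strict.match(text)  # nil: "lunk" sits between "link" and "("
loose_match  = loose.match(text)   # matches: [^\(]* skips over " lunk "

puts strict_match.inspect  # nil
puts loose_match[1]        # http://www.com
```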
I need to verify that a string has at least one comma but not more than 4 commas.
This is what I've tried:
/,{1,4}/
/,\s{1,4}/
Neither of those work.
Note: I've been testing my RegEx's on Rubular
Any help is greatly appreciated.
Note: I'm using this in the context of an Active Record Validation:
validates :my_string, format: { with: /,\s{1,4}/}
How can I do this as an Active Record validation?
Does it have to be a regex? If not, use Ruby's count method:
> "a,a,a,a,a".count(',')
=> 4
str ="a,b,a,,"
p str.count(",").between?(1, 4) # => true
I too would suggest using count, but to address your specific question, you could do it thusly:
r = /^(?:[^,]*,){1,4}[^,]*$/
!!"eenee"[r]
#=> false
!!"eenee, meenee"[r]
#=> true
!!"eenee, meenee, minee, mo"[r]
#=> true
!!"eenee, meenee, minee, mo, oh, no!"[r]
#=> false
(?:[^,]*,) is a non-capture group that matches any string of characters other than a comma, followed by a comma;
{1,4} ensures that the non-capture group is matched between 1 and 4 times;
the anchor ^ ensures there is no comma before the first non-capture group; and
[^,]*$ ensures there is no comma after the last non-capture group.
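For the Active Record case in the question, one hedged sketch is the same pattern anchored with \A and \z. As far as I know, Rails' format validator raises an error for ^ and $ unless multiline: true is given. The sample strings are assumptions, and the validates line is shown only as a comment since it needs a Rails model to run.

```ruby
# \A and \z anchor the whole string; ^ and $ would anchor each line instead.
comma_count = /\A(?:[^,]*,){1,4}[^,]*\z/

samples = ["eenee", "eenee, meenee", "a,b,a,,", "a,,,,,"]

results = samples.map do |s|
  by_regex = comma_count.match?(s)
  by_count = s.count(',').between?(1, 4)
  puts "#{s.inspect}: regex=#{by_regex} count=#{by_count}"
  [by_regex, by_count]
end

# In a model (assumption: the attribute name from the question), this might read:
#   validates :my_string, format: { with: /\A(?:[^,]*,){1,4}[^,]*\z/ }
```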
I have a string and I need to check whether the last character of that string is *, and if it is, I need to remove it.
if stringvariable.include? "*"
newstring = stringvariable.gsub(/[*]/, '')
end
The above does not search if the '*' symbol is the LAST character of the string.
How do I check if the last character is '*'?
Thanks for any suggestion
Use the $ anchor to only match the end of line:
"sample*".gsub(/\*$/, '')
If there's the possibility of there being more than one * on the end of the string (and you want to replace them all) use:
"sample**".gsub(/\*+$/, '')
You can also use chomp (see it on API Dock), which removes the trailing record separator character(s) by default, but can also take an argument, and then it will remove the end of the string only if it matches the specified character(s).
"hello".chomp #=> "hello"
"hello\n".chomp #=> "hello"
"hello\r\n".chomp #=> "hello"
"hello\n\r".chomp #=> "hello\n"
"hello\r".chomp #=> "hello"
"hello \n there".chomp #=> "hello \n there"
"hello".chomp("llo") #=> "he"
"hello*".chomp("*") #=> "hello"
String has an end_with? method
stringvariable.chop! if stringvariable.end_with? '*'
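The chomp approach can be wrapped in small helpers that return a new string and leave the original untouched. This is a minimal sketch; the method names are assumptions for illustration.

```ruby
# Removes a single trailing "*" if present.
def strip_one_star(text)
  text.chomp('*')
end

# Removes every "*" at the very end of the string.
def strip_all_trailing_stars(text)
  text.sub(/\*+\z/, '')
end

puts strip_one_star("sample*")             # sample
puts strip_one_star("sample**")            # sample*
puts strip_one_star("sam*ple")             # sam*ple
puts strip_all_trailing_stars("sample**")  # sample
```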
You can do the following which will remove the offending character, if present. Otherwise it will do nothing:
your_string.sub(/\*$/, '')
If you want to remove more than one occurrence of the character, you can do:
your_string.sub(/\*+$/, '')
Of course, if you want to modify the string in-place, use sub! instead of sub
Cheers,
Aaron
You can either use a regex or just index into the string:
if string_variable[-1] == '*'
  new_string = string_variable.sub(/\*\z/, '') # note the escaped *; only the trailing one is removed
end
That only works in Ruby 1.9 and later; in 1.8, string_variable[-1] returns an integer character code.
Otherwise you'll need to use a regex:
if string_variable =~ /\*$/
  new_string = string_variable.sub(/\*$/, '') # note the escaped *; only the trailing one is removed
end
But you don't even need the if:
new_string = string_variable.gsub(/\*$/, '')
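One detail worth keeping in mind with these patterns: in Ruby, $ matches at the end of every line, while \z matches only at the very end of the string. The following sketch, with a made-up two-line string, shows the difference.

```ruby
text = "first*\nsecond"

# $ matches before the newline, so the star ending the first line is removed.
with_dollar = text.sub(/\*$/, '')

# \z matches only at the end of the whole string, so nothing is removed here.
with_z = text.sub(/\*\z/, '')

puts with_dollar.inspect  # "first\nsecond"
puts with_z.inspect       # "first*\nsecond"
```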
How do you say "do not match" in a Ruby regex?
For example, you do not want to return true if it sees 'error':
/\b(error)\b/i
I know this returns true when it sees error, how do you say 'not' in this case? thanks!
Use the proper Ruby operator:
/\b(error)\b/i !~ someText
I would do something like the following. Excuse the /error/ pattern; I'm not sure exactly what you want to match here.
return true unless b =~ /error/
or
return true if b !~ /error/
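Put together as a predicate, a minimal sketch might look like this. The method name error_free? is an assumption, and the word boundaries are taken from the pattern in the question.

```ruby
# True when the text does not contain the standalone word "error"
# (case-insensitive).
def error_free?(text)
  text !~ /\berror\b/i
end

puts error_free?("all good")           # true
puts error_free?("An Error occurred")  # false
puts error_free?("terror")             # true, no word boundary before "error"
```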
unless (place =~ /^\./) == 0
I know that unless is like if not, but what about the conditional?
=~ means matches regex
/^\./ is a regular expression:
/.../ are the delimiters for the regex
^ matches the start of the string or of a line (\A matches the start of the string only)
\. matches a literal .
It checks whether the string place starts with a period (.).
Consider this:
p ('.foo' =~ /^\./) == 0 # => true
p ('foo' =~ /^\./) == 0 # => false
In this case, it wouldn't be necessary to use == 0. place =~ /^\./ would suffice as a condition:
p '.foo' =~ /^\./ # => 0 # 0 evaluates to true in Ruby conditions
p 'foo' =~ /^\./ # => nil
EDIT: /^\./ is a regular expression. The start and end slashes denote that it is a regular expression, leaving ^\. as the important bit. The first character, ^, marks "start of string/line", and \. is the literal character ., as the dot character is normally considered a special character in regular expressions.
To read more about regular expressions, see Wikipedia or the excellent regular-expressions.info website.
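The start-of-line versus start-of-string distinction matters for multi-line input, where ^ can match at an index other than 0. A short sketch, using a made-up two-line string, compares ^, \A, and String#start_with?.

```ruby
multi = "foo\n.bar"

line_start   = (multi =~ /^\./)    # 4: the period begins the second line
string_start = (multi =~ /\A\./)   # nil: the string itself starts with "f"

puts line_start.inspect             # 4
puts string_start.inspect           # nil
puts multi.start_with?('.')         # false
puts '.foo'.start_with?('.')        # true
```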