Ruby regular expression for version numbers - ruby-on-rails

I want to write a program which takes build number in the format of 23.0.23.345 (first two-digits then dot, then zero, then dot, then two-digits, dot, three-digits):
number=23.0.23.345
pattern = /(^[0-9]+\.{0}\.[0-9]+\.[0-9]$)/
numbers.each do |number|
if number.match pattern
puts "#{number} matches"
else
puts "#{number} does not match"
end
end
Output:
I am getting this error:
no .<digit> floating literal anymore; put 0 before dot

I'd use something like this to find patterns that match:
number = 'foo 1.2.3.4 23.0.23.345 bar'
build_number = number[/
\d{2} # two digits
\.
0
\.
\d{2} # two more digits
\.
\d{3}
/x]
build_number # => "23.0.23.345"
This example is using String's [/regex/] method, which is a nice shorthand way to apply and return the result of a regex. It returns the first match only in the form I'm using. Read the documentation for more information and examples.
Your pattern won't work because it doesn't do what you think it does. Here's how I'd read it:
/( # group
^ # start of line
[0-9]+ # one or more digits
\.{0} # *NO* dots
\. # one dot
[0-9]+ # one or more digits
\. # one dot
[0-9] # one digit
$ # end of line
)/x
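The problem is \.{0}, which matches exactly zero dots. As a result the pattern only requires three groups of digits separated by two dots, and the last group is a single digit.
The x flag tells Ruby to use extended (free-spacing) mode, which ignores whitespace and comments inside the pattern, making it easy to build a pattern that is documented. Multiline mode is a separate flag, m.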
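Putting the two points above together, here is a minimal sketch of an anchored, documented pattern for the format described in the question. The constant name BUILD_NUMBER and the method build_number? are hypothetical names chosen for illustration, and \A and \z are used instead of ^ and $ on the assumption that the whole string should be the build number.

```ruby
# Hypothetical names; the format comes from the question:
# two digits, ".0.", two digits, ".", three digits.
BUILD_NUMBER = /
  \A       # start of string
  \d{2}    # two digits
  \.0\.    # a dot, a literal zero, a dot
  \d{2}    # two digits
  \.       # a dot
  \d{3}    # three digits
  \z       # end of string
/x

def build_number?(text)
  BUILD_NUMBER.match?(text)
end

numbers = ["23.0.23.345", "23.0.33.173", "0.0.0.0", "123.0.23.345"]
numbers.each do |number|
  puts "#{number} #{build_number?(number) ? 'matches' : 'does not match'}"
end
```

With the anchors in place, "123.0.23.345" is rejected, whereas an unanchored pattern would find "23.0.23.345" inside it.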

Why reinvent the wheel? Use a gem like versionomy. You can parse the versions, compare them, check for equality, increment a particular part, etc. It even handles alpha, beta, patchlevels, etc.
require 'versionomy'
number='23.0.23.345'
v = Versionomy.parse number
v.major #=> 23
v.minor #=> 0
v.tiny #=> 23
v.tiny2 #=> 345

numbers = "23.0.23.345", "23.0.33.173", "0.0.0.0"
pattern = /\d{2}\.0\.\d{2}\.\d{3}/x
numbers.each do |number|
if number.match pattern
puts "#{number} matches"
else
puts "#{number} does not match"
end
end
The value in the first line needs to be a string, not a bare numeric literal: 23.0.23.345 is not valid Ruby, which is what causes the error. I also renamed "number" to "numbers" and made it an array of strings, because the loop calls .each on it.

There seems to be agreement on what regular expression you should use. If your ultimate goal is to extract the elements of the strings as integers, you could do this:
str = "I'm looking for 23.0.345.26, or was that 23.0.26.345?"
str.scan(/(\d{2})\.(0)\.(\d{2})\.(\d{3})/).flatten.map(&:to_i)
#=> [23, 0, 26, 345]
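If labelled parts are more convenient than a flat array, named captures are one option. This is only a sketch: the group names major, minor, build and revision are hypothetical labels, not something the question defines.

```ruby
str = "I'm looking for 23.0.345.26, or was that 23.0.26.345?"

# Group names are hypothetical labels for the four parts.
pattern = /(?<major>\d{2})\.(?<minor>0)\.(?<build>\d{2})\.(?<revision>\d{3})/

match = str.match(pattern)
# named_captures returns a Hash with String keys; convert each value to an Integer.
parts = match ? match.named_captures.transform_values(&:to_i) : {}
p parts
#=> {"major"=>23, "minor"=>0, "build"=>26, "revision"=>345}
```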

Related

Rails string split every other "."

I have a bunch of sentences that I want to break into an array. Right now, I'm splitting every time \n appears in the string.
@chapters = @script.split('\n')
What I'd like to do is .split on every OTHER "." in the string. Is that possible in Ruby?
You could do it with a regex, but I'd start with a simple approach: just split on periods, then join pairs of substrings:
s = "foo. bar foo. foo bar. boo far baz. bizzle"
s.split(".").each_slice(2).map {|p| p.join "." }
# => ["foo. bar foo", " foo bar. boo far baz", " bizzle"]
This is a case where it's easier to use String#scan than String#split.
We can use the following regular expression:
r = /(?<=\.|\A)[^.]*\.[^.]*(?=\.|\z)/
str=<<~_
Now is the time. This is it. It is now. The time to have fun.
The time to make new friends. The time to party.
_
str.scan(r)
#=> [
# "Now is the time. This is it",
# " It is now. The time to have fun",
# "\nThe time to make new friends. The time to party"
# ]
We can write the regular expression in free-spacing mode to make it self-documenting.
r = /
(?<= # begin a positive lookbehind
\A # match the beginning of the string
| # or
\. # match a period
) # end positive lookbehind
[^.]* # match zero or more characters other than periods
\. # match a period
[^.]* # match zero or more characters other than periods
(?= # begin a positive lookahead
\. # match a period
| # or
\z # match the end of the string
) # end positive lookahead
/x # invoke free-spacing regex definition mode
Note that (?<=\.|\A) can be replaced with (?<![^\.]). (?<![^\.]) is a negative lookbehind that asserts the match is not preceded by a character other than a period.
Similarly, (?=\.|\z) can be replaced with (?![^.]). (?![^.]) is a negative lookahead that asserts the match is not followed by a character other than a period.
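A quick way to convince yourself that the two forms are interchangeable is to run both against the same input. This is only a sketch; the variable names are made up for the example.

```ruby
str = "Now is the time. This is it. It is now. The time to have fun."

# Original form: alternation inside the lookarounds.
with_alternation = /(?<=\.|\A)[^.]*\.[^.]*(?=\.|\z)/
# Alternative form: negative lookarounds on "a character other than a period".
with_negation    = /(?<![^.])[^.]*\.[^.]*(?![^.])/

pairs = str.scan(with_alternation)
p pairs
#=> ["Now is the time. This is it", " It is now. The time to have fun"]
p pairs == str.scan(with_negation)
#=> true
```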

Ruby - slice all characters till underscore in a string

I have a string like solution_10 and I would like to remove the solution_ part from it. The number after the underscore will increase; it can be 100, 1000, or even larger. I can't seem to wrap my head around how to do this.
I have tried slice!(0, 9), but that gives me solution_. I then tried slice!(0, -2), but that gives me nil.
I then tried solution_10[1..9], which gives me olution_1.
So my question is how to get rid of all characters till underscore, all I want is the number after the underscore.
Use String#split method
'solution_10'.split('_').last #will return original string if no underscore present
#=> "10"
'solution_10'.split('_')[1] #will return nil if no underscore present
#=> "10"
"solution_10"[/(?<=_).*/]
#⇒ "10"
or simply just get digits until the end of the line:
"solution_10"[/\d+\z/]
#⇒ "10"
I can't seem to wrap my head around how to do this.
First of all, slice and its shortcut [] can be used in many ways. One way is by providing a start index and a length:
'hello'[2, 3] #=> "llo" # 3 characters, starting at index 2
# ^^^
You can use that variant if you know the length in advance. But since the number part in your string could be 10 or 100 or 1000, you don't.
Another way is to provide a range, denoting the start and end index:
'hello'[2..3] #=> "ll" # substring from index 2 to index 3
# ^^
In this variant, Ruby will determine the length for you. You can also provide negative indices to count from the end. -1 is the last character, -2 the second to last and so on.
So my question is how to get rid of all characters till underscore, all I want is the number after the underscore.
We have to get the index of the underscore:
s = "solution_10"
i = s.index('_') #=> 8
Now we can get the substring from that index to the last character via:
s[i..-1] #=> "_10"
Apparently, we're off by one, so let's add 1:
s[i+1..-1] #=> "10"
There you go.
Note that this approach will not necessarily return a number (or numeric string), it will simply return everything after the first underscore:
s = 'foo_bar'
i = s.index('_') #=> 3
s[i+1..-1] #=> "bar"
It will also fail miserably if the string does not contain an underscore, because i would be nil:
s = 'foo'
i = s.index('_') #=> nil
s[i+1..-1] #=> NoMethodError: undefined method `+' for nil:NilClass
For a more robust solution, you can pass a regular expression to slice / [] as already shown in the other answers. Here's a version that matches an underscore followed by a number at the end of the string. The number part is captured and returned:
"solution_10"[/_(\d+)\z/, 1] #=> "10"
# _ literal underscore
# ( ) capture group (the `1` argument refers to this)
# \d+ one or more digits
# \z end of string
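If the result is needed as an Integer rather than a String, the same expression can be wrapped in a small helper. This is a sketch; trailing_number is a hypothetical name, and returning nil for non-matching input is an assumption about what the caller wants.

```ruby
# Hypothetical helper: returns the number after the final underscore as an
# Integer, or nil when the string does not end in "_<digits>".
def trailing_number(text)
  digits = text[/_(\d+)\z/, 1]
  digits && digits.to_i
end

p trailing_number("solution_10")   #=> 10
p trailing_number("solution_1000") #=> 1000
p trailing_number("foo_bar")       #=> nil
p trailing_number("foo")           #=> nil
```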
Another way:
'solution_10'[/\d+/]
#=> "10"
Why not just use a regex?
"solution_10".scan(/\d+/).last
#=> "10"

string format check

Suppose I have string variables like following:
s1="10$"
s2="10$ I am a student"
s3="10$Good"
s4="10$ Nice weekend!"
As you can see above, s2 and s4 have white space after 10$.
Generally, I would like a way to check whether a string starts with 10$ and has one or more white-space characters after the 10$. For example, the rule should find s2 and s4 in the case above. How do I define such a rule?
What I mean is that something like s2.RULE? should return true or false to tell whether the string matches.
---------- update -------------------
Please also give the solution for the case where 10# is used instead of 10$.
You can do this using Regular Expressions (Ruby has Perl-style regular expressions, to be exact).
# For ease of demonstration, I've moved your strings into an array
strings = [
"10$",
"10$ I am a student",
"10$Good",
"10$ Nice weekend!"
]
p strings.find_all { |s| s =~ /\A10\$[ \t]+/ }
The regular expression breaks down like this:
The / at the beginning and the end tell Ruby that everything in between is part of the regular expression
\A matches the beginning of a string
The 10 is matched verbatim
\$ means to match a $ verbatim. We need to escape it since $ has a special meaning in regular expressions.
[ \t]+ means "match at least one blank and/or tab"
So this regular expression says "Match every string that starts with 10$ followed by at least one blank or tab character". Using =~ you can test strings in Ruby against this expression. =~ returns the index of the match, which is non-nil and therefore truthy in a conditional like if (even when it is 0), or nil when there is no match.
Edit: Updated white space matching as per Asmageddon's suggestion.
This works:
"10$ " =~ /^10\$ +/
It returns nil when there is no match and 0 when there is one. Because 0 is truthy in Ruby, you can use the result directly in a conditional.
Use a regular expression like this one (anchored with \A so that it only matches at the start of the string):
/\A10\$\s+/
EDIT
If you use =~ for matching, note that "the =~ operator returns the character position in the string of the start of the match".
So it might return 0 to denote a match. Only a return of nil means no match.
See for example http://www.regular-expressions.info/ruby.html for a regular expression tutorial for Ruby.
If you want to proceed to cases with $ and # then try this regular expression:
/^10[\$#] +/
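To get something close to the s2.RULE? call asked for in the question, the check can be wrapped in a predicate method. This is a minimal sketch: prefix_then_space? is a hypothetical name, and passing the prefix as an argument (escaped with Regexp.escape) is one possible way to cover both 10$ and 10#.

```ruby
# Hypothetical helper; `prefix` is matched literally thanks to Regexp.escape,
# so characters such as $ and # need no manual escaping.
def prefix_then_space?(text, prefix)
  text.match?(/\A#{Regexp.escape(prefix)}[ \t]+/)
end

strings = ["10$", "10$ I am a student", "10$Good", "10$ Nice weekend!"]
p strings.select { |s| prefix_then_space?(s, "10$") }
#=> ["10$ I am a student", "10$ Nice weekend!"]
p prefix_then_space?("10# hello", "10#")
#=> true
```

match? returns a plain true or false, which avoids the 0-versus-nil subtlety of =~.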

How to find all instances of #[XX:XXXX] in a string and then find the surrounding text?

Given a string like:
"#[19:Sara Mas] what's the latest with the TPS report? #[30:Larry Peters] can you help out here?"
I want to find a way to dynamically return, the user tagged and the content surrounding. Results should be:
user_id: 19
copy: what's the latest with the TPS report?
user_id: 30
copy: can you help out here?
Any ideas on how this can be done with ruby/rails? Thanks
How is this regex for finding matches?
#\[\d+:\w+\s\w+\]
Split the string on #, drop the empty piece before the first tag, then handle each piece in turn. I don't think it'd take more than:
tmp = string.split('#').reject(&:empty?).map { |str| [str[/\[(\d+)/, 1], str[/\]\s*(.*)/, 1].strip] }
tmp.first #=> ["19", "what's the latest with the TPS report?"]
Does that help?
result = subject.scan(/\[(\d+).*?\](.*?)(?=#|\Z)/m)
This grabs the id and the content in backreferences 1 and 2 respectively. To stop the capture, either # or the end of the string must be met.
\[ # Match the character “[” literally
( # Match the regular expression below and capture its match into backreference number 1
\d # Match a single digit 0..9
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
. # Match any single character (including line breaks, because of the m flag)
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\] # Match the character “]” literally
( # Match the regular expression below and capture its match into backreference number 2
. # Match any single character (including line breaks, because of the m flag)
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
# Match either the regular expression below (attempting the next alternative only if this one fails)
# # Match the character “#” literally
| # Or match regular expression number 2 below (the entire group fails if this one fails to match)
\Z # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
This will match something starting at # and ending at a punctuation mark. Sorry if I misunderstood the question.
result = subject.scan(/#.*?[.?!]/)
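To get from the scan-based approach to the user_id / copy pairs the question asks for, the captures can be mapped into hashes. This is a sketch under the assumption that every tag has the form #[digits:name]; the variable names text and mentions are made up for the example.

```ruby
text = "#[19:Sara Mas] what's the latest with the TPS report? " \
       "#[30:Larry Peters] can you help out here?"

# Capture 1: the digits after "#[".
# Capture 2: the text after "]", up to the next "#[" or the end of the string,
# with surrounding whitespace left outside the capture.
mentions = text.scan(/#\[(\d+):[^\]]*\]\s*(.*?)\s*(?=#\[|\z)/m).map do |id, copy|
  { user_id: id.to_i, copy: copy }
end

mentions.each do |mention|
  puts "user_id: #{mention[:user_id]}"
  puts "copy: #{mention[:copy]}"
end
```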

Best way to count words in a string in Ruby?

Is there anything better than string.scan(/(\w|-)+/).size (the - is so, e.g., "one-way street" counts as 2 words instead of 3)?
string.split.size
Edited to explain multiple spaces
From the Ruby String Documentation page
split(pattern=$;, [limit]) → anArray
Divides str into substrings based on a delimiter, returning an array
of these substrings.
If pattern is a String, then its contents are used as the delimiter
when splitting str. If pattern is a single space, str is split on
whitespace, with leading whitespace and runs of contiguous whitespace
characters ignored.
If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters. If pattern contains groups, the respective
matches will be returned in the array as well.
If pattern is omitted, the value of $; is used. If $; is nil (which is
the default), str is split on whitespace as if ' ' were specified.
If the limit parameter is omitted, trailing null fields are
suppressed. If limit is a positive number, at most that number of
fields will be returned (if limit is 1, the entire string is returned
as the only entry in an array). If negative, there is no limit to the
number of fields returned, and trailing null fields are not
suppressed.
" now's the time".split #=> ["now's", "the", "time"]
While that is the current version of ruby as of this edit, I learned on 1.7 (IIRC), where that also worked. I just tested it on 1.8.3.
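A few short calls illustrate the behaviours described in the documentation excerpt above (whitespace splitting, the limit parameter, and trailing empty fields). The sample strings are made up for the example.

```ruby
# No pattern: split on whitespace, ignoring leading and repeated whitespace.
p " now's  the time".split      #=> ["now's", "the", "time"]

# Positive limit: at most that many fields are returned.
p "a b c d".split(" ", 2)       #=> ["a", "b c d"]

# No limit: trailing empty fields are suppressed.
p "a,b,,".split(",")            #=> ["a", "b"]

# Negative limit: trailing empty fields are kept.
p "a,b,,".split(",", -1)        #=> ["a", "b", "", ""]
```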
I know this is an old question, but this might be useful to someone else looking for something more sophisticated than string.split. I wrote the words_counted gem to solve this particular problem, since defining words is pretty tricky.
The gem lets you define your own custom criteria, or use the out of the box regexp, which is pretty handy for most use cases. You can pre-filter words with a variety of options, including a string, lambda, array, or another regexp.
counter = WordsCounted::Counter.new("Hello, Renée! 123")
counter.word_count #=> 2
counter.words #=> ["Hello", "Renée"]
# filter the word "hello"
counter = WordsCounted::Counter.new("Hello, Renée!", reject: "Hello")
counter.word_count #=> 1
counter.words #=> ["Renée"]
# Count numbers only
counter = WordsCounted::Counter.new("Hello, Renée! 123", regexp: /[0-9]/)
counter.word_count #=> 1
counter.words #=> ["123"]
The gem provides a bunch more useful methods.
If the 'word' in this case can be described as an alphanumeric sequence which can include '-' then the following solution may be appropriate (assuming that everything that doesn't match the 'word' pattern is a separator):
>> 'one-way street'.split(/[^-a-zA-Z]/).size
=> 2
>> 'one-way street'.split(/[^-a-zA-Z]/).each { |m| puts m }
one-way
street
=> ["one-way", "street"]
However, there are some other symbols that can be included in the regex - for example, ' to support the words like "it's".
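One way to allow both hyphens and apostrophes is to describe what a word looks like and count the matches with scan, rather than describing the separators. This is only a sketch: WORD and word_count are hypothetical names, and restricting words to ASCII letters is an assumption that will not suit every text.

```ruby
# Hypothetical definition: a "word" is a run of ASCII letters, optionally
# joined by single hyphens or apostrophes (as in "one-way" or "it's").
WORD = /[a-zA-Z]+(?:['-][a-zA-Z]+)*/

def word_count(text)
  text.scan(WORD).size
end

p word_count("one-way street")                     #=> 2
p word_count("it's a one-way  street, isn't it?")  #=> 6
```

Because scan only returns matches, repeated separators never produce empty entries.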
This is pretty simplistic but does the job if you are typing words with spaces in between. It ends up counting numbers as well but I'm sure you could edit the code to not count numbers.
puts "enter a sentence to find its word length: "
word = gets
word = word.chomp
splits = word.split(" ")
target = splits.length.to_s
puts "your sentence is " + target + " words long"
The best way is to use the split method.
split divides a string into substrings based on a delimiter, returning an array of the substrings.
split takes two parameters: pattern and limit.
pattern is the delimiter on which the string is split into an array.
limit specifies the maximum number of elements in the resulting array.
For more details, refer to the Ruby String documentation.
str = "This is a string"
str.split(' ').size
#output: 4
The above code splits the string wherever it finds a space, so the size of the resulting array is the number of words in the string.
The split(/[^-a-zA-Z]/) solution above is wrong when separators repeat. Consider the following string, which has two spaces between the words:
"one-way  street"
You will get
["one-way", "", "street"]
Use
'one-way  street'.gsub(/[^-a-zA-Z]/, ' ').split.size
This splits words only on ASCII whitespace chars:
p " some word\nother\tword|word".strip.split(/\s+/).size #=> 4
