Ruby - slice all characters till underscore in a string - ruby-on-rails

I have a string like solution_10 and I would like to remove the solution_ part from it. The number after the underscore will grow; it can be 100, 1000 or even larger. I can't seem to wrap my head around how to do this.
I have tried slice!(0, 9), but that gives me solution_. I then tried slice!(0, -2), but that gives me nil.
I then tried solution_10[1..9], which gives me olution_1.
So my question is how to get rid of all characters till underscore, all I want is the number after the underscore.

Use String#split method
'solution_10'.split('_').last #will return original string if no underscore present
#=> "10"
'solution_10'.split('_')[1] #will return nil if no underscore present
#=> "10"

"solution_10"[/(?<=_).*/]
#⇒ "10"
or simply just get digits until the end of the line:
"solution_10"[/\d+\z/]
#⇒ "10"
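The idioms shown so far differ mainly on edge cases (several underscores, or none at all). The following is a small sketch that puts them side by side; the method name extract_variants and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: runs the three idioms on one string
# and returns the results keyed by technique.
def extract_variants(s)
  {
    split_last: s.split('_').last,   # last piece after splitting on "_"
    lookbehind: s[/(?<=_).*/],       # everything after the FIRST underscore
    trailing_digits: s[/\d+\z/]      # digits at the very end of the string
  }
end

["solution_10", "my_solution_10", "solution"].each do |s|
  puts "#{s.inspect}: #{extract_variants(s).inspect}"
end
```

With more than one underscore the lookbehind version keeps everything after the first one, and with no underscore split returns the whole string while both regex versions return nil.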

I can't seem to wrap my head around how to do this.
First of all, slice and its shortcut [] can be used in many ways. One way is by providing a start index and a length:
'hello'[2, 3] #=> "llo" # 3 characters, starting at index 2
# ^^^
You can use that variant if you know the length in advance. But since the number part in your string could be 10 or 100 or 1000, you don't.
Another way is to provide a range, denoting the start and end index:
'hello'[2..3] #=> "ll" # substring from index 2 to index 3
# ^^
In this variant, Ruby will determine the length for you. You can also provide negative indices to count from the end. -1 is the last character, -2 the second to last and so on.
So my question is how to get rid of all characters till underscore, all I want is the number after the underscore.
We have to get the index of the underscore:
s = "solution_10"
i = s.index('_') #=> 8
Now we can get the substring from that index to the last character via:
s[i..-1] #=> "_10"
Apparently, we're off by one, so let's add 1:
s[i+1..-1] #=> "10"
There you go.
Note that this approach will not necessarily return a number (or numeric string), it will simply return everything after the first underscore:
s = 'foo_bar'
i = s.index('_') #=> 3
s[i+1..-1] #=> "bar"
It will also fail miserably if the string does not contain an underscore, because i would be nil:
s = 'foo'
i = s.index('_') #=> nil
s[i+1..-1] #=> NoMethodError: undefined method `+' for nil:NilClass
For a more robust solution, you can pass a regular expression to slice / [] as already shown in the other answers. Here's a version that matches an underscore followed by a number at the end of the string. The number part is captured and returned:
"solution_10"[/_(\d+)\z/, 1] #=> "10"
# _ literal underscore
# ( ) capture group (the `1` argument refers to this)
# \d+ one or more digits
# \z end of string
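If you need this in more than one place, both approaches can be wrapped in small methods that return nil instead of raising when there is no underscore. This is only a sketch; the method names are invented here, and converting to Integer is an assumption about what you want to do with the result.

```ruby
# Hypothetical helper: trailing number after an underscore, as an Integer.
# Returns nil when the string does not end in "_<digits>".
def number_after_underscore(str)
  digits = str[/_(\d+)\z/, 1]
  digits&.to_i
end

# Hypothetical helper: index-based variant with a nil guard.
# Returns everything after the first underscore, or nil if there is none.
def after_first_underscore(str)
  i = str.index('_')
  i ? str[(i + 1)..-1] : nil
end

p number_after_underscore("solution_1000")
p number_after_underscore("foo")
p after_first_underscore("foo_bar")
p after_first_underscore("foo")
```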

Another way:
'solution_10'[/\d+/]
#=> "10"

Why not just use a regex?
"solution_10".scan(/\d+/).last
#=> "10"

Related

How to find last occurrence of a substring in a given string?

I have a string, which describe some word, I must change ending of it to "sd", if ending == "jk".
For an example, I have word: "lazerjk", I need to get from it "lazersd".
I tried to use method .gsub!, but it doesn't work correctly if we have more than one occurrence of substring "jk" in a word.
String#rindex returns the index of the last occurrence of the given substring.
String#[]= can take two integer arguments: the first is the index at which to start replacing, and the second is the length of the part to replace.
You can use them this way:
replaced = "foo"
replacing = "booo"
string = "foo bar foo baz"
string[string.rindex(replaced), replaced.size] = replacing
string
# => "foo bar booo baz"
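Note that rindex returns nil when the substring is absent, in which case the []= call above would raise. A guarded, non-mutating version might look like the sketch below; the method name replace_last is made up for illustration.

```ruby
# Hypothetical helper: replaces the last occurrence of `old` with `new`.
# Returns a new string and leaves the argument untouched.
def replace_last(string, old, new)
  idx = string.rindex(old)
  return string.dup unless idx   # nothing to replace

  result = string.dup
  result[idx, old.size] = new    # start index + length form of []=
  result
end

puts replace_last("foo bar foo baz", "foo", "booo")
puts replace_last("lazerjk", "jk", "sd")
puts replace_last("abc", "x", "y")
```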
"jughjkjkjk\njk".sub(/jk$\z/, 'sd')
=> "jughjkjkjk\nsd"
The $ is redundant here; /jk\z/ alone is sufficient.
It sounds like you're looking to replace a specific suffix only. If so, I would probably suggest using sub along with an anchored regex (to check for the desired characters only at the end of the string):
string_1 = "lazerjk"
string_2 = "lazerjk\njk"
string_3 = "lazerjkr"
string_1.sub(/jk\z/, "sd")
#=> "lazersd"
string_2.sub(/jk\z/, "sd")
#=> "lazerjk\nsd"
string_3.sub(/jk\z/, "sd")
#=> "lazerjkr"
Or, you could do without a regex at all by using the reverse! method along with a simple conditional statement to sub! only when the suffix is present:
string = "lazerjk"
old_suffix = "jk"
new_suffix = "sd"
string.reverse!.sub!(old_suffix.reverse, new_suffix.reverse).reverse! if string.end_with?(old_suffix)
string
#=> "lazersd"
OR, you could even use a completely different approach. Here's an example using chomp to remove the unwanted suffix and then ljust to pad the desired suffix to the modified string.
string = "lazerjk"
string.chomp("jk").ljust(string.length, "sd")
#=> "lazersd"
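Note that the new suffix only gets added if the initial chomp shortened the string. Otherwise, the string remains unchanged. Because ljust pads back to the original length, this trick only works when the new suffix has the same length as the old one.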
If the goal is to substitute the LAST OCCURRENCE (as opposed to suffix only), then this could be accomplished by using sub along with reverse:
string = "jklazerjkm"
old_substring = "jk"
new_substring = "sd"
string.reverse.sub(old_substring.reverse, new_substring.reverse).reverse
#=> "jklazersdm"
Replacing "jk" at the end of a string with something else is straightforward and can be addressed without concern for other instances of "jk" that may be in the string, so I assume that is not what is being asked. Rather, I assume the problem is to replace the last instance of "jk" in a string with "sd".
Here are two solutions that make use of String#sub with a regular expression.
Use a negative lookahead
The idea here is to match "jk" provided it is not followed later in the string by another instance of "jk".
"lajkz\nejkrjklm".sub(/jk(?!.*jk)/m, "sd")
#=> "lajkz\nejkrsdlm"
Capture the part of the string that precedes the last "jk"
The match, if there is one, consists of the front of the string followed by the last "jk", which is replaced by the captured string followed by "sd".
"lajkz\nejkrjklm".sub(/\A(.*)jk/m) { $1 + "sd" }
#=> "lajkz\nejkrsdlm"
The two regular expressions can be written in free-spacing mode to make them self-documenting. The first is the following.
/
jk # match literal
(?! # begin a negative lookahead
.* # match zero or more characters (any character, since multiline mode is on)
jk # match literal
) # end negative lookahead
/mx # invoke multiline and free-spacing regex definition modes.
Multiline mode causes . to match any character, including a line terminator.
The second regular expression can be written as follows.
/
\A # match the beginning of the string
(.*) # match zero or more characters (any character, since multiline mode is on)
# and save the match to capture group 1
jk # match literal
/mx # invoke multiline and free-spacing regex definition modes.
Note that in both expressions .* is greedy, meaning that it will match as many characters as possible, including "jk" so long as other requirements of the expression are met, here that the last instance of "jk" in the string is matched.
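To make the difference between "suffix only" and "last occurrence" concrete, here is a sketch that generalizes both patterns to arbitrary substrings. The method names are hypothetical. Regexp.escape and the block form of sub are used so that special characters in either argument are treated literally, which goes slightly beyond the hard-coded "jk" patterns above.

```ruby
# Hypothetical helper: replaces `old` only when it is at the very end.
def replace_suffix(str, old, new)
  str.sub(/#{Regexp.escape(old)}\z/) { new }
end

# Hypothetical helper: replaces the last occurrence of `old` anywhere,
# using the negative-lookahead idea (no further `old` later in the string).
def replace_last_occurrence(str, old, new)
  escaped = Regexp.escape(old)
  str.sub(/#{escaped}(?!.*#{escaped})/m) { new }
end

puts replace_suffix("jklazerjkm", "jk", "sd")           # no trailing "jk"
puts replace_last_occurrence("jklazerjkm", "jk", "sd")  # last "jk" replaced
```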
Here is a different solution:
str = "jughjkjkjk\njk"
pattern = "jk"
replace_with = "sd"
str = str.reverse.sub(pattern.reverse, replace_with.reverse).reverse

simpler way to modify a string

I recently solved this problem, but felt there is a simpler way to do it. I'd like to use fewer lines of code than I am now. I'm new to ruby so if the answer is simple I'd love to add it to my toolbag. Thank you in advance.
Goal: accept a word as an argument and return the word with its last vowel removed; if there are no vowels, return the original word.
def hipsterfy(word)
  vowels = "aeiou"
  i = word.length - 1
  while i >= 0
    if vowels.include?(word[i])
      return word[0...i] + word[i+1..-1]
    end
    i -= 1
  end
  word
end
try this regex magic:
def hipsterfy(word)
word.gsub(/[aeiou](?=[^aeiou]*$)/, "")
end
how does it work?
[aeiou] matches a vowel, and (?=[^aeiou]*$) adds the constraint "only if no other vowel appears in the rest of the string". So the regex finds the last vowel, and gsub replaces that match with "".
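Since at most one vowel can satisfy the lookahead, sub works just as well as gsub. A sketch of that variant follows, using \z instead of $ so that the check applies to the end of the whole string rather than the end of a line. The method name drop_last_vowel and the sample words are invented for illustration.

```ruby
# Hypothetical variant of hipsterfy: removes the last vowel, if any.
# The lookahead requires that only non-vowels follow, up to end of string.
def drop_last_vowel(word)
  word.sub(/[aeiou](?=[^aeiou]*\z)/, "")
end

%w[tonic panther swipe rhythm].each do |w|
  puts "#{w} -> #{drop_last_vowel(w)}"
end
```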
You could use rindex to find the last vowel's index and []= to remove the corresponding character:
def hipsterfy(word)
  idx = word.rindex(/[aeiou]/)
  word[idx] = '' if idx
  word
end
The if idx is needed because rindex returns nil if no vowel is found. Note that []= modifies word.
There's also rpartition, which splits the string at the given pattern, returning an array containing the part before, the match and the part after. By concatenating the former and the latter, you can effectively remove the middle part (i.e. the vowel):
def hipsterfy(word)
  before, _, after = word.rpartition(/[aeiou]/)
  before.concat(after)
end
This variant returns a new string, leaving word unchanged.
Another common approach when dealing with some last occurrence is to reverse the string so you can deal with a first occurrence instead (which is usually simpler). Here, you can utilize sub:
def hipsterfy(word)
word.reverse.sub(/[aeiou]/, '').reverse
end
Here is another way to do it.
Reverse the characters of the string
Use find_index to get the first vowel location in this reversed string
Delete the character at this index
Un-reverse the characters and join them back together.
reverse_chars = str.chars.reverse
vowel_idx = reverse_chars.find_index { |char| char =~ /[aeiou]/ }
reverse_chars.delete_at(vowel_idx) if vowel_idx
result = reverse_chars.reverse.join
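A quick way to convince yourself that these approaches are interchangeable is to run them against the same inputs. The sketch below restates four of them as separate methods (the via_* names and the sample words are made up here) and prints the distinct results per word.

```ruby
# Each method is a non-mutating restatement of one approach shown above.
def via_rindex(word)
  copy = word.dup
  idx = copy.rindex(/[aeiou]/)
  copy[idx] = '' if idx
  copy
end

def via_rpartition(word)
  before, _, after = word.rpartition(/[aeiou]/)
  before + after
end

def via_reverse(word)
  word.reverse.sub(/[aeiou]/, '').reverse
end

def via_chars(word)
  chars = word.chars.reverse
  idx = chars.find_index { |c| c =~ /[aeiou]/ }
  chars.delete_at(idx) if idx
  chars.reverse.join
end

%w[tonic panther swipe rhythm].each do |w|
  results = [via_rindex(w), via_rpartition(w), via_reverse(w), via_chars(w)]
  puts "#{w} -> #{results.uniq.inspect}"   # one element means all agree
end
```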

Ruby regular expression for version numbers

I want to write a program which takes build number in the format of 23.0.23.345 (first two-digits then dot, then zero, then dot, then two-digits, dot, three-digits):
number=23.0.23.345
pattern = /(^[0-9]+\.{0}\.[0-9]+\.[0-9]$)/
numbers.each do |number|
if number.match pattern
puts "#{number} matches"
else
puts "#{number} does not match"
end
end
Output:
I am getting error:
floating literal anymore put zero before dot
I'd use something like this to find patterns that match:
number = 'foo 1.2.3.4 23.0.23.345 bar'
build_number = number[/
\d{2} # two digits
\.
0
\.
\d{2} # two more digits
\.
\d{3}
/x]
build_number # => "23.0.23.345"
This example is using String's [/regex/] method, which is a nice shorthand way to apply and return the result of a regex. It returns the first match only in the form I'm using. Read the documentation for more information and examples.
Your pattern won't work because it doesn't do what you think it does. Here's how I'd read it:
/( # group
^ # start of line
[0-9]+ # one or more digits
\.{0} # *NO* dots
\. # one dot
[0-9]+ # one or more digits
\. # one dot
[0-9] # one digit
$ # end of line
)/x
The problem is \.{0} which means you don't want any dots.
The x flag enables extended (free-spacing) mode, in which Ruby ignores whitespace and comments inside the pattern, making it easy to build a pattern that is documented. (Multiline mode is a different flag, m.)
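If the goal is to validate a whole string rather than find a build number inside a longer one, the same pattern can be anchored with \A and \z. This is a sketch under that assumption; the method name build_number? is hypothetical.

```ruby
# Hypothetical validator: true only if the ENTIRE string has the form
# two digits, ".0.", two digits, ".", three digits.
def build_number?(str)
  pattern = /
    \A        # start of string
    \d{2}     # two digits
    \. 0 \.   # literal ".0." (the spaces are ignored in extended mode)
    \d{2}     # two digits
    \.        # literal dot
    \d{3}     # three digits
    \z        # end of string
  /x
  pattern.match?(str)
end

["23.0.23.345", "23.1.23.345", "23.0.23.3456", "foo 23.0.23.345"].each do |s|
  puts "#{s}: #{build_number?(s)}"
end
```

String#match? requires Ruby 2.4 or later; on older versions, !!(str =~ pattern) gives the same boolean.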
Why reinvent the wheel? Use a gem like versionomy. You can parse the versions, compare them, check for equality, increment a particular part, etc. It even handles alpha, beta, patchlevels, etc.
require 'versionomy'
number='23.0.23.345'
v = Versionomy.parse number
v.major #=> 23
v.minor #=> 0
v.tiny #=> 23
v.tiny2 #=> 345
numbers = "23.0.23.345", "23.0.33.173", "0.0.0.0"
pattern = /\d{2}\.0\.\d{2}\.\d{3}/x
numbers.each do |number|
if number.match pattern
puts "#{number} matches"
else
puts "#{number} does not match"
end
end
The value assigned in the first line needs to be a collection of strings rather than a bare numeric literal. I also renamed number to numbers, since the .each call in your loop needs an array to iterate over.
There seems to be agreement on what regular expression you should use. If your ultimate goal is to extract the elements of the strings as integers, you could do this:
str = "I'm looking for 23.0.345.26, or was that 23.0.26.345?"
str.scan(/(\d{2})\.(0)\.(\d{2})\.(\d{3})/).flatten.map(&:to_i)
#=> [23, 0, 26, 345]

How to find all instances of #[XX:XXXX] in a string and then find the surrounding text?

Given a string like:
"#[19:Sara Mas] what's the latest with the TPS report? #[30:Larry Peters] can you help out here?"
I want to find a way to dynamically return the tagged user and the text that follows each tag. Results should be:
user_id: 19
copy: what's the latest with the TPS report?
user_id: 30
copy: can you help out here?
Any ideas on how this can be done with ruby/rails? Thanks
How is this regex for finding matches?
#\[\d+:\w+\s\w+\]
Split the string, then handle the content iteratively. I don't think it'd take more than:
tmp = string.split('#').reject(&:empty?).map { |str| [str[/\[(\d+)/, 1], str[/\]\s*(.*\S)/, 1]] }
tmp.first #=> ["19", "what's the latest with the TPS report?"]
Does that help?
result = subject.scan(/\[(\d+).*?\](.*?)(?=#|\Z)/m)
This grabs the id and the content in backreferences 1 and 2 respectively. The capture stops when either a # or the end of the string is reached.
"
\\[ # Match the character “[” literally
( # Match the regular expression below and capture its match into backreference number 1
\\d # Match a single digit 0..9
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
. # Match any single character that is not a line break character
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\\] # Match the character “]” literally
( # Match the regular expression below and capture its match into backreference number 2
. # Match any single character that is not a line break character
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
# Match either the regular expression below (attempting the next alternative only if this one fails)
\# # Match the character “\#” literally
| # Or match regular expression number 2 below (the entire group fails if this one fails to match)
\\Z # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
"
This will match everything from a # up to the next punctuation mark. Sorry if I didn't understand correctly.
result = subject.scan(/#.*?[.?!]/)
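To get from a scan result to the user_id / copy pairs asked for in the question, the captures can be mapped into hashes. The following is a sketch only: the method name tagged_segments is made up, and the regex is a variation on the ones above that assumes every tag has the form #[digits:name] and that the text belonging to a tag runs until the next #[ or the end of the string.

```ruby
# Hypothetical helper: returns one hash per tag with the numeric id
# and the text that follows it, whitespace-trimmed.
def tagged_segments(text)
  pattern = /
    \#\[ (\d+) :   # "#[" then the user id (capture 1) and a colon
    [^\]]* \]      # the name, up to the closing bracket
    \s* (.*?) \s*  # the following text (capture 2), lazily
    (?= \#\[ | \z) # stop before the next tag or at end of string
  /mx
  text.scan(pattern).map { |id, copy| { user_id: id.to_i, copy: copy } }
end

input = "#[19:Sara Mas] what's the latest with the TPS report? " \
        "#[30:Larry Peters] can you help out here?"
tagged_segments(input).each do |seg|
  puts "user_id: #{seg[:user_id]}"
  puts "copy: #{seg[:copy]}"
end
```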

Best way to count words in a string in Ruby?

Is there anything better than string.scan(/(\w|-)+/).size (the - is so, e.g., "one-way street" counts as 2 words instead of 3)?
string.split.size
Edited to explain multiple spaces
From the Ruby String Documentation page
split(pattern=$;, [limit]) → anArray
Divides str into substrings based on a delimiter, returning an array
of these substrings.
If pattern is a String, then its contents are used as the delimiter
when splitting str. If pattern is a single space, str is split on
whitespace, with leading whitespace and runs of contiguous whitespace
characters ignored.
If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters. If pattern contains groups, the respective
matches will be returned in the array as well.
If pattern is omitted, the value of $; is used. If $; is nil (which is
the default), str is split on whitespace as if ' ' were specified.
If the limit parameter is omitted, trailing null fields are
suppressed. If limit is a positive number, at most that number of
fields will be returned (if limit is 1, the entire string is returned
as the only entry in an array). If negative, there is no limit to the
number of fields returned, and trailing null fields are not
suppressed.
" now's the time".split #=> ["now's", "the", "time"]
While that is the current version of ruby as of this edit, I learned on 1.7 (IIRC), where that also worked. I just tested it on 1.8.3.
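The documented behaviours are easy to check in irb. Here is a short sketch; the word_count method name is invented for illustration, everything else is plain String#split.

```ruby
# Hypothetical helper: word count by whitespace splitting.
def word_count(str)
  str.split.size
end

p "  now's  the time ".split      # leading/repeated whitespace is ignored
p "a  b".split(/ /)               # a Regexp splits at every match, so an empty field appears
p "a b c d".split(' ', 2)         # positive limit: at most 2 fields
p "a,b,,".split(',')              # trailing empty fields are dropped
p "a,b,,".split(',', -1)          # negative limit keeps them
p word_count("one-way  street")   # hyphenated word stays in one piece
```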
I know this is an old question, but this might be useful to someone else looking for something more sophisticated than string.split. I wrote the words_counted gem to solve this particular problem, since defining words is pretty tricky.
The gem lets you define your own custom criteria, or use the out of the box regexp, which is pretty handy for most use cases. You can pre-filter words with a variety of options, including a string, lambda, array, or another regexp.
counter = WordsCounted::Counter.new("Hello, Renée! 123")
counter.word_count #=> 2
counter.words #=> ["Hello", "Renée"]
# filter the word "hello"
counter = WordsCounted::Counter.new("Hello, Renée!", reject: "Hello")
counter.word_count #=> 1
counter.words #=> ["Renée"]
# Count numbers only
counter = WordsCounted::Counter.new("Hello, Renée! 123", regexp: /[0-9]/)
counter.word_count #=> 1
counter.words #=> ["123"]
The gem provides a bunch more useful methods.
If the 'word' in this case can be described as an alphanumeric sequence which can include '-' then the following solution may be appropriate (assuming that everything that doesn't match the 'word' pattern is a separator):
>> 'one-way street'.split(/[^-a-zA-Z]/).size
=> 2
>> 'one-way street'.split(/[^-a-zA-Z]/).each { |m| puts m }
one-way
street
=> ["one-way", "street"]
However, there are some other symbols that can be included in the regex - for example, ' to support the words like "it's".
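As a sketch of that idea, scan can be used to match the words themselves instead of splitting on separators, which also sidesteps empty fields from repeated separators. The method name count_words is hypothetical, and the pattern assumes ASCII letters only, with hyphens and apostrophes allowed inside a word but not on their own.

```ruby
# Hypothetical helper: counts runs of ASCII letters, allowing a single
# hyphen or apostrophe between letter runs ("one-way", "it's").
def count_words(text)
  text.scan(/[a-zA-Z]+(?:['-][a-zA-Z]+)*/).size
end

p count_words("one-way street")
p count_words("it's a one-way  street")
p count_words("well - sort of")
```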
This is pretty simplistic but does the job if you are typing words with spaces in between. It ends up counting numbers as well but I'm sure you could edit the code to not count numbers.
puts "enter a sentence to find its word length: "
word = gets
word = word.chomp
splits = word.split(" ")
target = splits.length.to_s
puts "your sentence is " + target + " words long"
The best way to do this is to use the split method.
split divides a string into substrings based on a delimiter, returning an array of the substrings.
split takes two parameters: pattern and limit.
pattern is the delimiter on which the string is split.
limit specifies the maximum number of elements in the resulting array.
For more details, refer to Ruby Documentation: Ruby String documentation
str = "This is a string"
str.split(' ').size
#output: 4
The above code splits the string on whitespace, so the size of the resulting array is the number of words in the string.
The split(/[^-a-zA-Z]/) solution above breaks when separators are repeated. Consider a string with two consecutive spaces:
"one-way  street"
You will get
["one-way", "", "street"]
Use
'one-way street'.gsub(/[^-a-zA-Z]/, ' ').split.size
This splits words only on ASCII whitespace chars:
p " some word\nother\tword|word".strip.split(/\s+/).size #=> 4
