For example, I have a random string:
str = "26723462345"
I want to split it into two parts after the 6th character. How do I do this correctly?
Thank you!
This should do it
[str[0..5], str[6..-1]]
or
[str.slice(0..5), str.slice(6..-1)]
You should also check out the String documentation: http://corelib.rubyonrails.org/classes/String.html
Here’s one option. Be aware, however, that it will mutate your original string:
part1, part2 = str.slice!(0...6), str
p part1 # => "267234"
p part2 # => "62345"
p str # => "62345"
Update
In the years since I wrote this answer I’ve come to agree with the commenters complaining that it might be excessively clever. Below are a few other options that don’t mutate the original string.
Caveat: This one will only work with ASCII characters.
str.unpack("a6a*")
# => ["267234", "62345"]
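To illustrate the caveat: unpack("a…") counts bytes, not characters, so a string containing multibyte characters can be cut in the middle of a character. A small sketch, assuming a UTF-8 string (the sample text is made up for illustration):

```ruby
str = "héllo wörld"  # "é" and "ö" each occupy two bytes in UTF-8

by_bytes = str.unpack("a2a*")       # splits after 2 bytes
by_chars = [str[0, 2], str[2..-1]]  # splits after 2 characters

p by_bytes.first.bytesize  # 2 bytes: "h" plus the first half of "é"
p by_chars                 # => ["hé", "llo wörld"]
```

For non-ASCII input, plain indexing as in by_chars is the safer choice.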
The next one uses the magic variable $', which returns the part of the string after the most recent Regexp match:
part1, part2 = str[/.{6}/], $'
p [part1, part2]
# => ["267234", "62345"]
And this last one uses a lookbehind to split the string in the right place without returning any extra parts:
p str.split(/(?<=^.{6})/)
# => ["267234", "62345"]
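The best way IMO is string.scan(/.{6}/), provided the string length is a multiple of the chunk size (scan drops a shorter trailing remainder; /.{1,6}/ keeps it):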
irb(main)> str
=> "abcdefghijklmnopqrstuvwxyz"
irb(main)> str.scan(/.{13}/)
=> ["abcdefghijklm", "nopqrstuvwxyz"]
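One thing to watch with scan: a pattern like /.{6}/ only returns complete chunks. A quick sketch using the string from the question, which has 11 characters:

```ruby
str = "26723462345"  # 11 characters

full_chunks = str.scan(/.{6}/)    # only complete 6-character chunks
all_chunks  = str.scan(/.{1,6}/)  # also keeps the shorter final chunk

p full_chunks  # => ["267234"]
p all_chunks   # => ["267234", "62345"]
```

Note that for longer strings scan keeps cutting, so it yields more than two parts.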
_, part1, part2 = str.partition /.{6}/
https://ruby-doc.org/core-1.9.3/String.html#method-i-partition
As a fun answer, how about:
str.split(/(^.{1,6})/)[1..-1]
This works because split returns the capture group matches, in addition to the parts of the string before and after the regular expression.
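The effect of the capture group can be seen by running split with and without it. A minimal sketch on the string from the question:

```ruby
str = "26723462345"

without_capture = str.split(/^.{1,6}/)    # the match acts as a delimiter and is discarded
with_capture    = str.split(/(^.{1,6})/)  # the captured match is kept in the result

p without_capture      # => ["", "62345"]
p with_capture         # => ["", "267234", "62345"]
p with_capture[1..-1]  # => ["267234", "62345"]
```

The leading empty string is the (empty) text before the match, which is why [1..-1] is needed.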
Here's a reusable version for you:
str = "26723462345"
n = str.length
boundary = 6
head = str.slice(0, boundary) # => "267234"
tail = str.slice(boundary, n) # => "62345"
It also preserves the original string, which may come in handy later in the program.
I have a string that holds a word, and I must change its ending to "sd" if the ending is "jk".
For example, given the word "lazerjk", I need to get "lazersd".
I tried to use .gsub!, but it doesn't work correctly if the word contains more than one occurrence of the substring "jk".
String#rindex returns the index of the last occurrence of the given substring.
String#[]= can take two integer arguments: the first is the index at which to start replacing, and the second is the length of the part to replace.
You can use them this way:
replaced = "foo"
replacing = "booo"
string = "foo bar foo baz"
string[string.rindex(replaced), replaced.size] = replacing
string
# => "foo bar booo baz"
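One caveat with this approach: rindex returns nil when the substring is absent, and passing nil as an index raises an error. A guarded sketch follows; the helper name replace_last is hypothetical, and it works on a copy so the original string is left alone:

```ruby
# Hypothetical helper: returns a copy of +string+ with the last occurrence
# of +old+ replaced by +new+, or an unchanged copy when +old+ is absent.
def replace_last(string, old, new)
  result = string.dup
  index = result.rindex(old)  # nil when +old+ does not occur
  result[index, old.size] = new if index
  result
end

p replace_last("foo bar foo baz", "foo", "booo")  # => "foo bar booo baz"
p replace_last("lazerjk", "xy", "sd")             # => "lazerjk"
```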
"jughjkjkjk\njk".sub(/jk$\z/, 'sd')
=> "jughjkjkjk\nsd"
The $ is probably unnecessary; /jk\z/ alone should be sufficient.
It sounds like you're looking to replace a specific suffix only. If so, I would probably suggest using sub along with an anchored regex (to check for the desired characters only at the end of the string):
string_1 = "lazerjk"
string_2 = "lazerjk\njk"
string_3 = "lazerjkr"
string_1.sub(/jk\z/, "sd")
#=> "lazersd"
string_2.sub(/jk\z/, "sd")
#=> "lazerjk\nsd"
string_3.sub(/jk\z/, "sd")
#=> "lazerjkr"
Or, you could do without a regex at all by using the reverse! method along with a simple conditional statement to sub! only when the suffix is present:
string = "lazerjk"
old_suffix = "jk"
new_suffix = "sd"
string.reverse!.sub!(old_suffix.reverse, new_suffix.reverse).reverse! if string.end_with?(old_suffix)
string
#=> "lazersd"
OR, you could even use a completely different approach. Here's an example using chomp to remove the unwanted suffix and then ljust to pad the desired suffix to the modified string.
string = "lazerjk"
string.chomp("jk").ljust(string.length, "sd")
#=> "lazersd"
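Note that the new suffix only gets added if the length of the string was modified with the initial chomp. Otherwise, the string remains unchanged. Also note that ljust pads only up to the original length, so this works as written only when the old and new suffixes have the same length.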
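Because ljust pads only up to the original length, the chomp/ljust trick depends on both suffixes being equally long. A sketch of a variant without that restriction (the helper name replace_suffix is hypothetical, not from the answer above):

```ruby
# Hypothetical helper: swaps the suffix only when it is actually present.
def replace_suffix(string, old_suffix, new_suffix)
  return string unless string.end_with?(old_suffix)
  string[0, string.length - old_suffix.length] + new_suffix
end

p "lazerjk".chomp("jk").ljust("lazerjk".length, "xyz")  # => "lazerxy" (padding is cut off at the old length)
p replace_suffix("lazerjk", "jk", "xyz")                # => "lazerxyz"
p replace_suffix("lazerjkr", "jk", "xyz")               # => "lazerjkr"
```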
If the goal is to substitute the LAST OCCURRENCE (as opposed to suffix only), then this could be accomplished by using sub along with reverse:
string = "jklazerjkm"
old_substring = "jk"
new_substring = "sd"
string.reverse.sub(old_substring.reverse, new_substring.reverse).reverse
#=> "jklazersdm"
Replacing "jk" at the end of a string with something else is straightforward and can be addressed without concern for other instances of "jk" that may be in the string, so I assume that is not what is being asked. Rather, I assume the problem is to replace the last instance of "jk" in a string with "sd".
Here are two solutions that make use of String#sub with a regular expression.
Use a negative lookahead
The idea here is to match "jk" provided it is not followed later in the string by another instance of "jk".
"lajkz\nejkrjklm".sub(/jk(?!.*jk)/m, "sd")
#=> "lajkz\nejkrsdlm"
Capture the part of the string that precedes the last "jk"
The match, if there is one, consists of the front of the string followed by the last "jk", which is replaced by the captured string followed by "sd".
"lajkz\nejkrjklm".sub(/\A(.*)jk/m) { $1 + "sd" }
#=> "lajkz\nejkrsdlm"
The two regular expressions can be written in free-spacing mode to make them self-documenting. The first is the following.
/
jk # match literal
(?! # begin a negative lookahead
.* # match zero or more characters other than line terminators
jk # match literal
) # end negative lookahead
/mx # invoke multiline and free-spacing regex definition modes.
Multiline mode causes . to match any character, including a line terminator.
The second regular expression can be written as follows.
/
\A # match the beginning of the string
(.*) # match zero or more characters other than line terminators
# and save the match to capture group 1
jk # match literal
/mx # invoke multiline and free-spacing regex definition modes.
Note that in both expressions .* is greedy, meaning that it will match as many characters as possible, including "jk" so long as other requirements of the expression are met, here that the last instance of "jk" in the string is matched.
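To see why multiline mode matters here, compare the lookahead version with and without the m flag on a string whose last "jk" sits on a second line. A small sketch with a made-up input:

```ruby
str = "ajk\nbjk"

with_m    = str.sub(/jk(?!.*jk)/m, "sd")  # . also matches "\n", so the lookahead sees the second line
without_m = str.sub(/jk(?!.*jk)/, "sd")   # . stops at "\n", so the first "jk" looks like the last one

p with_m     # => "ajk\nbsd"
p without_m  # => "asd\nbjk"
```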
Here is a different solution:
str = "jughjkjkjk\njk"
pattern = "jk"
replace_with = "sd"
str = str.reverse.sub(pattern.reverse, replace_with.reverse).reverse
I am building a Rails 5.2 app.
In this app I receive output from different suppliers (I am building a webshop).
The name of the shipping provider is in this format:
dhl_freight__233433
It could also be in this format:
postal__US-320202
How can I remove everything before (and including) the __ so that only the part after the __ remains, for example 233433?
Perhaps some sort of RegEx.
A very simple approach would be to use String#split and then pick the second part that is the last part in this example:
"dhl_freight__233433".split('__').last
#=> "233433"
"postal__US-320202".split('__').last
#=> "US-320202"
You can use a very simple Regexp and ask the resulting MatchData for the post_match part:
p "dhl_freight__233433".match(/__/).post_match
# another (magic) way to access the post_match part:
p $'
Postscript: I learned something from this question myself: you don't even need a Regexp for this to work. "asddfg__qwer".match("__").post_match does the trick (it converts the string to a Regexp for you).
r = /[^_]+\z/
"dhl_freight__233433"[r] #=> "233433"
"postal__US-320202"[r] #=> "US-320202"
The regular expression matches one or more characters other than an underscore, followed by the end of the string (\z). The ^ at the beginning of the character class reads, "other than any of the characters that follow".
See String#[].
This assumes that the last underscore is preceded by an underscore. If the last underscore is not preceded by an underscore, in which case there should be no match, add a positive lookbehind:
r = /(?<=__)[^_]+\z/
This requires the match to be preceded by two underscores.
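The difference between the two patterns shows up on input that has only a single underscore. A sketch, where the second sample string is made up and the names loose and strict are only for illustration:

```ruby
loose  = /[^_]+\z/         # anything after the last underscore
strict = /(?<=__)[^_]+\z/  # only when a double underscore comes directly before

p "dhl_freight__233433"[loose]   # => "233433"
p "dhl_freight__233433"[strict]  # => "233433"
p "postal_US-320202"[loose]      # => "US-320202"
p "postal_US-320202"[strict]     # => nil
```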
There are many ways in Ruby to extract numbers from a string; I assume that is what you are trying to do. Here are some of them.
Ref- http://www.ruby-forum.com/topic/125709
line.delete("^0-9")
line.scan(/\d/).join('')
line.tr("^0-9", '')
Of these, delete is the fastest way to pull the digits out of a string.
All of the above extract the digits from the string and join them. For a string like "String-with-67829___numbers-09764" the output would be "6782909764".
If you want the numbers kept separate, like ["67829", "09764"], use:
line.split(/[^\d]/).reject { |c| c.empty? }
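As an alternative to split plus reject (not taken from the answer above), scan with /\d+/ collects each run of digits directly. A short sketch on the same sample string:

```ruby
line = "String-with-67829___numbers-09764"

joined = line.delete("^0-9")  # all digits, joined together
groups = line.scan(/\d+/)     # each run of digits as its own element

p joined  # => "6782909764"
p groups  # => ["67829", "09764"]
```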
Hope these answers help you! Happy coding :-)
I am building a tweet-like system that includes #mentions and #hashtags. Right now, I need to take a tweet that will come to the server like this:
hi [#Bob D](member:Bob D) whats the deal with [#red](tag:red)
and save it in the database as:
hi #Bob D whats the deal with #red
I have the flow of what the code looks like in my mind but can't get it to work. Basically, I need to do the following:
Scan the string for any [#...] (an array-like structure that begins with a #)
Delete the parentheses after the array-like structure (so for [#Bob D](member:Bob D), remove everything in parentheses)
Remove the brackets surrounding a substring that begins with # (meaning, delete the [] from [#...])
I will also need to do the same for #. I'm almost certain this can be done using regular expressions and the slice! method, but I'm really having trouble coming up with the regular expressions needed and the control flow.
I think it would be something like this:
a = "hi [#Bob D](member:Bob D) whats the deal with [#red](tag:red)"
substring = a.scan <regular expression here>
substring.each do |matching_substring| # the loop should get rid of the parentheses but not the brackets
a.slice! matching_substring
end
#Something here should get rid of brackets
The problem with the code above is that I can't figure out the regex and it doesn't get rid of the brackets.
This regex should work for this
/(\[(#.*?)\]\((.*?)\))/
You can test it on Rubular.
The ? after the * makes the match non-greedy, so each [#...](...) pair is captured separately instead of being swallowed by one long match.
The code would look something like this:
a = "hi [#Bob D](member:Bob D) whats the deal with [#red](tag:red)"
substring = a.scan(/(\[(#.*?)\]\((.*?)\))/)
substring.each do |matching_substring|
  a.gsub!(matching_substring[0], matching_substring[1]) # replaces [#Bob D](member:Bob D) with #Bob D, in place
  matching_substring[1] # the part in the brackets sans brackets
  matching_substring[2] # the part in the parentheses sans parentheses
end
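The scan-then-replace loop can also be collapsed into a single gsub that refers back to the capture group. A sketch of that variant, using the same non-greedy pattern but capturing only the bracketed part:

```ruby
a = "hi [#Bob D](member:Bob D) whats the deal with [#red](tag:red)"

# \[(#.*?)\]  captures the bracketed text that starts with "#"
# \(.*?\)     matches the parenthesized part, which is dropped
cleaned = a.gsub(/\[(#.*?)\]\(.*?\)/, '\1')

p cleaned  # => "hi #Bob D whats the deal with #red"
```

Since gsub returns a new string, the original a is left unchanged.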
Consider this:
str = "hi [#Bob D](member:Bob D) whats the deal with [#red](tag:red)"
BRACKET_RE_STR = '\[
(
[##]
[^\]]+
)
\]'
PARAGRAPH_RE_STR = '\(
[^)]+
\)'
BRACKET_RE = /#{BRACKET_RE_STR}/x
PARAGRAPH_RE = /#{PARAGRAPH_RE_STR}/x
BRACKET_AND_PARAGRAPH_RE = /#{BRACKET_RE_STR}#{PARAGRAPH_RE_STR}/x
str.gsub(BRACKET_AND_PARAGRAPH_RE) { |s| s.sub(PARAGRAPH_RE, '').sub(BRACKET_RE, '\1') }
# => "hi #Bob D whats the deal with #red"
The longer, or more complex the pattern, the harder it is to maintain or update, so keep them as small as possible. Build complex patterns from simple ones so it's easier to debug and extend.
I have a string of the form "award.x_initial_value.currency" and I would like to camelize everything except the leading "x_" so that I get a result of the form: "award.x_initialValue.currency".
My current implementation is:
a = "award.x_initial_value.currency".split(".")
b = a.map{|s| s.slice!("x_")}
a.map!{|s| s.camelize(:lower)}
a.zip(b).map!{|x, y| x.prepend(y.to_s)}
I am not very happy with it since it's neither fast nor elegant and performance is key since this will be applied to large amounts of data.
I also googled it but couldn't find anything.
Is there a faster/better way of achieving this?
Since "performance is key" you could skip the overhead of ActiveSupport::Inflector and use a regular expression to perform the "camelization" yourself:
a = "award.x_initial_value.currency"
a.gsub(/(?<!\bx)_(\w)/) { $1.capitalize }
#=> "award.x_initialValue.currency"
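To check how the lookbehind behaves on a few more shapes of input, the gsub can be wrapped in a small method. This is only a sketch: the method name and the extra sample strings are hypothetical, and it needs no ActiveSupport:

```ruby
# Hypothetical wrapper around the gsub shown above.
def camelize_keeping_x_prefix(path)
  # Uppercase the character after each underscore, except an underscore
  # that directly follows a standalone leading "x" (the \b rules out
  # words that merely end in x, such as "max").
  path.gsub(/(?<!\bx)_(\w)/) { $1.capitalize }
end

p camelize_keeping_x_prefix("award.x_initial_value.currency")         # => "award.x_initialValue.currency"
p camelize_keeping_x_prefix("award.x_initial_value.x_currency_code")  # => "award.x_initialValue.x_currencyCode"
p camelize_keeping_x_prefix("max_value")                              # => "maxValue"
```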
▶ "award.x_initial_value.x_currency".split('.').map do |s|
"#{s[/\Ax_/]}#{s[/(\Ax_)?(.*)\z/, 2].camelize(:lower)}"
end.join('.')
#⇒ "award.x_initialValue.x_currency"
or, with one gsub iteration:
▶ "award.x_initial_value.x_currency".gsub(/(?<=\.|\A)(x_)?(.*?)(?=\.|\z)/) do |m|
"#{$~[1]}" << $~[2].camelize(:lower)
end
#⇒ "award.x_initialValue.x_currency"
In the latter version we use global substitution:
$~ is shorthand for a global that stores the last regexp match that occurred;
$~[1] is the first capture group, corresponding to (x_)?; because of the ? it is either the matched string or nil. That is why we use string interpolation: in the nil case, "#{nil}" yields an empty string;
finally, we append the camelized second capture group to that string.
NB: Instead of $~ for the last match, one might use Regexp::last_match
Could you try something like this:
'award.x_initial_value.currency'.gsub(/(\.|\A)x_/,'\1#').camelize(:lower).gsub('#','x_')
# => award.x_initialValue.currency
NOTE: instead of the # character, you can use any character that does not otherwise occur in your names.
I'm building a library that cleans up user generated content and have thousands of string replacements to make (performance is key).
What's the fastest way to do search and replacements in strings?
Here's an example of the replacements the library will make:
u2 => you too
2day => today
2moro => tomorrow
2morrow => tomorrow
2tomorow => tomorrow
There are four cases on how the string can appear:
Starting word in the string (has a space at the end, but not in front of it) 2day sample
Middle of the string (has a space in front and at the end of it) sample 2day sample
End of the string (only has a space in front, but is the last word) sample 2day
The entire string is a match 2day
i.e. The regex shouldn't replace it if it's in the middle of a word like sample2daysample
A possible solution:
replaces = {'u2' => 'you too', '2day' => 'today', '2moro' => 'tomorrow'}
str = '2day and 2moro are u2 sample2daysample'
# exp = Regexp.union(replaces.keys) # would be simplest, but adding \b around each key needs a slightly different approach
exp = Regexp.new(replaces.keys.map { |x| "\\b" + Regexp.escape(x) + "\\b" }.join('|'))
str = str.gsub(exp, replaces)
# => "today and tomorrow are you too sample2daysample"
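Regexp.union can still be used if its source is wrapped in a single \b(?:...)\b group. A sketch of that variant, assuming (as in the sample data) that every key begins and ends with a word character so that \b applies:

```ruby
replaces = { 'u2' => 'you too', '2day' => 'today', '2moro' => 'tomorrow' }

# Regexp.union escapes each key; its source is wrapped in one \b(?:...)\b group.
# Build the pattern once and reuse it for every string.
exp = /\b(?:#{Regexp.union(replaces.keys).source})\b/

str = '2day and 2moro are u2 sample2daysample'
result = str.gsub(exp, replaces)

p result  # => "today and tomorrow are you too sample2daysample"
```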
Full Disclosure: I am the author of this gem
If you don't need regex you can try https://github.com/jedld/multi_string_replace this uses the aho-corasick algorithm to achieve this.
                         user     system      total        real
multi gsub           1.322510   0.000000   1.322510 (  1.344405)
MultiStringReplace   0.196823   0.007979   0.204802 (  0.207219)
mreplace             0.200593   0.004031   0.204624 (  0.205379)
The only issue I see is that the algorithm does not understand word boundaries so you have to decompose your use case to:
"2day ", " 2day ", " 2day"