Apply modification only to substring in Ruby - ruby-on-rails

I have a string of the form "award.x_initial_value.currency" and I would like to camelize everything except the leading "x_" so that I get a result of the form: "award.x_initialValue.currency".
My current implementation is:
a = "award.x_initial_value.currency".split(".")
b = a.map{|s| s.slice!("x_")}
a.map!{|s| s.camelize(:lower)}
a.zip(b).map!{|x, y| x.prepend(y.to_s)}
I am not very happy with it: it is neither fast nor elegant, and performance is key because this will be applied to large amounts of data.
I also googled it but couldn't find anything.
Is there a faster/better way of achieving this?

Since "performance is key" you could skip the overhead of ActiveSupport::Inflector and use a regular expression to perform the "camelization" yourself:
a = "award.x_initial_value.currency"
a.gsub(/(?<!\bx)_(\w)/) { $1.capitalize }
#=> "award.x_initialValue.currency"
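A minimal sketch that wraps the regex above in a method so it can be applied to many strings. The method name camelize_except_x_prefix is made up for illustration, and only plain Ruby is assumed (no ActiveSupport):

```ruby
# Hypothetical helper (the name is not from the original answer).
# Drops each underscore and upper-cases the character after it,
# except when the underscore directly follows an "x" that starts a word.
def camelize_except_x_prefix(str)
  str.gsub(/(?<!\bx)_(\w)/) { Regexp.last_match(1).upcase }
end

puts camelize_except_x_prefix("award.x_initial_value.currency")
puts camelize_except_x_prefix("max_value.x_total_count")
```

The second call shows that an "x" in the middle of a word (as in "max_value") is not treated as a prefix, because the lookbehind requires a word boundary before the "x".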

▶ "award.x_initial_value.x_currency".split('.').map do |s|
"#{s[/\Ax_/]}#{s[/(\Ax_)?(.*)\z/, 2].camelize(:lower)}"
end.join('.')
#⇒ "award.x_initialValue.x_currency"
or, with one gsub iteration:
▶ "award.x_initial_value.x_currency".gsub(/(?<=\.|\A)(x_)?(.*?)(?=\.|\z)/) do |m|
"#{$~[1]}" << $~[2].camelize(:lower)
end
#⇒ "award.x_initialValue.x_currency"
In the latter version we use global substitution:
$~ is shorthand for a special global variable that stores the last regexp match;
$~[1] is the first capture group, corresponding to (x_)?; because of the ? it is either the matched string or nil, which is why we use string interpolation: "#{nil}" results in an empty string;
finally, we append the camelized second capture group to that string.
NB: instead of $~ for the last match, one might use Regexp.last_match.
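The points above can be sketched in plain Ruby as follows. The helper lower_camelize is only a stand-in for ActiveSupport's camelize(:lower), and camelize_segments is a hypothetical name; both are assumptions made so the example runs without Rails:

```ruby
# Stand-in for ActiveSupport's camelize(:lower); an assumption for this sketch.
def lower_camelize(segment)
  segment.gsub(/_(\w)/) { Regexp.last_match(1).upcase }
end

# Hypothetical wrapper around the single-gsub version.
def camelize_segments(path)
  path.gsub(/(?<=\.|\A)(x_)?(.*?)(?=\.|\z)/) do
    m = Regexp.last_match            # same object as $~
    # m[1] is "x_" or nil; interpolating nil yields an empty string
    "#{m[1]}#{lower_camelize(m[2])}"
  end
end

puts camelize_segments("award.x_initial_value.x_currency")
puts "#{nil}".empty?
```

Capturing the match object in a local variable before calling another method keeps the block readable and avoids relying on $~ further down.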

Could you try something like this:
'award.x_initial_value.currency'.gsub(/(\.|\A)x_/,'\1#').camelize(:lower).gsub('#','x_')
# => award.x_initialValue.currency
NOTE: instead of #, you can use any character that cannot occur in the names.

Related

Changing text based on the final letter of user name using regular expression

I want to change the ending of a user's name depending on the use case (in the language the system will operate in, name endings depend on how the name is used).
So I need to define all the name endings and a replacement for each of them.
It was suggested (in "Changing text based on the final letter of user name") that I use .gsub with a regular expression to search and replace in a string:
"name surname".gsub(/e\b/, 'ai')
This replaces e with ai, so "name surname" becomes "namai surnamai".
How can it be used for several options, such as "e = ai, us = mi, i = as", on the same record?
thanks
You can use String#gsub with block. Docs say:
In the block form, the current match string is passed in as a parameter, and variables such as $1, $2, $`, $&, and $' will be set appropriately. The value returned by the block will be substituted for the match on each call.
So you can use a regex with concatenation of all substrings to be replaced and then replace it in the block, e.g. using a hash that maps matches to replacements.
Full example:
replacements = {'e'=>'ai', 'us'=>'mi', 'i' => 'as'}
['surname', 'surnamus', 'surnami'].map do |s|
s.gsub(/(e|us|i)$/){|p| replacements[p] }
end
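As a rough sketch of how the block form could be applied to a full name rather than a single word, the pattern below uses \b instead of $ so that the ending of every word is replaced. The method name decline is hypothetical, and the sample names are only test data:

```ruby
replacements = { 'e' => 'ai', 'us' => 'mi', 'i' => 'as' }

# Hypothetical helper: replaces a known ending of each word in full_name.
def decline(full_name, replacements)
  # the block receives the matched ending and returns its replacement
  full_name.gsub(/(e|us|i)\b/) { |ending| replacements[ending] }
end

puts decline("name surnamus", replacements)
puts decline("surnami", replacements)
```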
@Sundeep makes an important observation in a comment on the question. If, for example, the substitutions were given by the following hash:
g = {'e'=>'ai', 's'=>'es', 'us'=>'mi', 'i' => 'as'}
#=> {"e"=>"ai", "s"=>"es", "us"=>"mi", "i"=>"as"}
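At first glance 'surnamus' might seem at risk of being converted (incorrectly) to 'surnamues' merely because 's'=>'es' precedes 'us'=>'mi' in g. With a pattern anchored at the end of the string, as used below, that does not actually happen, because the regex engine tries the leftmost starting position first and therefore reaches 'us' before the final 's'. Alternation order does matter in unanchored patterns when one key is a prefix of another, so it may still be prudent to put the longest keys first, particularly because it is so simple to do so: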
h = g.sort_by { |k,_| -k.size }.to_h
#=> {"us"=>"mi", "e"=>"ai", "s"=>"es", "i"=>"as"}
arr = ['surname', 'surnamus', 'surnami', 'surnamo']
The substitutions can be done using the form of String#sub that employs a hash as its second argument.
r = /#{Regexp.union(h.keys)}\z/
#=> /(?-mix:us|e|s|i)\z/
arr.map { |s| s.sub(r,h) }
#=> ["surnamai", "surnammi", "surnamas", "surnamo"]
See also Regexp::union.
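For reuse, the hash form of String#sub can be wrapped in a small method. This is only a sketch: the name replace_ending is an assumption, and for large inputs you would probably build the regex once rather than on every call:

```ruby
# Hypothetical helper: replaces the ending of name using endings,
# a hash that maps each ending to its replacement.
def replace_ending(name, endings)
  longest_first = endings.keys.sort_by { |k| -k.size }
  # sub with a hash looks the matched text up in the hash
  name.sub(/#{Regexp.union(longest_first)}\z/, endings)
end

g = { 'e' => 'ai', 's' => 'es', 'us' => 'mi', 'i' => 'as' }
p %w[surname surnamus surnami surnamo].map { |s| replace_ending(s, g) }
```

A name with no known ending, such as 'surnamo', is returned unchanged because the pattern does not match.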
Incidentally, though key-insertion order has been guaranteed for hashes since Ruby v1.9, there is a continuing debate as to whether that property should be made use of in Ruby code, mainly because there was no concept of key order when hashes were first used in computer programs. This answer provides a good example of the benefit of exploiting key order.

Remove quotes from string built from an array

I have user-controlled input like so (the length and number of items may change):
str = "['honda', 'toyota', 'lexus']"
I would like to convert this into an array, but I'm struggling to find the best way to do so. eval() does exactly what I need, but it is not very elegant and is dangerous in this case, since it's user-controlled input.
Another way is:
str[1..-2].split(',').collect { |car| car.strip.tr("'", '') }
=> ["honda", "toyota", "lexus"]
But this is also not very elegant. Any suggestions that are more 'Rubyish'?
You could use a regular expression:
# match (in a non-greedy way) characters up to a comma or `]`
# capture each word as a group, and don't capture `,` or `]`
str.scan(/'(.+?)'(?:,|\])/).flatten
Or JSON.parse (but accounting for the fact that single quotes are in fact technically not allowed in JSON):
JSON.parse( str.tr("'", '"') )
JSON.parse probably has a small edge over the regexp in terms of performance, but if you're expecting your users to do single quote escaping, then that tr is going to mess things up. In this case, I'd stick with the regexp.
The JSON.parse looks more correct, but here is another alternative:
str.split(/[[:punct:] ]+/).drop(1)
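A small sketch comparing the scan and JSON.parse approaches on the sample input. The method parse_car_list is a hypothetical wrapper, and returning nil for malformed input is an assumption about how you might want to handle bad data:

```ruby
require 'json'

str = "['honda', 'toyota', 'lexus']"

# Regex approach: capture whatever sits between single quotes.
via_scan = str.scan(/'(.+?)'(?:,|\])/).flatten

# Hypothetical wrapper around the JSON approach; nothing is evaluated,
# and malformed input yields nil instead of raising.
def parse_car_list(raw)
  JSON.parse(raw.tr("'", '"'))
rescue JSON::ParserError
  nil
end

p via_scan
p parse_car_list(str)
p parse_car_list("['honda', ")
```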

Ruby on Rails: Checking for valid regex does not work properly, high false rate

In my application I've got a procedure which should check if an input is valid or not. You can set up a regex for this input.
But in my case it returns false instead of true. And I can't find the problem.
My code looks like this:
gaps.each_index do |i|
  if gaps[i].first.starts_with? "~"
    # regular expression
    begin
      regex = gaps[i].first[1..-1]
      # a pipe is used to separate the regex from the solution-string
      if regex.include? "|"
        puts "REGEX FOUND ------------------------------------------"
        regex = regex.split("|")[0..-2].join("|")
      end
      reg = Regexp.new(regex, true)
      unless reg.match(data[i])
        puts "REGEX WRONGGGG -------------------"
        @wrong_indexes << i
      end
    rescue
    end
  else
    # normal string
    if data[i].nil? || data[i].strip != gaps[i].first.strip
      @wrong_indexes << i
    end
  end
end
An example would be:
[[~berlin|berlin]]
The left one before the pipe is the regex and the right one next to the pipe is the correct solution.
This easy input should return true, but it doesn't.
Does anyone see the problem?
Thank you all
EDIT
Somewhere in this lines must be the problem:
if regex.include? "|"
  puts "REGEX FOUND ------------------------------------------"
  regex = regex.split("|")[0..-2].join("|")
end
reg = Regexp.new(regex, true)
unless reg.match(data[i])
Update: Result without ~
The whole point is that you are initializing the regex with the Regexp constructor, whose documentation says:
Constructs a new regular expression from pattern, which can be either a String or a Regexp (in which case that regexp’s options are propagated, and new options may not be specified (a change as of Ruby 1.8)).
However, when you pass the regex (obtained with regex.split("|")[0..-2].join("|")) to the constructor, it is a string, and reg = Regexp.new(regex, true) is getting ~berlin (or /berlin/i) as a literal string pattern. Thus, it actually is searching for something you do not expect.
See, regex= "[[/berlin/i|berlin]]" only finds a literal /berlin/i text (see demo).
Also, you need to get the pattern from the [[...]], so strip these brackets with regex = regex.gsub(/\A\[+|\]+\z/, '').split("|")[0..-2].join("|").
Note that you do not need to specify the case-insensitive option in the pattern itself: since you already pass true as the second argument to Regexp.new, the regex is already case-insensitive.
If you are performing whole word lookup, add word boundaries: regex= "[[\\bberlin\\b|berlin]]" (see demo).
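Putting these steps together, a minimal sketch might look like the following. The helper name gap_regex is hypothetical, and the "[[~regex|solution]]" layout is taken from the example in the question:

```ruby
# Hypothetical helper: turns a gap such as "[[~berlin|berlin]]" into a Regexp.
# Returns nil when the gap does not start with "~" (a plain-string gap).
def gap_regex(gap)
  inner = gap.gsub(/\A\[+|\]+\z/, '')   # strip the surrounding brackets
  return nil unless inner.start_with?('~')

  source = inner[1..-1]
  # everything before the last pipe is the pattern, the rest is the solution
  source = source.split('|')[0..-2].join('|') if source.include?('|')
  Regexp.new(source, Regexp::IGNORECASE)
end

reg = gap_regex('[[~berlin|berlin]]')
puts reg.source
puts reg.match?('Berlin')

# A string such as "/berlin/i" is taken literally, slashes included.
puts Regexp.new('/berlin/i').match?('berlin')
```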

Rails: Given a String, check if an Array (of strings) contains a substring of String

Is there a more Railsy way to do this (without explicit regex, perhaps?):
array_o_strings = ["some strings", "I'd like", "to parse"]
string = "like to parse"
re = Regexp.union(array_o_strings.map { |i| Regexp.new(i) })
string =~ re
Just pining for magical Rails methods.
There's really nothing wrong with using a regular expression here if that's your intent. It's generally more efficient to use one of those than to go through the trouble of comparing arrays.
It's worth noting you don't have to do that much work to get this:
re = Regexp.union(array)
That should handle automatically escaping those strings and compiling them into a singular regular expression. Test with strings containing * and ? to be sure.
One note to add on style is that the =~ operator is a hold-over from Perl. It's preferable to use string.match(re) to make it clear what's going on there.
How big is the array? It may be worth comparing the speed using a regex vs checking each element. If the array is sorted shortest to longest that would help when checking one by one as you're more likely to find a match first.
In any event, this is one way:
array_o_strings.any?{|e| string.index(e) }
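A short sketch of both suggestions side by side, using the data from the question. The "tricky" strings are made-up test values that show Regexp.union escaping metacharacters when it is given plain strings:

```ruby
array_o_strings = ["some strings", "I'd like", "to parse"]
string = "like to parse"

# Regexp.union accepts the array of strings directly and escapes them.
re = Regexp.union(array_o_strings)
puts string.match?(re)

# The same check without a regex.
puts array_o_strings.any? { |e| string.include?(e) }

# Metacharacters are matched literally, not as quantifiers.
tricky = Regexp.union(["a*b", "c?"])
puts tricky.match?("xa*by")
puts tricky.match?("aab")
```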

Break strings into substrings based on delimiters, with empty substrings

I am using Lua to create a table within a table, and am running into an issue. I also need to populate the nil values that appear, but cannot seem to get it right.
String being manipulated:
PatID = '07-26-27~L73F11341687Per^^^SCI^SP~N7N558300000Acc^'
for word in PatID:gmatch("[^\~w]+") do table.insert(PatIDTable,word) end
local _, PatIDCount = string.gsub(PatID,"~","")
PatIDTableB = {}
for i=1, PatIDCount+1 do
PatIDTableB[i] = {}
end
for j=1, #PatIDTable do
for word in PatIDTable[j]:gmatch("[^\^]+") do
table.insert(PatIDTableB[j], word)
end
end
This currently produces this output:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]='SCI'
[3]='SP'
[3]=table
[1]='N7N558300000Acc'
But I need it to produce:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]=''
[3]=''
[4]='SCI'
[5]='SP'
[3]=table
[1]='N7N558300000Acc'
[2]=''
EDIT:
I think I may have done a bad job explaining what it is I am looking for. It is not necessarily that I want the carets to be considered "NIL" or "empty", but rather that they signify that a new string is to be started.
They are, I guess for lack of a better explanation, position identifiers.
So, for example:
L73F11341687Per^^^SCI^SP
actually translates to:
1. L73F11341687Per
2.
3.
4. SCI
5. SP
If I were to have
L73F11341687Per^12ABC^^SCI^SP
Then the positions are:
1. L73F11341687Per
2. 12ABC
3.
4. SCI
5. SP
And in turn, the table would be:
table
[1]=table
[1]='07-26-27'
[2]=table
[1]='L73F11341687Per'
[2]='12ABC'
[3]=''
[4]='SCI'
[5]='SP'
[3]=table
[1]='N7N558300000Acc'
[2]=''
Hopefully this sheds a little more light on what I'm trying to do.
Now that we've cleared up what the question is about, here's the issue.
Your gmatch pattern will return all of the matching substrings in the given string. However, your gmatch pattern uses "+". That means "one or more", which therefore cannot match an empty string. If it encounters a ^ character, it just skips it.
But, if you just tried :gmatch("[^\^]*"), which allows empty matches, the problem is that it would effectively turn every ^ character into an empty match. Which is not what you want.
What you want is to eat the ^ at the end of a substring. But, if you try :gmatch("([^\^])\^"), you'll find that it won't return the last string. That's because the last string doesn't end with ^, so it isn't a valid match.
The closest you can get with gmatch is this pattern: "([^\^]*)\^?". This has the downside of putting an empty string at the end. However, you can just remove that easily enough, since one will always be placed there.
local s0 = '07-26-27~L73F11341687Per^^^SCI^SP~N7N558300000Acc^'
local tt = {}
for s1 in (s0..'~'):gmatch'(.-)~' do
local t = {}
for s2 in (s1..'^'):gmatch'(.-)^' do
table.insert(t, s2)
end
table.insert(tt, t)
end
