Looping through array targeting upcase letters only - ruby-on-rails

I am trying to loop through a string that I have converted to an array and target only the uppercase letters, so that I can insert a space before each capitalized letter. My code finds the first capital letter and adds the space, but I am struggling to do the same for the next capital letter, which in this case is "T". Any advice would be appreciated. Thanks
def break_camel(str)
# ([A-Z])/.match(str)
saved_string = str.freeze
cap_index =str.index(/[A-Z]/)
puts(cap_index)
x =str.split('').insert(cap_index, " ")
x.join
end
break_camel("camelCasingTest")

It's much easier to operate on your string directly, using String#gsub, than breaking it into pieces, operating on each piece then gluing everything back together again.
def break_camel(str)
str.gsub(/(?=[A-Z])/, ' ')
end
break_camel("camelCasingTest")
#=> "camel Casing Test"
break_camel("CamelCasingTest")
#=> " Camel Casing Test"
This converts a "zero-width position", immediately before each capital letter (and after the preceding character, if there is one), to a space. The expression (?=[A-Z]) is called a positive lookahead.
If you don't want to insert a space if the capital letter is at the beginning of a line, change the method as follows.
def break_camel(str)
str.gsub(/(?<=.)(?=[A-Z])/, ' ')
end
break_camel("CamelCasingTest")
#=> "Camel Casing Test"
(?<=.) is a positive lookbehind that requires the capital letter to be preceded by any character for the match to be made.
Another way of writing this is as follows.
def break_camel(str)
str.gsub(/(?<=.)([A-Z])/, ' \1')
end
break_camel("CamelCasingTest")
#=> "Camel Casing Test"
Here the regular expression matches a capital letter that is not at the beginning of the line and saves it to capture group 1. It is then replaced by a space followed by the contents of capture group 1.
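The difference between "beginning of a line" and "beginning of the string" shows up with multi-line input. The sketch below compares the (?<=.) version with a variant that uses (?!\A) instead; both method names are made up for illustration.

```ruby
# Hypothetical names, for illustration only.

# No space at the start of any line: "." does not match "\n",
# so a capital letter right after a newline is left alone.
def break_camel_per_line(str)
  str.gsub(/(?<=.)(?=[A-Z])/, ' ')
end

# No space only at the very start of the string: (?!\A) rejects
# position 0 and nothing else.
def break_camel_whole_string(str)
  str.gsub(/(?!\A)(?=[A-Z])/, ' ')
end

break_camel_per_line("CamelCase\nNextLine")     #=> "Camel Case\nNext Line"
break_camel_whole_string("CamelCase\nNextLine") #=> "Camel Case\n Next Line"
```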

I think your approach is to keep reapplying your method for as long as it is needed. One extension of your code is to use recursion:
def break_camel(str)
regex = /[a-z][A-Z]/
if str.match(regex)
cap_index = str.index(regex)
str.insert(cap_index + 1, " ")
break_camel(str)
else
str
end
end
break_camel("camelCasingTest") #=> "camel Casing Test"
Notice the call to break_camel inside the method itself. Another way is to use the scan method with an appropriate regex and then rejoin the pieces.
In code:
'camelCasingTest'.scan(/[A-Z]?[a-z]+/).join(' ') #=> "camel Casing Test"
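One caveat with the scan approach, sketched below with made-up method names: scan keeps only the pieces that match the pattern, so characters the pattern does not cover (digits, for instance) are dropped, whereas gsub only inserts spaces.

```ruby
# Hypothetical names, for illustration only.

# scan returns only the matching pieces, so anything outside
# /[A-Z]?[a-z]+/ (a digit, for example) is silently discarded.
def break_camel_scan(str)
  str.scan(/[A-Z]?[a-z]+/).join(' ')
end

# gsub only inserts spaces, so every original character survives.
def break_camel_gsub(str)
  str.gsub(/(?<=.)(?=[A-Z])/, ' ')
end

break_camel_scan("camelCasingTest") #=> "camel Casing Test"
break_camel_scan("camelCase2Test")  #=> "camel Case Test"
break_camel_gsub("camelCase2Test")  #=> "camel Case2 Test"
```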

Do you have to implement your own?
Looks like titleize https://apidock.com/rails/ActiveSupport/Inflector/titleize has this covered.

Related

How to find last occurrence of a substring in a given string?

I have a string that contains a word, and I must change its ending to "sd" if the ending is "jk".
For example, from the word "lazerjk" I need to get "lazersd".
I tried to use the .gsub! method, but it doesn't work correctly if the word contains more than one occurrence of the substring "jk".
String#rindex returns the index of the last occurrence of the given substring.
String#[]= can take two integer arguments: the first is the index at which to start replacing, and the second is the length of the portion to replace.
You can use them this way:
replaced = "foo"
replacing = "booo"
string = "foo bar foo baz"
string[string.rindex(replaced), replaced.size] = replacing
string
# => "foo bar booo baz"
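Since rindex returns nil when the substring is absent, the snippet above raises a TypeError in that case. A minimal sketch of a wrapper that guards against this; the name replace_last is hypothetical, and it returns a new string instead of mutating its argument.

```ruby
# Hypothetical helper: replace the last occurrence of `old` with `new`.
def replace_last(string, old, new)
  result = string.dup
  index = result.rindex(old)
  # rindex returns nil when `old` does not occur; leave the copy untouched then.
  result[index, old.size] = new if index
  result
end

replace_last("foo bar foo baz", "foo", "booo") #=> "foo bar booo baz"
replace_last("lazerjk", "jk", "sd")            #=> "lazersd"
replace_last("lazer", "jk", "sd")              #=> "lazer"
```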
"jughjkjkjk\njk".sub(/jk$\z/, 'sd')
#=> "jughjkjkjk\nsd"
The $ is redundant here; /jk\z/ alone is probably sufficient.
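The two anchors are not interchangeable: $ matches at the end of any line, while \z matches only at the very end of the string. A small sketch with a made-up input:

```ruby
text = "lazerjk\nmore"

# $ matches at the end of every line, so the "jk" before the newline is replaced.
text.sub(/jk$/, "sd")  #=> "lazersd\nmore"

# \z matches only at the very end of the string, so nothing is replaced here.
text.sub(/jk\z/, "sd") #=> "lazerjk\nmore"
```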
It sounds like you're looking to replace a specific suffix only. If so, I would probably suggest using sub along with an anchored regex (to check for the desired characters only at the end of the string):
string_1 = "lazerjk"
string_2 = "lazerjk\njk"
string_3 = "lazerjkr"
string_1.sub(/jk\z/, "sd")
#=> "lazersd"
string_2.sub(/jk\z/, "sd")
#=> "lazerjk\nsd"
string_3.sub(/jk\z/, "sd")
#=> "lazerjkr"
Or, you could do without a regex at all by using the reverse! method along with a simple conditional statement to sub! only when the suffix is present:
string = "lazerjk"
old_suffix = "jk"
new_suffix = "sd"
string.reverse!.sub!(old_suffix.reverse, new_suffix.reverse).reverse! if string.end_with?(old_suffix)
string
#=> "lazersd"
OR, you could even use a completely different approach. Here's an example using chomp to remove the unwanted suffix and then ljust to pad the desired suffix to the modified string.
string = "lazerjk"
string.chomp("jk").ljust(string.length, "sd")
#=> "lazersd"
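Note that the new suffix only gets added if the length of the string was modified with the initial chomp. Otherwise, the string remains unchanged. This relies on the old and new suffixes having the same length, because ljust pads back to the original length.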
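Another regex-free sketch for the suffix-only case, which also works when the two suffixes differ in length. The helper name swap_suffix is made up; String#delete_suffix requires Ruby 2.5 or later.

```ruby
# Hypothetical helper: swap a suffix without using a regex.
# String#delete_suffix is available from Ruby 2.5 onwards.
def swap_suffix(string, old_suffix, new_suffix)
  # Leave the string as it is when the suffix is not present.
  return string unless string.end_with?(old_suffix)

  string.delete_suffix(old_suffix) + new_suffix
end

swap_suffix("lazerjk", "jk", "sd")   #=> "lazersd"
swap_suffix("lazerjkr", "jk", "sd")  #=> "lazerjkr"
swap_suffix("lazerjk", "jk", "long") #=> "lazerlong"
```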
If the goal is to substitute the LAST OCCURRENCE (as opposed to suffix only), then this could be accomplished by using sub along with reverse:
string = "jklazerjkm"
old_substring = "jk"
new_substring = "sd"
string.reverse.sub(old_substring.reverse, new_substring.reverse).reverse
#=> "jklazersdm"
Replacing "jk" at the end of a string with something else is straightforward and can be addressed without concern for other instances of "jk" that may be in the string, so I assume that is not what is being asked. Rather, I assume the problem is to replace the last instance of "jk" in a string with "sd".
Here are two solutions that make use of String#sub with a regular expression.
Use a negative lookahead
The idea here is to match "jk" provided it is not followed later in the string by another instance of "jk".
"lajkz\nejkrjklm".sub(/jk(?!.*jk)/m, "sd")
#=> "lajkz\nejkrsdlm"
Capture the part of the string that precedes the last "jk"
The match, if there is one, consists of the front of the string followed by the last "jk", which is replaced by the captured string followed by "sd".
"lajkz\nejkrjklm".sub(/\A(.*)jk/m) { $1 + "sd" }
#=> "lajkz\nejkrsdlm"
The two regular expressions can be written in free-spacing mode to make them self-documenting. The first is the following.
/
jk # match literal
(?! # begin a negative lookahead
.* # match zero or more characters, line terminators included because of m mode
jk # match literal
) # end negative lookahead
/mx # invoke multiline and free-spacing regex definition modes.
Multiline mode causes . to match any character, including a line terminator.
The second regular expression can be written as follows.
/
\A # match the beginning of the string
(.*) # match zero or more characters, line terminators included
# because of m mode, and save the match to capture group 1
jk # match literal
/mx # invoke multiline and free-spacing regex definition modes.
Note that in both expressions .* is greedy, meaning that it will match as many characters as possible, including "jk" so long as other requirements of the expression are met, here that the last instance of "jk" in the string is matched.
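The negative-lookahead idea can be generalized to any literal substring. The following is a sketch only; the helper name replace_last_occurrence is an assumption, and Regexp.escape is used so that characters such as "." in the substring are treated literally.

```ruby
# Hypothetical helper: replace the last occurrence of a literal substring.
def replace_last_occurrence(str, old, new)
  escaped = Regexp.escape(old)
  # Match `old` only when no further `old` follows anywhere in the string.
  # The block form inserts `new` literally, with no \1-style expansion.
  str.sub(/#{escaped}(?!.*#{escaped})/m) { new }
end

replace_last_occurrence("lajkz\nejkrjklm", "jk", "sd") #=> "lajkz\nejkrsdlm"
replace_last_occurrence("a.b.c", ".", "-")             #=> "a.b-c"
replace_last_occurrence("nothing", "jk", "sd")         #=> "nothing"
```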
Here is a different solution:
str = "jughjkjkjk\njk"
pattern = "jk"
replace_with = "sd"
str = str.reverse.sub(pattern.reverse, replace_with.reverse).reverse

Match a word or whitespaces in Lua

(Sorry for my broken English)
What I'm trying to do is match a word (with or without numbers and special characters) or whitespace characters (whitespaces, tabs, optional new lines) in a string in Lua.
For example:
local my_string = "foo bar"
my_string:match(regex) --> should return 'foo', ' ', 'bar'
my_string = "   123!#." -- note: three whitespaces before '123!#.'
my_string:match(regex) --> should return ' ', ' ', ' ', '123!#.'
Where regex is the Lua regular expression pattern I'm asking for.
Of course I've done some research on Google, but I couldn't find anything useful. What I've got so far is [%s%S]+ and [%s+%S+] but it doesn't seem to work.
Any solution using the standard library, e.g. string.find, string.gmatch etc. is OK.
match returns either the captures or the whole match, and your patterns do not define any captures. [%s%S]+ matches "(space or not space) one or more times", which is basically everything. [%s+%S+] is plain wrong: the character class [ ] is a set of single-character members. It does not treat sequences of characters in any other way ("[cat]" matches "c" or "a" or "t"), nor does it care about +. So [%s+%S+] probably means "a single character that is (a space or plus or not space or plus)".
The first example 'foo', ' ', 'bar' could be solved by:
regex="(%S+)(%s)(%S+)"
If you want a variable number of captures you are going to need the gmatch iterator:
local capt={}
for q,w,e in my_string:gmatch("(%s*)(%S+)(%s*)") do
if q and #q>0 then
table.insert(capt,q)
end
table.insert(capt,w)
if e and #e>0 then
table.insert(capt,e)
end
end
This returns each run of whitespace as a single capture, so it does not discern between a single space and several; you'll need to add that check to the match result processing.
Lua standard patterns are simplistic; if you need more intricate matching, you might want to have a look at the Lua LPeg library.

Ruby regex to find words starting with #

I'm trying to write a very simple regex to find all words in a string that start with the symbol #, and then change each word to a link, like you would see on Twitter, where you can mention other usernames.
So far I have written this
def username_link(s)
s.gsub(/\#\w+/, "<a href='/username'>username</a>").html_safe
end
I know it's very basic and not much, but I'd rather write it on my own right now, to fully understand it, before searching GitHub to find a more complex one.
What I'm trying to find out is how I can reference the matched word and include it in place of username. Once I can do that, I can easily strip the first character, #, out of it.
Thanks.
You can capture using parentheses and backreference with \1 (and \2, and so on):
def username_link(s)
s.gsub(/#(\w+)/, "<a href='/\\1'>\\1</a>").html_safe
end
You should use gsub with back references:
str = "I know it's very basic and not much, but #tim I'd rather write it on my own."
def username_to_link(str)
str.gsub(/\#(\w+)/, '#\1')
end
puts username_to_link(str)
#=> I know it's very basic and not much, but #tim I'd rather write it on my own.
The following regex should handle corner cases that the other answers ignore:
def auto_username_link(s)
s.gsub(/(^|\s)\#(\w+)($|\s)/, "\\1<a href='/\\2'>\\2</a>\\3").html_safe
end
It should ignore strings like "someone#company" or "#username-1" while converting everything like "Hello #username rest of message"
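Because the pattern above consumes the whitespace on both sides of a match, two mentions separated by a single space cannot both match. A possible variation, sketched here with a hypothetical helper name, uses lookarounds so the surrounding whitespace is checked but not consumed. It returns a plain String; calling html_safe on the result in Rails is left out.

```ruby
# Hypothetical helper, plain Ruby (no Rails needed).
def link_usernames(text)
  # (?<!\S) : preceded by whitespace or the start of the string
  # (?!\S)  : followed by whitespace or the end of the string
  text.gsub(/(?<!\S)\#(\w+)(?!\S)/) do
    name = Regexp.last_match(1)
    "<a href='/#{name}'>#{name}</a>"
  end
end

link_usernames("#a #b")           #=> "<a href='/a'>a</a> <a href='/b'>b</a>"
link_usernames("someone#company") #=> "someone#company"
link_usernames("#username-1")     #=> "#username-1"
```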
How about this:
def convert_names_to_links(str)
str = " " + str
result = str.gsub(
/
(?<=\W) #Look for a non-word character(space/punctuation/etc.) preceding
\# #an "#" character (escaped, since a bare # starts a comment in x mode), followed by
(\w+) #a word character, one or more times
/xm, #Standard normalizing flags
'#\1'
)
result[1..-1]
end
my_str = "#tim #tim #tim, ##tim,#tim t#mmy?"
puts convert_names_to_links(my_str)
--output:--
#tim #tim #tim, ##tim,#tim t#mmy?

select multiple sections of a string in rails

I'm trying to slice up a string and having trouble.
In rails, I have a very long string, and in it, something like this occurs 3-6 times:
bunchofotherstringstuffandcharacters"hisquote":"The most important aspect of the painting was the treatment of lighting.","lp":andthenalotmorestringandcharacters
I want to slice out "The most important aspect of the painting was the treatment of lighting.", and also other instances that fall between hisquote and lp.
The "hisquote" that comes before it is unique to the strings I want, and so is the .","lp that comes after it
How can I get back all the instances of the strings between these two identifiers?
So something like this? I'm assuming that your delimiters : and , are consistent throughout the string, as well as the use of double quotes " to enclose the desired strings.
# escape double quotes
longstring = %q(bunchofotherstringstuffandcharacters"hisquote":"The most important aspect of the painting was the treatment of lighting.","lp":andthenalotmorestringandcharacters)
# split on double quotes
substrings = longstring.split("\"").to_enum
# somewhere to store the strings you want
save = []
# use a rescue clause to detect that the enumerator 'substrings' has reached its end
begin
while true do
remember = substrings.next
case substrings.peek # let's see if the next element is our delimiter
when ":" # Once the colon is spotted ahead, grab the three strings we want.
save << remember
substrings.next # skip the ":"
save << substrings.next
substrings.next # skip the ","
save << substrings.next
end
end
rescue StopIteration => e
puts "End of Substring Enumeration was reached."
ensure
puts save.inspect #=> ["hisquote", "The most important aspect of the painting was the treatment of lighting.", "lp"]
end
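A shorter alternative sketch using String#scan with a capture group. The helper name and the sample string are made up; the pattern assumes the quoted text is always wrapped exactly as "hisquote":"...","lp" and does not span multiple lines.

```ruby
# Hypothetical helper: collect every string found between
# "hisquote":" and ","lp in the input.
def extract_quotes(longstring)
  # .*? is lazy, so each match stops at the first closing delimiter.
  # scan returns one array per match (one element per group); flatten merges them.
  longstring.scan(/"hisquote":"(.*?)","lp"/).flatten
end

# Made-up sample with two occurrences; the surrounding text is filler.
sample = %q(junk"hisquote":"First quote.","lp":more"hisquote":"Second quote.","lp":tail)
extract_quotes(sample) #=> ["First quote.", "Second quote."]
```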

Best way to count words in a string in Ruby?

Is there anything better than string.scan(/(\w|-)+/).size (the - is so, e.g., "one-way street" counts as 2 words instead of 3)?
string.split.size
Edited to explain multiple spaces
From the Ruby String Documentation page
split(pattern=$;, [limit]) → anArray
Divides str into substrings based on a delimiter, returning an array
of these substrings.
If pattern is a String, then its contents are used as the delimiter
when splitting str. If pattern is a single space, str is split on
whitespace, with leading whitespace and runs of contiguous whitespace
characters ignored.
If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters. If pattern contains groups, the respective
matches will be returned in the array as well.
If pattern is omitted, the value of $; is used. If $; is nil (which is
the default), str is split on whitespace as if ' ' were specified.
If the limit parameter is omitted, trailing null fields are
suppressed. If limit is a positive number, at most that number of
fields will be returned (if limit is 1, the entire string is returned
as the only entry in an array). If negative, there is no limit to the
number of fields returned, and trailing null fields are not
suppressed.
" now's the time".split #=> ["now's", "the", "time"]
While that is the current version of ruby as of this edit, I learned on 1.7 (IIRC), where that also worked. I just tested it on 1.8.3.
I know this is an old question, but this might be useful to someone else looking for something more sophisticated than string.split. I wrote the words_counted gem to solve this particular problem, since defining words is pretty tricky.
The gem lets you define your own custom criteria, or use the out of the box regexp, which is pretty handy for most use cases. You can pre-filter words with a variety of options, including a string, lambda, array, or another regexp.
counter = WordsCounted::Counter.new("Hello, Renée! 123")
counter.word_count #=> 2
counter.words #=> ["Hello", "Renée"]
# filter the word "hello"
counter = WordsCounted::Counter.new("Hello, Renée!", reject: "Hello")
counter.word_count #=> 1
counter.words #=> ["Renée"]
# Count numbers only
counter = WordsCounted::Counter.new("Hello, Renée! 123", regexp: /[0-9]/)
counter.word_count #=> 1
counter.words #=> ["123"]
The gem provides a bunch more useful methods.
If the 'word' in this case can be described as an alphanumeric sequence which can include '-' then the following solution may be appropriate (assuming that everything that doesn't match the 'word' pattern is a separator):
>> 'one-way street'.split(/[^-a-zA-Z]/).size
=> 2
>> 'one-way street'.split(/[^-a-zA-Z]/).each { |m| puts m }
one-way
street
=> ["one-way", "street"]
However, there are some other symbols that can be included in the regex - for example, ' to support the words like "it's".
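As a sketch of that idea, the character class can be extended with an apostrophe and used with scan, which collects the words directly instead of splitting on separators. The helper name count_words is hypothetical, and the definition of a word here (ASCII letters, hyphens, apostrophes) is an assumption.

```ruby
# Hypothetical helper: a "word" is a run of ASCII letters,
# hyphens and apostrophes.
def count_words(text)
  text.scan(/[-'a-zA-Z]+/).size
end

count_words("one-way street")         #=> 2
count_words("it's a  one-way street") #=> 4
```

Note that a stand-alone hyphen or apostrophe would also count as a word under this definition.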
This is pretty simplistic but does the job if you are typing words with spaces in between. It ends up counting numbers as well but I'm sure you could edit the code to not count numbers.
puts "enter a sentence to find its word length: "
word = gets
word = word.chomp
splits = word.split(" ")
target = splits.length.to_s
puts "your sentence is " + target + " words long"
The best way to do this is to use the split method.
split divides a string into substrings based on a delimiter, returning an array of the substrings.
split takes two parameters, namely pattern and limit.
pattern is the delimiter on which the string is split into an array.
limit specifies the maximum number of elements in the resulting array.
For more details, refer to the Ruby String documentation.
str = "This is a string"
str.split(' ').size
#output: 4
The above code splits the string wherever it finds whitespace, so the size of the resulting array is the number of words in the string.
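A short illustration of how the limit parameter changes the result, using made-up inputs:

```ruby
# A positive limit caps the number of fields; the remainder stays in the last one.
"a b c d".split(" ", 2) #=> ["a", "b c d"]

# Without a limit, trailing empty fields are dropped.
"a,b,,".split(",")      #=> ["a", "b"]

# A negative limit keeps trailing empty fields.
"a,b,,".split(",", -1)  #=> ["a", "b", "", ""]
```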
The split(/[^-a-zA-Z]/) solution above is wrong when two separators are adjacent. Consider the following (note the two spaces):
"one-way  street"
You will get
["one-way", "", "street"]
Use
'one-way  street'.gsub(/[^-a-zA-Z]/, ' ').split.size
This splits words only on ASCII whitespace chars:
p " some word\nother\tword|word".strip.split(/\s+/).size #=> 4

Resources