Suppose I have string variables like the following:
s1="10$"
s2="10$ I am a student"
s3="10$Good"
s4="10$ Nice weekend!"
As you can see, s2 and s4 have whitespace after 10$.
In general, I would like a way to check whether a string starts with 10$ and has whitespace after the 10$. For example, the rule should find s2 and s4 in the case above. How can I define such a rule?
What I mean is that something like s2.RULE? should return true or false to tell whether the string matches.
---------- update -------------------
Please also give the solution for the case where 10# is used instead of 10$.
You can do this using Regular Expressions (Ruby has Perl-style regular expressions, to be exact).
# For ease of demonstration, I've moved your strings into an array
strings = [
"10$",
"10$ I am a student",
"10$Good",
"10$ Nice weekend!"
]
p strings.find_all { |s| s =~ /\A10\$[ \t]+/ }
The regular expression breaks down like this:
The / at the beginning and the end tell Ruby that everything in between is part of the regular expression
\A matches the beginning of a string
The 10 is matched verbatim
\$ means to match a $ verbatim. We need to escape it since $ has a special meaning in regular expressions.
[ \t]+ means "match at least one blank and/or tab"
So this regular expression says "Match every string that starts with 10$ followed by at least one blank or tab character". Using the =~ operator you can test strings in Ruby against this expression. =~ returns nil when there is no match and the index of the match otherwise; that non-nil value evaluates to true when used in a conditional such as if.
Edit: Updated white space matching as per Asmageddon's suggestion.
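The s2.RULE? idea from the question can be approximated by wrapping the pattern in a small predicate. This is only a sketch: the method name ten_dollar_then_space? is made up for illustration, and String#match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper (the name is not from the original answer).
# Returns true when str starts with "10$" followed by at least one
# space or tab, and false otherwise.
def ten_dollar_then_space?(str)
  str.match?(/\A10\$[ \t]+/)
end

strings = ["10$", "10$ I am a student", "10$Good", "10$ Nice weekend!"]
matching = strings.select { |s| ten_dollar_then_space?(s) }
p matching # => ["10$ I am a student", "10$ Nice weekend!"]
```

Unlike =~, match? returns a plain true or false and does not set $~ or the other match globals.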
this works:
"10$ " =~ /^10\$ +/
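It returns nil when there is no match and 0 (the index where the match starts) when there is one. Since 0 is truthy in Ruby (only nil and false are falsy), you can use the result directly in a conditional.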
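A short sketch of what the two return values look like in practice. The variable names are illustrative only.

```ruby
hit  = "10$ " =~ /^10\$ +/     # index where the match starts
miss = "10$Good" =~ /^10\$ +/  # nil, because no space follows "10$"

p hit  # => 0
p miss # => nil

# 0 is truthy in Ruby, so the result can drive a conditional directly.
verdict = hit ? "matched" : "no match"
puts verdict # prints "matched"
```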
Use a regular expression like this one:
/\A10\$\s+/
EDIT
If you use =~ for matching, note that
The =~ operator returns the character position in the string of the
start of the match
So it might return 0 to denote a match. Only a return of nil means no match.
See for example http://www.regular-expressions.info/ruby.html for a regular expression tutorial for Ruby.
If you want to proceed to cases with $ and # then try this regular expression:
/^10[\$#] +/
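As a rough sketch, the character-class version can be tried against a few sample strings. The sample list is made up for illustration, and the pattern below swaps in \A and [ \t]+ (from the first answer) for ^ and a plain space.

```ruby
# Matches "10$" or "10#" at the start of the string,
# followed by at least one space or tab.
pattern = /\A10[\$#][ \t]+/

samples = ["10$ I am a student", "10# I am a student", "10#Good", "10$", "x10# y"]
selected = samples.grep(pattern)
p selected # => ["10$ I am a student", "10# I am a student"]
```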
Related
I want to match the options between two arrays with exact string.
options = ["arish1", "arish2", "ARISH3", "arish 2", "arish"]
choices = ["Arish"]
final_choice = options.grep(Regexp.new(choices.join('|'), Regexp::IGNORECASE))
p final_choice
Output:
["arish1", "arish2", "ARISH3", "arish 2", "arish"]
but it should match only "arish".
You need to use
final_choice = options.grep(/\A(?:#{Regexp.union(choices).source})\z/i)
See the Ruby online demo.
Note:
A regex literal notation is much tidier than the constructor notation
You can still use a variable inside the regex literal
The Regexp.union method joins the alternatives in choices using | "or" regex operator and escapes the items as necessary automatically
The \A anchor matches the start of the string and \z matches the end of the string.
The non-capturing group, (?:...), is used to make sure the anchors are applied to each alternative in choices separately.
.source is used to obtain just the pattern part from the regex.
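To illustrate the escaping done by Regexp.union, here is a small sketch with an extra choice that contains a regex metacharacter. The "a.b" and "axb" entries are invented for the example and are not part of the original question.

```ruby
options = ["arish1", "arish", "ARISH3", "a.b", "axb"]
choices = ["Arish", "a.b"]

# Regexp.union escapes the dot, so "a.b" only matches a literal "a.b".
exact = /\A(?:#{Regexp.union(choices).source})\z/i

final_choice = options.grep(exact)
p final_choice # => ["arish", "a.b"]
```

Without the escaping, the dot in "a.b" would also match "axb".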
I am building a Rails 5.2 app.
In this app I got outputs from different suppliers (I am building a webshop).
The name of the shipping provider is in this format:
dhl_freight__233433
It could also be in this format:
postal__US-320202
How can I remove everything before (and including) the __ so that only what comes after the __ remains, for example 233433?
Perhaps some sort of RegEx.
A very simple approach would be to use String#split and then pick the second part that is the last part in this example:
"dhl_freight__233433".split('__').last
#=> "233433"
"postal__US-320202".split('__').last
#=> "US-320202"
You can use a very simple Regexp and ask the resulting MatchData for the post_match part:
p "dhl_freight__233433".match(/__/).post_match
# another (magic) way to access the post_match part:
p $'
Postscript: Learnt something from this question myself: you don't even have to use a RegExp for this to work. Just "asddfg__qwer".match("__").post_match does the trick (it does the conversion to regexp for you)
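One thing to watch for with this approach: match returns nil when the string has no "__", so calling post_match on the result would raise NoMethodError. A possible guard, sketched with a hypothetical helper name and the safe navigation operator (Ruby 2.3 or later):

```ruby
# Hypothetical helper: returns the text after the first "__",
# or nil when the separator is absent.
def after_double_underscore(name)
  name.match(/__/)&.post_match
end

p after_double_underscore("dhl_freight__233433") # => "233433"
p after_double_underscore("postal__US-320202")   # => "US-320202"
p after_double_underscore("no_separator_here")   # => nil
```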
r = /[^_]+\z/
"dhl_freight__233433"[r] #=> "233433"
"postal__US-320202"[r] #=> "US-320202"
The regular expression matches one or more characters other than an underscore, followed by the end of the string (\z). The ^ at the beginning of the character class reads, "other than any of the characters that follow".
See String#[].
This assumes that the last underscore is preceded by an underscore. If the last underscore is not preceded by an underscore, in which case there should be no match, add a positive lookbehind:
r = /(?<=__)[^_]+\z/
This requires the match to be preceded by two underscores.
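A quick comparison of the two patterns, as a sketch. The third input, with only a single underscore, is made up to show where they differ.

```ruby
plain      = /[^_]+\z/         # trailing run of non-underscore characters
lookbehind = /(?<=__)[^_]+\z/  # same, but must be preceded by "__"

p "dhl_freight__233433"[lookbehind] # => "233433"
p "postal__US-320202"[lookbehind]   # => "US-320202"

# Only one underscore before the tail: the plain pattern still matches,
# the lookbehind version does not.
p "postal_US-320202"[plain]         # => "US-320202"
p "postal_US-320202"[lookbehind]    # => nil
```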
There are many ruby ways to extract numbers from string. I hope you're trying to fetch numbers out of a string. Here are some of the ways to do so.
Ref- http://www.ruby-forum.com/topic/125709
line.delete("^0-9")
line.scan(/\d/).join('')
line.tr("^0-9", '')
Of the above, delete is the fastest way to pull the digits out of a string.
All of the above extract the digits from the string and join them. If the string is "String-with-67829___numbers-09764", the output would be "6782909764".
If you want the numbers split, like ["67829", "09764"]:
line.split(/[^\d]/).reject { |c| c.empty? }
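Putting the variants side by side on the sample string from above, as a sketch. The scan(/\d+/) line at the end is an addition not found in the original list; it is another way to get the grouped form.

```ruby
line = "String-with-67829___numbers-09764"

by_delete = line.delete("^0-9")      # drop everything that is not 0-9
by_scan   = line.scan(/\d/).join('') # collect single digits, then join
by_tr     = line.tr("^0-9", '')      # translate non-digits to nothing

p by_delete # => "6782909764"
p by_scan   # => "6782909764"
p by_tr     # => "6782909764"

grouped = line.split(/[^\d]/).reject { |c| c.empty? }
p grouped   # => ["67829", "09764"]

# Alternative (not from the original answer): scan for runs of digits.
p line.scan(/\d+/) # => ["67829", "09764"]
```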
Hope these answers help you! Happy coding :-)
I have a name spaced class..
"CommonCar::RedTrunk"
I need to convert it to an underscored string "common_car_red_trunk", but when I use
"CommonCar::RedTrunk".underscore, I get "common_car/red_trunk" instead.
Is there another method to accomplish what I need?
Solutions:
"CommonCar::RedTrunk".gsub(':', '').underscore
or:
"CommonCar::RedTrunk".sub('::', '').underscore
or:
"CommonCar::RedTrunk".tr(':', '').underscore
Alternate:
Or turn any of these around and do the underscore() first, followed by whatever method you want to use to replace "/" with "_".
Explanation:
While all of these methods look basically the same, there are subtle differences that can be very impactful.
In short:
gsub() – does pattern matching (the pattern can be a regex or, as here, a plain string), so it finds every occurrence of ":" and replaces it with "".
sub() – does pattern matching like gsub(), except that it only replaces the first occurrence (the "g" in gsub() means "global"). This is why "::" had to be used with that method; otherwise a single ":" would have been left behind. Keep in mind that this method only works with a single level of namespace nesting: "CommonCar::RedTrunk::BigWheels" would be transformed to "CommonCarRedTrunk::BigWheels".
tr() – treats its string parameters as sets of single-character replacements. In this case, because we're only replacing a single character, it works identically to gsub(). However, if you wanted to replace "on" with "EX", for example, gsub("on", "EX") would produce "CommEXCar::RedTrunk" while tr("on", "EX") would produce "CEmmEXCar::RedTruXk".
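The differences described above can be checked in plain Ruby. This sketch leaves out the Rails-only underscore call so that it runs without ActiveSupport.

```ruby
name = "CommonCar::RedTrunk"

# All three remove the namespace separator for a single level of nesting.
p name.gsub(':', '') # => "CommonCarRedTrunk"
p name.sub('::', '') # => "CommonCarRedTrunk"
p name.tr(':', '')   # => "CommonCarRedTrunk"

# sub only touches the first occurrence.
nested = "CommonCar::RedTrunk::BigWheels"
p nested.sub('::', '') # => "CommonCarRedTrunk::BigWheels"

# gsub replaces the substring "on"; tr maps o -> E and n -> X one by one.
p name.gsub("on", "EX") # => "CommEXCar::RedTrunk"
p name.tr("on", "EX")   # => "CEmmEXCar::RedTruXk"
```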
Docs:
https://apidock.com/ruby/String/gsub
https://apidock.com/ruby/String/sub
https://apidock.com/ruby/String/tr
This is a pure-Ruby solution.
r = /(?<=[a-z])(?=[A-Z])|::/
"CommonCar::RedTrunk".gsub(r, '_').downcase
#=> "common_car_red_trunk"
See (the first form of) String#gsub and String#downcase.
The regular expression can be made self-documenting by writing it in free-spacing mode:
r = /
(?<=[a-z]) # assert that the previous character is lower-case
(?=[A-Z]) # assert that the following character is upper-case
| # or
:: # match '::'
/x # free-spacing regex definition mode
(?<=[a-z]) is a positive lookbehind; (?=[A-Z]) is a positive lookahead.
Note that /(?<=[a-z])(?=[A-Z])/ matches an empty ("zero-width") string. r matches, for example, the empty string between 'Common' and 'Car', because it is preceded by a lower-case letter and followed by an upper-case letter.
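A brief sketch of how this pattern behaves on a deeper namespace, plus one input where it differs from what one might expect. Both extra inputs are made up for illustration.

```ruby
r = /(?<=[a-z])(?=[A-Z])|::/

# Works for any depth of nesting, since gsub replaces every match.
deep = "CommonCar::RedTrunk::BigWheels".gsub(r, '_').downcase
p deep # => "common_car_red_trunk_big_wheels"

# Limitation: a run of capitals has no lower-case letter before the
# next capital, so no underscore is inserted.
acronym = "HTMLParser".gsub(r, '_').downcase
p acronym # => "htmlparser"
```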
I don't know Rails but I'm guessing you could write
"CommonCar::RedTrunk".delete(':').underscore
I need a regex that will only find matches where the entire string matches my query.
For instance if I do a search for movies with the name "Red October" I only want to match on that exact title (case insensitive) but not match titles like "The Hunt For Red October". Not quite sure I know how to do this. Anyone know?
Thanks!
Try the following regular expression:
^Red October$
By default, regular expressions are case sensitive, so enable your engine's case-insensitive option to get the case-insensitive match you asked for. The ^ marks the start of the matching text and $ the end.
Generally, and with default settings, ^ and $ anchors are a good way of ensuring that a regex matches an entire string.
A few caveats, though:
If you have alternation in your regex, be sure to enclose your regex in a non-capturing group before surrounding it with ^ and $:
^foo|bar$
is of course different from
^(?:foo|bar)$
Also, ^ and $ can take on a different meaning (start/end of line instead of start/end of string) if certain options are set. In text editors that support regular expressions, this is usually the default behaviour. In some languages, especially Ruby, this behaviour cannot even be switched off.
Therefore there is another set of anchors that are guaranteed to only match at the start/end of the entire string:
\A matches at the start of the string.
\Z matches at the end of the string or before a final line break.
\z matches at the very end of the string.
But not all languages support these anchors, most notably JavaScript.
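Since Ruby is one of the languages where ^ and $ always work per line, the caveats above can be seen in a short Ruby sketch. The multi-line title is a made-up input.

```ruby
multi = "The Hunt For\nRed October"

# ^ and $ match at line boundaries in Ruby, so the second line matches.
line_anchored = multi.match?(/^Red October$/i)
p line_anchored   # => true

# \A and \z only match at the very start and end of the string.
string_anchored = multi.match?(/\ARed October\z/i)
p string_anchored # => false

exact = "red october".match?(/\ARed October\z/i)
p exact           # => true

# \Z tolerates one trailing line break, \z does not.
p "Red October\n".match?(/\ARed October\Z/i) # => true
p "Red October\n".match?(/\ARed October\z/i) # => false

# Alternation caveat: without the group, ^ applies only to "foo".
loose_alt  = "foobaz".match?(/^foo|bar$/)
strict_alt = "foobaz".match?(/^(?:foo|bar)$/)
p loose_alt  # => true
p strict_alt # => false
```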
I know that this may be a little late to answer this, but maybe it will come handy for someone else.
Simplest way:
var someString = "...";
var someRegex = "...";
var match = Regex.Match(someString , someRegex );
if(match.Success && match.Value.Length == someString.Length){
//pass
} else {
//fail
}
Use the ^ and $ anchors to denote where the regex pattern sits relative to the start and end of the string:
Regex.Match("Red October", "^Red October$"); // pass
Regex.Match("The Hunt for Red October", "^Red October$"); // fail
You need to enclose your regex in ^ (start of string) and $ (end of string):
^Red October$
If the string may contain regex metasymbols (. { } ( ) $ etc), I propose to use
^\QYourString\E$
\Q starts quoting all the characters until \E.
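Otherwise the regex can be inappropriate or even invalid. Note that \Q...\E is not available in every regex flavor; .NET, for instance, provides Regex.Escape instead.
If the language takes the regex as a string parameter (as in the example), the backslashes must be doubled: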
^\\QYourString\\E$
Hope this tip helps somebody.
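The same idea, quoting a literal title so its metacharacters lose their special meaning, can be sketched in Ruby with Regexp.escape. The title below is a made-up example.

```ruby
title = "Who? (Part 1.5)"

# Interpolated as-is, "?", "(", ")" and "." are treated as regex syntax.
naive = /\A#{title}\z/i
# Regexp.escape backslash-escapes them, so the title is matched literally.
safe  = /\A#{Regexp.escape(title)}\z/i

p title.match?(naive)             # => false (does not even match itself)
p "Who Part 1x5".match?(naive)    # => true  (an unintended match)
p "who? (part 1.5)".match?(safe)  # => true
p "Who Part 1x5".match?(safe)     # => false
```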
Sorry, but that's a little unclear.
From what I read, you want to do a simple string comparison. You don't need a regex for that.
string myTest = "Red October";
bool isMatch = (myTest.ToLower() == "Red October".ToLower());
Console.WriteLine(isMatch);
isMatch = (myTest.ToLower() == "The Hunt for Red October".ToLower());
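For comparison, a rough Ruby equivalent of the same regex-free check. String#casecmp? assumes Ruby 2.4 or later; the downcase comparison mirrors the ToLower() approach above.

```ruby
title = "Red October"

same_title  = title.casecmp?("red october")
other_title = title.casecmp?("The Hunt for Red October")
p same_title  # => true
p other_title # => false

# Equivalent to the ToLower() comparison shown above.
lowered_equal = title.downcase == "RED OCTOBER".downcase
p lowered_equal # => true
```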
You can do it like this. For example, if I only want to match a lowercase letter (e in the pattern below) exactly once in a string, which can be checked with myRegex.IsMatch():
^[^e][e]{1}[^e]$
Short version:
I am having a rather hard time understanding two rather complex regular expressions in the ActiveSupport::Inflector::camelize method.
This is the definition of the camelize method:
def camelize(term, uppercase_first_letter = true)
string = term.to_s
if uppercase_first_letter
string = string.sub(/^[a-z\d]*/) { inflections.acronyms[$&] || $&.capitalize }
else
string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&.downcase }
end
string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
end
I have some difficulty understanding:
string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&.downcase }
and:
string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
Please explain to me what they mean. Thank you.
Long version
This shows me trying to understand the regex and how I interpret them to mean. It would be very helpful if you could go through this and correct my mistakes.
For the first regex
string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&.downcase }
Based on what I am seeing, inflections.acronym_regex is from the Inflections class in the ActiveSupport::Inflector module, and in the initialize method of the Inflections class,
def initialize
@plurals, @singulars, @uncountables, @humans, @acronyms, @acronym_regex = [], [], [], [], {}, /(?=a)b/
end
acronym_regex is assigned /(?=a)b/. From what I understand from http://www.ruby-doc.org/core-2.0.0/Regexp.html#class-Regexp-label-Anchors ,
(?=pat) - Positive lookahead assertion: ensures that the following characters match pat, but doesn't include those characters in the matched text
So /(?=a)b/ ensures that character a is inside the text, but we don't include character a inside the matched text, and what immediately follows character a must be character b. In other words, "abc" would match this regex, but "bbc" would not match this regex, and the matched text for "abc" would be "b" (instead of "ab").
So combining the value of inflections.acronym_regex into this regex /^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/, I do not know which of the following two regex results:
A. /^(?:/(?=a)b/(?=\b|[A-Z_])|\w)/
B. /^(?:(?=a)b(?=\b|[A-Z_])|\w)/
although I am thinking it is B. From what I understand, (?: provides grouping without capturing, (?= means positive lookahead assertion, \b matches word boundaries when outside brackets and matches backspace when inside brackets. So in english terms, regex B, when matching against a text, will find a string that begins with an a character, followed by a b character, and one of (1. backspace [whatever that may mean] 2. any uppercase character or underscore 3. any english alphabetic character, digit, or underscore).
However, I find it strange that passing uppercase_first_letter = false to the camelize function should cause it to match a string starting with the characters ab, given that that does not seem to be how the camelize function behaves.
For the second regex
string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
The regex is:
/(?:_|(\/))([a-z\d]*)/i
I am guessing that this regex will match a substring that starts with either an _ or /, followed by 0 or more (upper or lowercase English alphabetic characters or digits). Furthermore, for the first group (?:_|(\/)), whether we match the _ or /, the ([a-z\d]*) capturing group will always be regarded as the second group. I do understand the part where the block tries to look up inflections.acronyms[$2] and on failure, does $2.capitalize.
Since (?: means grouping without capturing, what is the value of $1 when we match _ ? Is it still _ ? And for the .gsub('/', '::') portion, I am guessing that it gets applied for each match in the initial gsub, instead of being applied to the overall string after the outer gsub call is done?
Apologies for the really long post. Please point out my errors in understanding the 2 regular expressions, or explain them in a better way if you can do it.
Thank you.
However, I find it strange that passing uppercase_first_letter =
false to the camelize function should cause it to match a string
starting with the characters ab, given that that does not seem to be
how the camelize function behaves.
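The /(?=a)b/ part can never match: the lookahead requires the next character to be a, and b then has to match that same character. It is effectively a placeholder for when no acronyms are defined, so the \w alternative is what matches (the first word character). (?: ... ) only groups without capturing, therefore the match is in $&.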
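The behaviour of the first regex can be checked outside Rails. This is a sketch that hard-codes the initial acronym_regex value quoted in the question instead of calling into ActiveSupport; the sample strings are illustrative.

```ruby
# Initial value quoted in the question; assumed here, not read from Rails.
acronym_regex = /(?=a)b/

# The lookahead needs "a" at the same position where "b" must match,
# so none of these strings can match.
candidates = ["abc", "bbc", "ab", "b"]
any_match = candidates.any? { |s| s.match?(acronym_regex) }
p any_match # => false

# Interpolating a Regexp embeds its pattern (option B in the question),
# wrapped in an option group, not the slashes.
first_letter = /^(?:#{acronym_regex}(?=\b|[A-Z_])|\w)/
p first_letter.source

# Only the \w alternative can match, so the first character is downcased.
lowered = "ActiveModel".sub(first_letter) { $&.downcase }
p lowered # => "activeModel"
```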
Since (?: means grouping without capturing, what is the value of $1
when we match _ ? Is it still _ ?
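It's nil: the only capturing group in that part, (\/), did not take part in the match, and (?:...) itself captures nothing. The text after the underscore is in $2.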
And for the .gsub('/', '::') portion, I am guessing that it gets
applied for each match in the initial gsub, instead of being applied
to the overall string after the outer gsub call is done?
It's applied to the overall result as gsub with block returns a string and the gsub('/', '::') is outside of a block.
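Both points can be observed with a standalone sketch of the second regex. The empty acronyms hash stands in for inflections.acronyms and the input string is made up, so this is an approximation of what camelize does, not the Rails code itself.

```ruby
acronyms = {}      # stand-in for inflections.acronyms (assumed empty)
first_groups = []  # records $1 for each match

intermediate = "active_model/errors".gsub(/(?:_|(\/))([a-z\d]*)/i) do
  first_groups << $1
  "#{$1}#{acronyms[$2] || $2.capitalize}"
end

# $1 is nil for the "_" match and "/" for the "/" match.
p first_groups # => [nil, "/"]
p intermediate # => "activeModel/Errors"

# The second gsub runs once, on the whole string returned by the first.
final = intermediate.gsub('/', '::')
p final        # => "activeModel::Errors"
```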