If I have some test data john_doe01.jpg, how can I use regex or something else to split this into separate strings like:
john_doe or john and doe
01
jpg
You create a RegExp using the constructor, like: var re = RegExp(r"\d+|[a-zA-Z]+");.
You can get the substrings matching the RegExp as: var matches = [for (var m in re.allMatches(myText)) m[0]!];
In your case, var myText = "john_doe01.jpg";, that would give you the result ["john", "doe", "01", "jpg"].
If that's not what you want, you'll have to fiddle with the RegExp until it gives you what you do want. It usually helps writing down in prose, and in excruciating detail, what it is you want, and then writing the RegExp afterwards.
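For comparison, here is a minimal sketch of the same idea in Python (the thread itself uses Dart). The first function uses the token pattern from the answer. The second is an assumed variant for the "john_doe" form of the question; its pattern and the function names are illustrative, not from the original answer.

```python
import re

# Same token pattern as in the answer: runs of digits or runs of ASCII letters.
TOKEN_RE = re.compile(r"\d+|[a-zA-Z]+")

def split_tokens(text):
    """Return every run of digits or letters, in order of appearance."""
    return TOKEN_RE.findall(text)

# Assumed variant: keep the name intact, then the trailing digits, then the extension.
NAME_RE = re.compile(r"(.+?)(\d+)\.(\w+)")

def split_name(text):
    """Return (name, digits, extension), or None if the text has another shape."""
    match = NAME_RE.fullmatch(text)
    return match.groups() if match else None

print(split_tokens("john_doe01.jpg"))  # ['john', 'doe', '01', 'jpg']
print(split_name("john_doe01.jpg"))    # ('john_doe', '01', 'jpg')
```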
Related
I'm trying to figure out how to take a phrase and split it up into a list of separate strings based on the occurrence of certain words.
Examples are probably the easiest way to explain what I'm hoping to achieve:
List splitters = ['ABOVE', 'AT', 'NEAR', 'IN'];
INPUT: "ALFALFA DITCH IN ECKERT CO";
OUTPUT: ["ALFALFA DITCH", "IN ECKERT CO"];
INPUT: 'ANIMAS RIVER AT DURANGO, CO';
OUTPUT: ['ANIMAS RIVER', 'AT DURANGO, CO'];
INPUT: 'ALAMOSA RIVER ABOVE WILSON CREEK IN JASPER, CO';
OUTPUT: ['ALAMOSA RIVER', 'ABOVE WILSON CREEK IN JASPER, CO'];
Notice in the third example, when there are multiple occurrences of splitters in the input phrase, I only want to use the first one.
To my knowledge, the split() method doesn't support multiple strings, and I can't find a single example of this in Dart. I would think there is a simple solution?
I'd use a RegExp then:
var splitters = ['ABOVE', 'AT', 'NEAR', 'IN'];
var s = "ALFALFA DITCH IN ECKERT CO";
var splitterRE = RegExp(splitters.join('|'));
var match = splitterRE.firstMatch(s);
if (match != null) {
var partOne = s.substring(0, match.start).trimRight();
var partTwo = s.substring(match.start);
}
That does what you ask for, but it's slightly unsafe.
It will find "IN" in "BEHIND" if given "BEHIND THE FARM IN ALABAMA".
You likely want to match only complete words. In that case, RegExps are even more helpful, since they can do that too. Change the line to:
var splitterRE = RegExp(r'\b(?:' + splitters.join('|') + r')\b');
then it will only match entire words.
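A rough Python rendering of the whole-word version, as a sketch rather than a drop-in replacement for the Dart code. The function name is made up for illustration, and the re.escape call is an added precaution in case a splitter ever contains a regex metacharacter.

```python
import re

SPLITTERS = ["ABOVE", "AT", "NEAR", "IN"]

# \b on both sides makes sure only whole words are treated as splitters.
SPLITTER_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, SPLITTERS)) + r")\b")

def split_at_first_splitter(phrase):
    """Split the phrase in two at the first whole-word splitter, if any."""
    match = SPLITTER_RE.search(phrase)
    if match is None:
        return [phrase]
    return [phrase[:match.start()].rstrip(), phrase[match.start():]]

print(split_at_first_splitter("ALAMOSA RIVER ABOVE WILSON CREEK IN JASPER, CO"))
print(split_at_first_splitter("BEHIND THE FARM IN ALABAMA"))
```

The second call splits before the standalone "IN", not inside "BEHIND".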
I have a string containing 3 or 4 double numbers. What's the best way to extract them into an array of numbers?
First you have to find the numerals. You can use a RegExp pattern for that, say:
var doubleRE = RegExp(r"-?(?:\d*\.)?\d+(?:[eE][+-]?\d+)?");
Then you parse the resulting strings with double.parse. Something like:
var numbers = doubleRE.allMatches(input).map((m) => double.parse(m[0]!)).toList();
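The same two steps (find the numerals, then parse them) can be sketched in Python with the pattern from the answer. The sample input string is invented for illustration.

```python
import re

# Pattern from the answer: optional sign, optional integer part and dot,
# required digits, optional exponent.
DOUBLE_RE = re.compile(r"-?(?:\d*\.)?\d+(?:[eE][+-]?\d+)?")

def extract_doubles(text):
    """Find every numeral in the text and parse it as a float."""
    return [float(numeral) for numeral in DOUBLE_RE.findall(text)]

print(extract_doubles("x=1.5, y=-2, z=.25 w=3e2"))  # [1.5, -2.0, 0.25, 300.0]
```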
I need a regex that will only find matches where the entire string matches my query.
For instance if I do a search for movies with the name "Red October" I only want to match on that exact title (case insensitive) but not match titles like "The Hunt For Red October". Not quite sure I know how to do this. Anyone know?
Thanks!
Try the following regular expression:
^Red October$
By default, regular expressions are case sensitive, so enable your engine's case-insensitive option if you want to ignore case. The ^ marks the start of the matching text and $ the end.
Generally, and with default settings, ^ and $ anchors are a good way of ensuring that a regex matches an entire string.
A few caveats, though:
If you have alternation in your regex, be sure to enclose your regex in a non-capturing group before surrounding it with ^ and $:
^foo|bar$
is of course different from
^(?:foo|bar)$
Also, ^ and $ can take on a different meaning (start/end of line instead of start/end of string) if certain options are set. In text editors that support regular expressions, this is usually the default behaviour. In some languages, especially Ruby, this behaviour cannot even be switched off.
Therefore there is another set of anchors that are guaranteed to only match at the start/end of the entire string:
\A matches at the start of the string.
\Z matches at the end of the string or before a final line break.
\z matches at the very end of the string.
But not all languages support these anchors, most notably JavaScript.
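To make the alternation caveat concrete, here is a small Python sketch. It uses re.fullmatch, which is Python's own way of requiring that the whole string matches; the helper name and default title are assumptions for illustration.

```python
import re

def is_exact_title(candidate, title="Red October"):
    """True only if the whole candidate equals the title, ignoring case."""
    return re.fullmatch(re.escape(title), candidate, flags=re.IGNORECASE) is not None

# Without a group, ^ binds only to "foo" and $ only to "bar".
ungrouped = re.compile(r"^foo|bar$")
grouped = re.compile(r"^(?:foo|bar)$")

print(is_exact_title("red october"))               # True
print(is_exact_title("The Hunt For Red October"))  # False
print(ungrouped.search("food") is not None)        # True: "^foo" alone matched
print(grouped.search("food") is not None)          # False
```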
I know that this may be a little late to answer this, but maybe it will come handy for someone else.
Simplest way:
var someString = "...";
var someRegex = "...";
var match = Regex.Match(someString, someRegex);
if (match.Success && match.Value.Length == someString.Length) {
    // pass
} else {
    // fail
}
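A Python sketch of the same length comparison, with one caveat worth knowing: the first match found is not always the longest one, so the check can reject a string that the pattern could match in full. The function name is hypothetical.

```python
import re

def matches_whole_by_length(pattern, text):
    """Length check from the answer: the first match must span the whole text."""
    match = re.search(pattern, text)
    return match is not None and len(match.group(0)) == len(text)

print(matches_whole_by_length("Red October", "Red October"))               # True
print(matches_whole_by_length("Red October", "The Hunt for Red October"))  # False

# Caveat: with "a|ab" the search stops at "a", so the length check fails,
# while a dedicated whole-string match backtracks and succeeds.
print(matches_whole_by_length("a|ab", "ab"))    # False
print(re.fullmatch("a|ab", "ab") is not None)   # True
```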
Use the ^ and $ anchors to denote where the regex pattern sits relative to the start and end of the string:
Regex.Match("Red October", "^Red October$"); // pass
Regex.Match("The Hunt for Red October", "^Red October$"); // fail
You need to enclose your regex in ^ (start of string) and $ (end of string):
^Red October$
If the string may contain regex metacharacters (. { } ( ) $ etc.), I suggest using
^\QYourString\E$
\Q starts quoting all the characters until \E.
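Otherwise the regex may match the wrong text or even be invalid. Note that \Q...\E is supported by engines such as Perl, PCRE and Java, but not by .NET or JavaScript; in .NET, Regex.Escape does the same job.
If the language takes the regex as a string literal (as in the example), each backslash must be doubled: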
^\\QYourString\\E$
Hope this tip helps somebody.
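Python's re module does not support \Q...\E either, so the sketch below swaps in re.escape to get the same effect of treating the whole string literally. The function name and sample strings are illustrative assumptions.

```python
import re

def matches_literally(literal, candidate):
    """True only if candidate is exactly the literal text, metacharacters included."""
    return re.fullmatch(re.escape(literal), candidate) is not None

title = "Cost: $5.00 (approx.)"
print(matches_literally(title, title))          # True
print(re.fullmatch("a.c", "abc") is not None)   # True: unescaped "." matches any character
print(matches_literally("a.c", "abc"))          # False: escaped "." matches only a dot
```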
Sorry, but that's a little unclear.
From what I read, you want to do a simple string comparison. You don't need regex for that.
string myTest = "Red October";
bool isMatch = (myTest.ToLower() == "Red October".ToLower());
Console.WriteLine(isMatch);
isMatch = (myTest.ToLower() == "The Hunt for Red October".ToLower());
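The same regex-free comparison, sketched in Python. It uses str.casefold, which is Python's method for caseless matching; the function name is made up for illustration.

```python
def same_title(left, right):
    """Case-insensitive equality without any regex."""
    return left.casefold() == right.casefold()

print(same_title("Red October", "red october"))               # True
print(same_title("Red October", "The Hunt for Red October"))  # False
```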
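You can do it like this. For example, to require exactly one occurrence of a given lowercase letter (e in the pattern below), anchor the pattern and check it with myRegex.IsMatch(). As written, the pattern only accepts a three-character string with e in the middle: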
^[^e][e]{1}[^e]$
I need to replace substrings with other strings, and there is no replace function in the string module, only in re.
However, in order to use re:replace, I need to quote all regex-specific metacharacters such as [ and .
In OCaml, this function is called Str.quote.
val quote : string -> string
Str.quote s returns a regexp string that matches exactly s and nothing else.
from http://caml.inria.fr/pub/docs/manual-ocaml/libref/Str.html
What is this function called in Erlang?
Instead of quoting regexp special characters you should consider converting your string to a binary and using binary:replace/3,4.
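Found it in Elixir, where it is called Regex.escape. Converted to Erlang it looks like this (I still need to look into the unicode and return-binary flags). Note that inside an Erlang string literal the backslashes of the pattern have to be doubled:
escape(String) ->
    re:replace(String, "[.^$*+?()[{\\\\|\\s#]", "\\\\&", [global]).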
See Regex.escape/1 docs.
{:ok, pattern} = :re.compile(~S"[.^$*+?()[{\\\|\s#]", [:unicode])
@escape_pattern pattern
def escape(string) when is_binary(string) do
  :re.replace(string, @escape_pattern, "\\\\&", [:global, {:return, :binary}])
end
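To show what such an escape function does, here is a hedged Python sketch that mirrors the same character class and then uses it for a literal replacement. The function names are hypothetical, and in real Python code re.escape or plain str.replace would normally be used instead.

```python
import re

# Same set of characters as the pattern above: regex metacharacters,
# the backslash itself, whitespace and "#".
_SPECIAL = re.compile(r"[.^$*+?()\[{\\|\s#]")

def escape(text):
    """Put a backslash in front of every special character."""
    return _SPECIAL.sub(r"\\\g<0>", text)

def replace_literal(text, old, new):
    """Replace every literal occurrence of old; new is not treated as a template."""
    return re.sub(escape(old), lambda _match: new, text)

print(escape("a.b"))  # a\.b
print(replace_literal("price [a.b] and [axb]", "[a.b]", "X"))  # price X and [axb]
```

For plain literal replacement the regex-free route gives the same result, which is the point of the binary:replace suggestion above.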
I would like to use a regular expression to check whether my string has a format like the following:
mc_834faisd88979asdfas8897asff8790ds_oa_ids
mc_834fappsd58979asdfas8897asdf879ds_oa_ids
mc_834faispd8fs9asaas4897asdsaf879ds_oa_ids
mc_834faisd8dfa979asdfaspo97asf879ds_dv_ids
mc_834faisd111979asdfas88mp7asf879ds_dv_ids
mc_834fais00979asdfas8897asf87ggg9ds_dv_ids
The format is like mc_<random string>_oa_ids or mc_<random string>_dv_ids. How can I check if my string is in either of these two formats? And please explain the regular expression. Thank you.
That is, the string starts with mc_, ends with _oa_ids or _dv_ids, and has some random string in the middle.
P.S. The random string consists of letters and numbers.
What I tried (I have no clue how to check the random string):
/^mc_834faisd88979asdfas8897asff8790ds$_os_ids/
Try this.
^mc_[0-9a-z]+_(dv|oa)_ids$
^ matches at the start of the line the regex pattern is applied to.
[0-9a-z] matches lowercase letters and digits.
+ means that there should be one or more chars from this set.
(dv|oa) matches dv or oa.
$ matches at the end of the string the regex pattern is applied to. It also matches before the very last line break if the string ends with a line break.
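The pattern explained above can be tried out with a short Python sketch. It uses fullmatch in place of ^ and $, so a trailing line break is rejected as well; the function name is an assumption for illustration.

```python
import re

# mc_, then one or more lowercase letters or digits, then _oa_ids or _dv_ids.
ID_RE = re.compile(r"mc_[0-9a-z]+_(?:dv|oa)_ids")

def is_valid_id(text):
    """True if the whole text has the mc_<random string>_(oa|dv)_ids format."""
    return ID_RE.fullmatch(text) is not None

print(is_valid_id("mc_834faisd88979asdfas8897asff8790ds_oa_ids"))  # True
print(is_valid_id("mc_834faisd111979asdfas88mp7asf879ds_dv_ids"))  # True
print(is_valid_id("mc__oa_ids"))                                   # False
```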
Give /\Amc_\w*_(oa|dv)_ids\z/ a try. \A is the beginning of the string, \z the end. \w* is zero or more letters, numbers and underscores, and (oa|dv) is either oa or dv.
A nice and simple way to test Ruby Regexps is Rubular; you might want to have a look at it.
This should work
/mc_834([a-z,0-9]*)_(oa|dv)_ids/g
Example: http://regexr.com?2v9q7