I have one path in form of string like this Folder1/File.png
But in this string sometimes if file is hidden or folder is hidden I don't want it to be matched by my regex.
regex = %r{([a-zA-Z0-9_ -]*)\/[^.]+$}
input_path = "Folder_1/.file" # This shouldn't be matched.
input_path = "Folder/file.png" # This should be matched.
But my regex works for first input but its not even matching second one.
You are currently looking for \/[^.]+$, that is a / followed by any character except . until the end. Since the filename+extension format has a . character, it fails to match the second case.
Instead of using [^.]+$, check only that the character following / is not ., and match everything after that:
([a-zA-Z0-9_ -]*)\/[^.].*$
While there are some suggestions here that work, my suggestion would be
\/[^.][^\/\n]+$
It finds a slash, followed by anything but a dot, which in turn is followed by one, or more, of anything but a slash or a newline.
To handle the two lines given as an example,
Folder_1/.file
Folder/file.png
it takes 8 steps.
The suggested ones all work, but ([a-zA-Z0-9_ -]*)\/[^.] takes 75 steps, ([a-zA-Z0-9_ -]*)\/[^.]+\.[^.]+\z 78 steps and ([a-zA-Z0-9_ -]*)\/[^.].*$ takes 77 steps.
This may be totally irrelevant and I may have missed some angle, but I wanted to mention it ;)
Se it here at regex101.
regex = %r{([a-zA-Z0-9_ -]*)\/[^.]}
Related
Hi I've been struggling with this for the last hour and am no closer. How exactly do I strip everything except numbers, commas and decimal points from a rails string? The closest I have so far is:-
rate = rate.gsub!(/[^0-9]/i, '')
This strips everything but the numbers. When I try add commas to the expression, everything is getting stripped. I got the aboves from somewhere else and as far as I can gather:
^ = not
Everything to the left of the comma gets replaced by what's in the '' on the right
No idea what the /i does
I'm very new to gsub. Does anyone know of a good tutorial on building expressions?
Thanks
Try:
rate = rate.gsub(/[^0-9,\.]/, '')
Basically, you know the ^ means not when inside the character class brackets [] which you are using, and then you can just add the comma to the list. The decimal needs to be escaped with a backslash because in regular expressions they are a special character that means "match anything".
Also, be aware of whether you are using gsub or gsub!
gsub! has the bang, so it edits the instance of the string you're passing in, rather than returning another one.
So if using gsub! it would be:
rate.gsub!(/[^0-9,\.]/, '')
And rate would be altered.
If you do not want to alter the original variable, then you can use the version without the bang (and assign it to a different var):
cleaned_rate = rate.gsub!(/[^0-9,\.]/, '')
I'd just google for tutorials. I haven't used one. Regexes are a LOT of time and trial and error (and table-flipping).
This is a cool tool to use with a mini cheat-sheet on it for ruby that allows you to quickly edit and test your expression:
http://rubular.com/
You can just add the comma and period in the square-bracketed expression:
rate.gsub(/[^0-9,.]/, '')
You don't need the i for case-insensitivity for numbers and symbols.
There's lots of info on regular expressions, regex, etc. Maybe search for those instead of gsub.
You can use this:
rate = rate.gsub!(/[^0-9\.\,]/g,'')
Also check this out to learn more about regular expressions:
http://www.regexr.com/
I have been coding a program in Lua that automatically formats IRC logs from a roleplay. In the roleplay logs there is a specific guideline for "Out of character" conversation, which we use double parentheses for. For example: ((<Things unrelated to roleplay go here>)). I have been trying to have my program remove text between double brackets (and including both brackets). The code is:
ofile = io.open("Output.txt", "w")
rfile = io.open("Input.txt", "r")
p = rfile:read("*all")
w = string.gsub(p, "%(%(.*?%)%)", "")
ofile:write(w)
The pattern here is > "%(%(.*?%)%)" I've tried multiple variations of the pattern. All resulted in fruitless results:
1. %(%(.*?%)%) --Wouldn't do anything.
2. %(%(.*%)%) --Would remove *everything* after the first OOC message.
Then, my friend told me that prepending the brackets with percentages wouldn't work, and that I had to use backslashes to 'escape' the parentheses.
3. \(\(.*\)\) --resulted in the output file being completely empty.
4. (\(\(.*\)\)) --Same result as above.
5. (\(\(.*?\)\) --would for some reason, remove large parts of the text for no apparent reason.
6. \(\(.*?\)\) --would just remove all the text except for the last line.
The short, absolute question:
What pattern would I need to use to remove all text between double parentheses, and remove the double parentheses themselves too?
You're friend is thinking of regular expressions. Lua patterns are similar, but different. % is the correct escape character.
Your pattern should be %(%(.-%)%). The - is similar to * in that it matches any number of the preceding sequence, but while * tries to match as many characters as it can (it's greedy), - matches the least amount of characters possible (it's non-greedy). It won't go overboard and match extra double-close-parenthesis.
I can't figure out what does this regex match:
A: "\\/\\/c\\/(\\d*)"
B: "\\/\\/(\\d*)"
I suppose they are matching some kind of number sequence since \d matches any digit but I'd like to know an example of a string that would be a match for this regex.
The pattern syntax is that specified by ICU. Expressions are created with NSRegularExpression in an iOS app and are correct.
The first matches //c/ + 0 or more digits. The second matches // + 0 or more digits. In both the digits are captured.
An example of a match for A) is //c/123
An example of a match for B) is //12345
When I use Cygwin which emulates Bash on Windows, I sometimes run into situations where I have to escape my escape characters which is what I think is making this expression look so weird. For instance, when I use sed to look for a single '\' I sometimes have to write it as '\\\\'. (Funny, StackOverflow proved my point. If you write 4 backslashes in the comment, it only shows two. So if you process it again, they might all disappear depending on your situation).
Considering this, it might be helpful to think of pairs of backslashes as representing only one if you're coming from a similar situation. My guess would be you are. Because of this I would say Erik Duymelinck is probably spot on. This will capture a sequence of digits that may or may not follow a couple slashes and a c:
//c/000
//00000
This regex matches an odd sequence of characters, which, at first glance, almost seem like a regex, since \d is a digit, and followed by an asterisk (\d*) would mean zero-or-more digits. But it's not a digit, because the escape-slash is escaped.
\\/\\/c\\/(\\d*)
So, for instance, this one matches the following text:
\/\/c\/\
\/\/c\/\d
\/\/c\/\dd
\/\/c\/\ddd
\/\/c\/\dddd
\/\/c\/\ddddd
\/\/c\/\dddddd
...
This one is almost the same
\\/\\/(\\d*)
except you just delete the c\/ from the above results:
\/\/\
\/\/\d
\/\/\dd
\/\/\ddd
\/\/\dddd
\/\/\ddddd
\/\/\dddddd
...
In both cases, the final \ and optional d is [capture group][1] one.
My first impression was that these regexes were intended for escaping in Java strings, meaning they would be completely invalid. If the were escaped for Java strings, such as
Pattern p = Pattern.compile("\\/\\/c\\/(\\d*)");
It would be invalid, because after un-escaping, it would result in this invalid regex:
\/\/c\/(\d*)
The single escape-slashes (\) are invalid. But the \d is valid, as it would mean any digit.
But again, I don't think they're invalid, and they're not escaped for a Java string. They're just odd.
text.scan(/\"[\d\w\s\+\-\*\/]*\"/)
I'm simply looking to find any thing within quotations that can contain letters, numbers, spaces, plus, minus, star, or forward slash. Everything works great in console. Each of the following works in a browser:
"abc"
"123"
"x-1" or "x - 1"
"x/1" or "x / 1"
But the plus sign and star fail in a browser (despite working fine in console with the same regex). Does anyone have any ideas?
Edit #1: I'm performing a quick gsub to add some formatting to the results of the scan. If the quotations have a plus or star in them, they don't even get picked up by the scan. The same code and text pasted in console works just fine.
Edit #2: I figured out a better way to frame this question without extraneous details and got the answer. "Why can't I perform a gsub on each of the results from a scan if the result contains regex special characters?"
Turned out that this problem was related to regexp string insertion (/#{whatever}/) not escaping special characters - manually escaping clears it up (/#{Regexp.escape(whatever)}/). See this question for a full example/explanation.
I don't know what do you mean "work in browser" but I'm making an assumption that you're trying to parse an URL. In URL the + & * signs can be converted to %2B & %2A respectively.
Try this regexp:
/"[(\d\w\s\+\-\*\/|%2B|%2A)]+"/
...or decode URL before parsing.
How do a I create a validator, which has these simple rules. An expression is valid if
it must start with a letter
it must end with a letter
it can contain a dash (minus sign), but not at start or end of the expression
^[a-zA-Z]+-?[a-zA-Z]+$
E.g.
def validate(whatever)
reg = /^[a-zA-Z]+-?[a-zA-Z]+$/
return (reg.match(whatever)) ? true : false;
end
/^[A-Za-z]+(-?[A-Za-z]+)?$/
this seems like what you want.
^ = match the start position
^[A-Za-z]+ = start position is followed by any at least one or more letters.
-? = is there zero or one hyphens (use "*" if there can be multiple hyphens in a row).
[A-Za-z]+ = hyphen is followed by one or more letters
(-?[A-Za-z]+)? = for the case that there is a single letter.
$= match the end position in the string.
xmammoth pretty much got it, with one minor problem. My solution is:
^[a-zA-Z]+\-?[a-zA-Z]+$
Note that the original question states, it can contain a dash. The question mark is needed after the dash to make sure that it is optional in the regex.
^[A-Za-z].*[A-Za-z]$
In other words: letter, anything, letter.
Might also want:
^[A-Za-z](.*[A-Za-z])?$
so that a single letter is also matched.
What I meant, to be able to create tags. For example: "Wild-things" or "something-wild" or "into-the-wild" or "in-wilderness" "my-wild-world" etc...
This regular expression matches sequences, that consist of one or more words of letters, that are concatenated by dashes.
^[a-zA-Z]+(?:-[a-zA-Z]+)*$
Well,
[A-Za-z].*[A-Za-z]
According to your rules that will work. It will match anything that:
starts with a letter
ends with a letter
can contain a dash (among everything else) in between.