I have never written a complex regular expression before, and what I need seems to be (at least) a bit complicated.
I need a Regex to find matches for the following:
"On Fri, Jan 16, 2015 at 4:39 PM"
Where On will always be there;
then 3 characters for week day;
, is always there;
space is always there;
then 3 characters for month name;
space is always there;
day of month (one or two numbers);
, is always there;
space is always there;
4 numbers for year;
space, at, space are always there;
time (have to match 4:39 as well as 10:39);
space and 2 caps letters for AM or PM.
Here's a very simple and readable one:
/On \w{3}, \w{3} \d{1,2}, \d{4} at \d{1,2}:\d{2} [AP]M/
See it on rubular
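As a quick check, here is a minimal Ruby sketch that runs the pattern above against a few lines. Only the first sample comes from the question; the other two are made up for illustration, and HEADER_RE is just a name chosen here.

```ruby
# Pattern from the answer above: fixed words, 3-character day and month,
# 1-2 digit day, 4-digit year, 1-2 digit hour, 2-digit minute, AM/PM.
HEADER_RE = /On \w{3}, \w{3} \d{1,2}, \d{4} at \d{1,2}:\d{2} [AP]M/

# Only the first line is from the question; the others are hypothetical.
samples = [
  "On Fri, Jan 16, 2015 at 4:39 PM",
  "On Mon, Feb 2, 2015 at 10:39 AM",
  "On Friday, Jan 16, 2015 at 4:39 PM" # day name too long, so no match
]

samples.each do |line|
  puts "#{line.inspect} => #{HEADER_RE.match?(line)}"
end
```

Note that \w also accepts digits and underscores, so this checks the shape of the string only, not whether the day or month names are real.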
Try this:
On\s+(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}, \d{4} at \d{1,2}:\d{2} (?:AM|PM)
/On \w{3}, \w{3} \d{1,2}, \d{4} at \d{1,2}:\d{1,2} [A-Z]{2}/
# \w{3} for 3 word characters
# \d{1,2} for 1 or 2 digits
# \d{4} for 4 digits
# [A-Z]{2} for 2 capital letters
You could try the regex below. It does not validate the month name, day name, or date, and its time part matches only the literal times 10:39 and 4:39.
^On\s[A-Z][a-z]{2},\s[A-Z][a-z]{2}\s\d{1,2},\s\d{4}\sat\s(?:10|4):39\s[AP]M$
DEMO
You can use Rubular to construct and test Ruby Regular Expressions.
I have put together an Example: http://rubular.com/r/45RIiwheqs
Since it looks like you are trying to parse dates, you should use Date.strptime.
/On [A-Za-z]{3}, [A-Za-z]{3} \d{1,2}, \d{4} at \d{1,2}:\d{1,2}/g
The way you describe the problem makes me think that the format will always be preserved.
In that case I would use the Time.strptime function (from the time standard library), passing the format string
format = "On %a, %b %d, %Y at %I:%M %p"
Time.strptime("On Fri, Jan 16, 2015 at 4:39 PM", format)
which is more readable than a regexp (in my opinion) and has the added value that it returns a Time object, which is easier to use than a regexp match, in case you need to perform other time-based calculations.
Another good thing is that if the string contains an invalid date (like "On Mon, Jan 59, 2015 at 37:99 GX" ) the parse function will raise an exception, so that validation is done for free for you.
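A minimal sketch of that idea, assuming the header always follows the format from the question. The format string, the FORMAT constant, and the parse_header helper are assumptions made for this illustration, not part of any library.

```ruby
require "time" # Time.strptime lives in the "time" standard library

# Assumed format string for the sample header; %I is the 12-hour clock hour.
FORMAT = "On %a, %b %d, %Y at %I:%M %p"

# parse_header is a hypothetical helper name for this sketch.
# Returns a Time, or nil when the text does not fit the format.
def parse_header(text)
  Time.strptime(text, FORMAT)
rescue ArgumentError
  nil
end

t = parse_header("On Fri, Jan 16, 2015 at 4:39 PM")
puts t.hour # 16, since 4 PM is converted to the 24-hour clock
puts t.min  # 39

# The invalid string from the answer above is rejected.
puts parse_header("On Mon, Jan 59, 2015 at 37:99 GX").inspect # nil
```

Rescuing ArgumentError and returning nil is one design choice; letting the exception propagate works just as well if you want invalid input to fail loudly.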
Cisco IOS routers, doing a "dir", and I want to grab all file names with ".bin" in the name.
Example string:
Directory of flash0:/
1 -rw- 95890300 May 24 2015 11:27:22 +00:00 c2900-universalk9-mz.SPA.153-3.M5.bin
2 -rw- 68569216 Feb 8 2019 20:15:26 +00:00 c3900e-universalk9-mz.SPA.151-4.M10.bin
3 -rw- 46880 Oct 25 2017 19:08:56 +00:00 pdcamadeusrtra-cfg
4 -rw- 600 Feb 1 2019 19:36:44 +00:00 vlan.dat
260153344 bytes total (95637504 bytes free)
I've figured out how to pull "bin", but I can't figure out how to pull the whole filename (starting with " c", ending in "bin"), because I want to then use the values and delete unwanted files.
I'm new to programming, so the regex examples are a little confusing.
You can use this regex
^[\w\W]+?(?=(c.*\.bin))\1$
^ - Start of string.
[\w\W]+? - Match anything one or more times (lazy mode).
(?=(c.*\.bin)) - Positive lookahead: match c, followed by anything, followed by a literal .bin (group 1).
\1 - Match group 1.
$ - End of string.
Demo
To match a filename as a whole token (preceded by whitespace or at the start of the string), you might use a negative lookbehind (?<!\S), which asserts that what is on the left is not a non-whitespace character.
Then match either one or more non-whitespace characters with \S+, or list the allowed characters in a character class such as [\w.-]+. After that, match a literal dot \. followed by bin.
At the end you might use a word boundary \b to prevent bin from being part of a larger word:
(?<!\S)[\w.-]+\.bin\b
regex101 demo
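To see what this pattern picks out of the listing in the question, here is a small sketch. It uses Ruby purely for illustration (the question itself is about Ansible), and the variable names are chosen here.

```ruby
# Directory listing copied from the question.
listing = <<~TEXT
  Directory of flash0:/
  1 -rw- 95890300 May 24 2015 11:27:22 +00:00 c2900-universalk9-mz.SPA.153-3.M5.bin
  2 -rw- 68569216 Feb 8 2019 20:15:26 +00:00 c3900e-universalk9-mz.SPA.151-4.M10.bin
  3 -rw- 46880 Oct 25 2017 19:08:56 +00:00 pdcamadeusrtra-cfg
  4 -rw- 600 Feb 1 2019 19:36:44 +00:00 vlan.dat
  260153344 bytes total (95637504 bytes free)
TEXT

# (?<!\S)  : not preceded by a non-whitespace character
# [\w.-]+  : letters, digits, underscore, dot or hyphen
# \.bin\b  : literal ".bin" that is not part of a longer word
bin_files = listing.scan(/(?<!\S)[\w.-]+\.bin\b/)

puts bin_files
```

Because scan returns every match, this yields both .bin filenames in one call rather than only the first.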
Thank you Code Maniac!
Your code finds one instance, and I needed to find all. Using what you gave me plus messing around with some other examples, I found this to work:
binfiles="{{ dir_response.stdout[0] | regex_findall('\b(?=(c.*.bin))\b') }}"
Now I get this:
TASK [set_fact] ********************************************************************************************************
task path: /export/home/e130885/playbooks/ios-switch-upgrade/ios_clean_flash.yml:16
Tuesday 12 February 2019 08:29:58 -0600 (0:00:00.350) 0:00:03.028 ******
ok: [10.35.91.200] => changed=false
ansible_facts:
binfiles:
- c2900-universalk9-mz.SPA.153-3.M5.bin
- c3900e-universalk9-mz.SPA.151-4.M10.bin
- c2800nm-adventerprisek9-mz.151-4.M12a.bin
Onto the next task of figuring out how to use each element. Thank you!
I couldn't really clarify what I'm asking in the title. I have an integer for a day and a month. I have to print the month with a 0 in front of it if it's one digit only.
For example 04 if month = 4 and so on.
This is how it's supposed to look like in C#:
Console.WriteLine("{0}.{1:00}", day, month);
Thank you.
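import java.text.DecimalFormat;
int month = 4;
// The pattern "00" pads the number with leading zeros to at least two digits
DecimalFormat formatter = new DecimalFormat("00");
String monthFormatted = formatter.format(month); // "04"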
Besides the answer Fernando Lahoz provided (which is specific to your case: decimal formatting), you can also use System.out.format in Java, which lets you specify a format string while printing to System.out (the format method is available on any PrintStream). In your case
System.out.format("%d.%02d", day, month);
should do the trick. %d is used for decimal integers; a width placed just before the d (2 in your case) sets the minimum field width, and the 0 flag in front of the width pads with zeros instead of spaces.
If you want to access the string formed for later use and not (only) print it you can use String.format. It uses the same format as System.out.format but returns the String that is formed.
A complete syntax for all formats(string, decimal, floating point, calendar, date/time, ...) can be found here.
If you'd like a quick tutorial on number formatting, you can check this link or this link instead.
Good luck!
I would like to check whether a year was found within a string. Something like
if string.scan(/\d{4}/).first == TRUE
for example a string looks like "there were 3 earthquakes in 2007"
Any suggestions?
If you want to match standalone 4 digit string, you may consider a regex with word boundaries:
!('It is 2016 now.' =~ /\b\d{4}\b/).nil? # => true
or - a more real world sample usage:
if string =~ /\b\d{4}\b/
The \b\d{4}\b pattern matches any 4 digits that are neither preceded nor followed by word characters (digits, letters or underscore), so there will be no match in 02312345.
Also, in case you want to restrict matches to years in the 20th or 21st century, you may use the /\b(?:19|20)\d{2}\b/ regex.
To extract the digits, use s[/\b\d{4}\b/].
'It was in 2015/16.'[/\b\d{4}\b/] # => "2015"
See the Ruby demo
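Putting these pieces together, a minimal sketch might look like the following. The helper name year_in? is hypothetical, and String#match? assumes Ruby 2.4 or later.

```ruby
# year_in? is a hypothetical helper name for this sketch.
# True when the text contains a standalone run of exactly four digits.
def year_in?(text)
  text.match?(/\b\d{4}\b/)
end

sentence = "there were 3 earthquakes in 2007"

puts year_in?(sentence)                # true
puts year_in?("order number 02312345") # false: the digits are part of a longer run
puts sentence[/\b\d{4}\b/]             # 2007
```

Using match? returns a plain boolean, which avoids comparing a match position or a string against true as in the original attempt.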
I'm having trouble with semantic predicates in ANTLR 4. My grammar is syntactically ambiguous, and needs to look ahead one token to resolve the ambiguity.
As an example, I want to parse "Jan 19, 2012 until 9 pm" as the date "Jan 19, 2012" leaving parser's next token at "until". And I want to parse "Jan 19, 7 until 9 pm" as the date "Jan. 19" with parser's next token at "7".
So I need to look at the 3rd token and either take it or leave it.
My grammar fragment is:
date
: month d=INTEGER { isYear(getCurrentToken().getText())}? y=INTEGER
{//handle date, use $y for year}
| month d=INTEGER {//handle date, use 2013 for year}
;
When the parser runs on either sample input, I get this message:
line 1:9 rule date failed predicate: { isYear(getCurrentToken().getText())}?
It never gets to the 2nd rule alternative, because (I'm guessing) it's already read one extra token.
Can someone show me how to accomplish this?
In parser rules, ANTLR 4 only uses predicates on the left edge when making a decision. Inline predicates like the one you showed above are only validated.
The following modification will cause ANTLR to evaluate the predicate while it makes the decision, but obviously you'll need to modify it to use the correct lookahead token instead of calling getCurrentToken().
date
: {isYear(getCurrentToken().getText())}? month d=INTEGER y=INTEGER
{//handle date, use $y for year}
| month d=INTEGER {//handle date, use 2013 for year}
;
PS: If month is always exactly one token long, then _input.LT(3) should provide the token you want.
I would like to return the local time as string but with leading zeros. I tried this:
{{Year, Month, Day}, {Hour, Minute, Second}} = erlang:localtime().
DateAsString = io_lib:format("~2.10.0B~2.10.0B~4.10.0B~2.10.0B~2.10.0B~2.10.0B",
[Month, Day, Year, Hour, Minute, Second]).
But if any of the components is a single digit, the returned string is:
[["0",57],"29","2011","17","33","34"]
The current month 9 is printed as ["0",57].
Please, help.
Thank you.
Try:
1> lists:flatten([["0",57],"29","2011","17","33","34"]).
"09292011173334"
io_lib:format/2 (and its companion io:format/2) actually returns a deep IO list. Such a list is printable and can be sent on a socket or written to a file just like a flat string, but is more efficient to produce. Flattening is often unnecessary, because in all cases where the string will be printed or output to a file/socket it will automatically be flattened by Erlang.
You want to be using something like this:
DateAsString = io_lib:format("~2..0w~2..0w~4..0w~2..0w~2..0w~2..0w",
[Month, Day, Year, Hour, Minute, Second]).
The more common w format modifier does the right thing here, what with base and such, so there's no need to use the more complex B modifier. 2..0 says "2 characters wide, zero padded, no precision specified." We don't need precision here, since we're dealing with integers.