What will the comprehension equivalent be for the following snippet of code?

for sentence in paragraph:
    for word in sentence.split():
        Single_word_list.append(word)

The equivalent list comprehension is:
[Single_word_list.append(word) for sentence in paragraph for word in sentence.split()]
This arguably suffers in readability compared to nested for-loops.

[[Single_word_list.append(word) for word in sentence.split()] for sentence in paragraph]
Check out the documentation here
Edit: Here it is without a nested list comprehension:
[Single_word_list.append(word) for sentence in paragraph for word in sentence.split()]
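A comprehension already builds a list, so calling append inside it only produces a throwaway list of None values. Here is a minimal sketch of the same flattening without the side effect, assuming paragraph is an iterable of sentence strings (the sample data is made up for illustration):

```python
# Hypothetical sample data: a paragraph as a list of sentence strings.
paragraph = ["the quick brown fox", "jumps over", "the lazy dog"]

# The for-clauses appear in the same order as the nested loops:
# outer loop (sentence) first, inner loop (word) second.
single_word_list = [word for sentence in paragraph for word in sentence.split()]

print(single_word_list)
```

If the target list already exists and must be extended in place, Single_word_list.extend(...) with the same expression would probably be the closer match to the original loop.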


Xtext define number of occurrence

Being new to Xtext I would like to know how to define the upper and lower boundaries regarding the occurrence of a letter.
I know of the following expressions / operators
exactly one (the default, no operator)
one or none (operator ?)
any (zero or more, operator *)
one or more (operator +)
Given the examples
<IS123A4>
<IS12>
<ISB123455>
How do I write a grammar rule stating that 1 to 25 alphanumeric characters may appear after "IS"?
Currently, I have
`terminal ISCONCEPTAME : '<IS' ALPHANUM ALPHANUM? ALPHANUM? ALPHANUM?.....'>';`
`terminal ALPHANUM: ('a'..'z'|'A'..'Z'|'_'|INT);`
However, I am not sure if it is the right way to do it. I was thinking about something like
`terminal ISCONCEPTAME : '<IS' ALPHANUM{1,25} '>';`
Thanks for any input!
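To pin down the intended constraint independently of Xtext, here is a small Python sketch of the "1 to 25 characters" rule as an ordinary regular expression. This is not Xtext syntax and says nothing about whether Xtext accepts a {1,25} cardinality. It also assumes ALPHANUM means a single letter, digit, or underscore, which is narrower than the INT alternative in the terminal above.

```python
import re

# Assumption: one ALPHANUM is a single letter, digit, or underscore.
# {1,25} expresses the lower and upper bound on the repetition.
IS_CONCEPT = re.compile(r"<IS[A-Za-z0-9_]{1,25}>")

# The three examples from the question plus two that should be rejected:
# an empty body and a body of 26 characters.
samples = ["<IS123A4>", "<IS12>", "<ISB123455>", "<IS>", "<IS" + "A" * 26 + ">"]

results = {s: IS_CONCEPT.fullmatch(s) is not None for s in samples}
for sample, accepted in results.items():
    print(sample, accepted)
```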

Prawn PDF number_with_delimiter number_to_currency? [duplicate]

As the title says, I would like to write a regular expression that matches a string of digits separated by a comma every three digits. The length of the string can vary.
I am still pretty new to regular expressions, so can anyone help me with that? Thanks a lot in advance.
P.S.
Could anyone also suggest some good resources (websites, books, etc.) for learning regular expressions?
This regex shall match that:
\d{1,3}(?:,\d{3})*
If you want to avoid matching a substring of an ill-formed number, you might want to use:
(?:\A|[^,\d])(\d{1,3}(?:,\d{3})*)(?:\z|[^,\d])
Explanation of the first regex
\d{1,3} 1 to 3 consecutive numerals
,\d{3} A comma followed by 3 consecutive numerals
(?:,\d{3})* Zero or more repetitions of a non-capturing group of a comma followed by 3 consecutive numerals
Explanation of the second regex
(?:\A|[^,\d]) A non-capturing group of either the beginning of the string, or anything other than comma or numeral
(\d{1,3}(?:,\d{3})*) A capturing group of 1 to 3 consecutive numerals followed by zero or more repetitions of a non-capturing group of a comma followed by 3 consecutive numerals
(?:\z|[^,\d]) A non-capturing group of either the end of the string, or anything other than comma or numeral
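For anyone who wants to try these patterns, here is a rough Python sketch. The first pattern is used unchanged with fullmatch to validate a whole string. For finding numbers inside longer text, the sketch swaps in lookarounds instead of the \A / \z groups of the second regex; that is a substitution for illustration, not the same expression, and the names below are hypothetical.

```python
import re

# First pattern, applied to the whole string.
GROUPED = re.compile(r"\d{1,3}(?:,\d{3})*")

# Lookaround variant (swapped in for the second regex): the number must not
# be preceded or followed by a comma or a digit.
EMBEDDED = re.compile(r"(?<![,\d])\d{1,3}(?:,\d{3})*(?![,\d])")


def is_grouped_number(s):
    """Return True if the entire string is a comma-grouped number."""
    return GROUPED.fullmatch(s) is not None


found = EMBEDDED.findall("totals: 1,234,567 and 12,34 and 1234 and 999")
print(found)  # ill-formed "12,34" and ungrouped "1234" are skipped
```

Because the lookarounds consume no characters, findall returns the numbers themselves without the surrounding separator characters.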
Try http://regexlib.com for good examples and links to tools to help you get up to speed with RegEx
Also try this regex tester app http://www.ultrapico.com/Expresso.htm
And another tool I've used before here http://osherove.com/tools

Elixir/Erlang split bitstring on newlines?

Is there a way to split a bitstring loaded from a file on newlines? I have something like this:
A line of text
Additional line of text
And another line
And I want an array like this:
["A line of text",
"Additional line of text",
"And another line"]
Is there a function to split the text on newlines to produce something like this array?
Thanks in advance.
In addition to Robert's answer:
In Elixir you can use: String.split(string, "\n")
Look at the String module.
Look at binary:split/2 and binary:split/3 in the binary module. For example: binary:split(String, <<"\n">>).
If you simply split a string on \n, there are some serious portability problems. This is because many systems use \n, a few (such as older Macs) use \r, and Windows uses \r\n to delimit lines.
The safer way to do it would be to use a regex that matches any of the three possibilities above: String.split(str, ~r{(\r\n|\r|\n)})
While Mark is right about the portability problems, the regex he provided has a typo in it and as a result doesn't work for \r\n sequences. Here's a simpler version that handles all 3 cases:
iex(13)> String.split("foo\nbar", ~r/\R/)
["foo", "bar"]
iex(14)> String.split("foo\rbar", ~r/\R/)
["foo", "bar"]
iex(15)> String.split("foo\r\nbar", ~r/\R/)
["foo", "bar"]
I recently ran into a situation where the solution in my other answer, and basically any other solution depending on regular expressions, was in some situations much slower than a binary split, especially when limiting the number of parts the string gets split into. You can see https://github.com/CrowdHailer/server_sent_event.ex/pull/11 for a more detailed analysis and a benchmark.
You can use :binary.split/3 even when targeting different types of new line characters:
iex(1)> "aaa\rbbb\nccc\r\nddd" |> :binary.split(["\r", "\n", "\r\n"], [:global])
["aaa", "bbb", "ccc", "ddd"]
As you can see in the above example, the match is greedy: \r\n takes precedence over splitting on \r first and then on \n.
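The same precedence question comes up with regex alternation in other languages. As a small Python sketch (not Elixir, and only to illustrate the ordering issue): Python's re module tries alternatives left to right, so the two-character \r\n has to be listed before the single characters.

```python
import re

text = "aaa\rbbb\nccc\r\nddd"

# \r\n is listed first so it is consumed as one separator.
lines = re.split(r"\r\n|\r|\n", text)
print(lines)

# With the single characters first, \r and \n each match on their own and
# an empty string appears between them.
wrong = re.split(r"\r|\n|\r\n", "foo\r\nbar")
print(wrong)
```

str.splitlines() is another option in Python, though it also splits on a few extra characters such as form feed, so it is not a drop-in equivalent.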

Ruby - how to get rid of the last element in a string according to the following pattern?

I have strings like these:
A regular sentence.
A regular sentence (United Kingdom).
A regular sentence (UK).
The goal is to remove the term in the brackets, thus the desired output would be:
A regular sentence.
A regular sentence.
A regular sentence.
How can I achieve this in Ruby (probably using regular expressions)?
Thank you
This should work:
string.gsub(/\s*\(.*\)/, '')
"A regular sentence (UK).".gsub(/\(.*\)/,"").strip #=> "A regular sentence ."
In case the sentence itself can contain parentheses:
a = "A (very) regular sentence (UK)."
p a.gsub(/\s\([^()]*\)(?=\.\Z)/, '') #=> "A (very) regular sentence."
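For comparison, the last pattern carries over to Python almost unchanged. This is only a sketch: the function name is hypothetical, and \s* is used instead of \s so that a missing space before the bracket is tolerated too.

```python
import re

# Remove a parenthesized term only when it sits directly before the final
# period; [^()]* keeps the match from spanning earlier parentheses.
TRAILING_PARENS = re.compile(r"\s*\([^()]*\)(?=\.\Z)")


def strip_trailing_parens(sentence):
    """Return the sentence without a trailing '(...)' term."""
    return TRAILING_PARENS.sub("", sentence)


for s in ["A regular sentence.",
          "A regular sentence (United Kingdom).",
          "A (very) regular sentence (UK)."]:
    print(strip_trailing_parens(s))
```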

Language parsing in Clojure with line numbers

I have a very simple language. A function is defined as some number of comments (indicated by the line starting with a semicolon) followed by a function name (a word followed by parens), followed by anything else, and ending with a "q". Here is a parse-ez function:
(defn routine []
(multi* (regex #";.*")
(regex #"(\w+)\(.*\).*" 1)
(multi* (regex #"[^q].*"))
(regex #"q.*"))
This works, but I want to return the line numbers on which the different patterns match. Is there a way to do this or do I need to write my own parser?
As it stands right now my language is simple enough that writing a new parser wouldn't matter too much, but it will limit me as complexity increases.
There is a "line-pos" function in parse-ez. Can't you use that?
line-pos doc:
"Returns [line column] vector representing the current cursor position
of the parser"
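If a hand-written parser turns out to be necessary after all, tracking line numbers is not much work for a line-oriented language like this one. Below is a rough Python sketch of the idea, not parse-ez and not Clojure: it classifies each line with the regexes from the question and records the 1-based line number. The kind names, the function name, and the sample input are all made up for illustration.

```python
import re

# Hypothetical kind names paired with the patterns from the question.
PATTERNS = [
    ("comment", re.compile(r";.*")),
    ("header", re.compile(r"(\w+)\(.*\).*")),
    ("end", re.compile(r"q.*")),
]


def classify_lines(source):
    """Return a list of (line_number, kind, text), numbering from 1."""
    result = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        kind = "body"  # anything that matches no pattern is body text
        for name, pattern in PATTERNS:
            if pattern.fullmatch(line):
                kind = name
                break
        result.append((line_no, kind, line))
    return result


sample = "; adds two numbers\nadd(a, b)\n set x\nq"
for entry in classify_lines(sample):
    print(entry)
```

Unlike the grammar above, this sketch does not enforce the order comment, header, body, end; it only shows where the line numbers would come from.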
