Erlang: Strange chars in a generated list - erlang

Trying to generate a list through comprehension and at some point I start seeing strange character strings. Unable to explain their presence at this point (guessing the escape chars to be ASCII codes - but why?):
45> [[round(math:pow(X,2))] ++ [Y]|| X <- lists:seq(5,10), Y <- lists:seq(5,10)].
[[25,5],
[25,6],
[25,7],
[25,8],
[25,9],
[25,10],
[36,5],
[36,6],
[36,7],
"$\b","$\t","$\n",
[49,5],
[49,6],
[49,7],
"1\b","1\t","1\n",
[64,5],
[64,6],
[64,7],
"#\b","#\t","#\n",
[81,5],
[81,6],
[81,7],
"Q\b",
[...]|...]

In Erlang all strings are just list of small integers (like chars in C). And shell to help you out a little tries to interpret any list as printable string. So what you get are numbers, they are just printed in a way you would not expect.
If you would like to change this behaviour you can look at this answer.

Related

wxMaxima: how to use texput to tell tex1 how to handle strings?

tex1() seems to return all strings as follow:
tex1(hello);
{\it hello}
tex1("hello");
\mbox{ hello }
What variable must one use to change this handling via texput? e.g. if I would just like it to print strings literally? I'm using other Maxima commands (like printf and concat to produce strings that are then passed to tex1, and occasionally the default handling is causing issues.
I tried texput(""", ...) and texput("''", ...); the first wasn't accepted, the 2nd was, but did not change the output. I really have no clue for the non-quoted strings.
Let's be careful to distinguish symbols from strings. When you enter tex1(hello) then hello is a symbol, and when you enter tex1("hello") then "hello" is a string. Symbols are essentially names for items in a lookup table, which can store additional info (symbol properties) for each. Strings on the other hand are just (from Maxima's point of view) just a sequence of characters.
Anyway changing the output for all symbols or all strings is unfortunately not possible via texput. But with a one-line Lisp function, one can accomplish it. Try this: for symbols,
:lisp (defun tex-stripdollar (sym) (maybe-invert-string-case (symbol-name (stripdollar sym))))
and for strings,
:lisp (defun tex-string (str) str)
These are going to change some existing outputs, so you'll want to try it and see if it works for you.

A roadblock working with erlang's List

I was working with list in erlang which is filled with a single value each time and I wanted to modify this list by multiplying its value with 10. But when I tried this the following thing happened:
E=[4*10].
"("
I searched the ascii table and found that ascii value 40 is stored for the symbol "(" only.
Can anybody trow some light on it and also tell me how I can get E=[40] by performing the multiplication inside the List only?
Strings are represented as lists of bytes in Erlang and thus saying "(" it's exactly the same as [40].
It's just a syntactic sugar. Every time Erlang displays a list, if it contains "displayable" ASCII characters it will display the string instead of the list of numbers.
You can user format to control de display:
io:format("Number ~w is character ~c\n", [40 40]).

Lua pattern help (Double parentheses)

I have been coding a program in Lua that automatically formats IRC logs from a roleplay. In the roleplay logs there is a specific guideline for "Out of character" conversation, which we use double parentheses for. For example: ((<Things unrelated to roleplay go here>)). I have been trying to have my program remove text between double brackets (and including both brackets). The code is:
ofile = io.open("Output.txt", "w")
rfile = io.open("Input.txt", "r")
p = rfile:read("*all")
w = string.gsub(p, "%(%(.*?%)%)", "")
ofile:write(w)
The pattern here is > "%(%(.*?%)%)" I've tried multiple variations of the pattern. All resulted in fruitless results:
1. %(%(.*?%)%) --Wouldn't do anything.
2. %(%(.*%)%) --Would remove *everything* after the first OOC message.
Then, my friend told me that prepending the brackets with percentages wouldn't work, and that I had to use backslashes to 'escape' the parentheses.
3. \(\(.*\)\) --resulted in the output file being completely empty.
4. (\(\(.*\)\)) --Same result as above.
5. (\(\(.*?\)\) --would for some reason, remove large parts of the text for no apparent reason.
6. \(\(.*?\)\) --would just remove all the text except for the last line.
The short, absolute question:
What pattern would I need to use to remove all text between double parentheses, and remove the double parentheses themselves too?
You're friend is thinking of regular expressions. Lua patterns are similar, but different. % is the correct escape character.
Your pattern should be %(%(.-%)%). The - is similar to * in that it matches any number of the preceding sequence, but while * tries to match as many characters as it can (it's greedy), - matches the least amount of characters possible (it's non-greedy). It won't go overboard and match extra double-close-parenthesis.

What does these two regex match?

I can't figure out what does this regex match:
A: "\\/\\/c\\/(\\d*)"
B: "\\/\\/(\\d*)"
I suppose they are matching some kind of number sequence since \d matches any digit but I'd like to know an example of a string that would be a match for this regex.
The pattern syntax is that specified by ICU. Expressions are created with NSRegularExpression in an iOS app and are correct.
The first matches //c/ + 0 or more digits. The second matches // + 0 or more digits. In both the digits are captured.
An example of a match for A) is //c/123
An example of a match for B) is //12345
When I use Cygwin which emulates Bash on Windows, I sometimes run into situations where I have to escape my escape characters which is what I think is making this expression look so weird. For instance, when I use sed to look for a single '\' I sometimes have to write it as '\\\\'. (Funny, StackOverflow proved my point. If you write 4 backslashes in the comment, it only shows two. So if you process it again, they might all disappear depending on your situation).
Considering this, it might be helpful to think of pairs of backslashes as representing only one if you're coming from a similar situation. My guess would be you are. Because of this I would say Erik Duymelinck is probably spot on. This will capture a sequence of digits that may or may not follow a couple slashes and a c:
//c/000
//00000
This regex matches an odd sequence of characters, which, at first glance, almost seem like a regex, since \d is a digit, and followed by an asterisk (\d*) would mean zero-or-more digits. But it's not a digit, because the escape-slash is escaped.
\\/\\/c\\/(\\d*)
So, for instance, this one matches the following text:
\/\/c\/\
\/\/c\/\d
\/\/c\/\dd
\/\/c\/\ddd
\/\/c\/\dddd
\/\/c\/\ddddd
\/\/c\/\dddddd
...
This one is almost the same
\\/\\/(\\d*)
except you just delete the c\/ from the above results:
\/\/\
\/\/\d
\/\/\dd
\/\/\ddd
\/\/\dddd
\/\/\ddddd
\/\/\dddddd
...
In both cases, the final \ and optional d is [capture group][1] one.
My first impression was that these regexes were intended for escaping in Java strings, meaning they would be completely invalid. If the were escaped for Java strings, such as
Pattern p = Pattern.compile("\\/\\/c\\/(\\d*)");
It would be invalid, because after un-escaping, it would result in this invalid regex:
\/\/c\/(\d*)
The single escape-slashes (\) are invalid. But the \d is valid, as it would mean any digit.
But again, I don't think they're invalid, and they're not escaped for a Java string. They're just odd.

Regular expression in Ruby

Could anybody help me make a proper regular expression from a bunch of text in Ruby. I tried a lot but I don't know how to handle variable length titles.
The string will be of format <sometext>title:"<actual_title>"<sometext>. I want to extract actual_title from this string.
I tried /title:"."/ but it doesnt find any matches as it expects a closing quotation after one variable from opening quotation. I couldn't figure how to make it check for variable length of string. Any help is appreciated. Thanks.
. matches any single character. Putting + after a character will match one or more of those characters. So .+ will match one or more characters of any sort. Also, you should put a question mark after it so that it matches the first closing-quotation mark it comes across. So:
/title:"(.+?)"/
The parentheses are necessary if you want to extract the title text that it matched out of there.
/title:"([^"]*)"/
The parentheses create a capturing group. Inside is first a character class. The ^ means it's negated, so it matches any character that's not a ". The * means 0 or more. You can change it to one or more by using + instead of *.
I like /title:"(.+?)"/ because of it's use of lazy matching to stop the .+ consuming all text until the last " on the line is found.
It won't work if the string wraps lines or includes escaped quotes.
In programming languages where you want to be able to include the string deliminator inside a string you usually provide an 'escape' character or sequence.
If your escape character was \ then you could write something like this...
/title:"((?:\\"|[^"])+)"/
This is a railroad diagram. Railroad diagrams show you what order things are parsed... imagine you are a train starting at the left. You consume title:" then \" if you can.. if you can't then you consume not a ". The > means this path is preferred... so you try to loop... if you can't you have to consume a '"' to finish.
I made this with https://regexper.com/#%2Ftitle%3A%22((%3F%3A%5C%5C%22%7C%5B%5E%22%5D)%2B)%22%2F
but there is now a plugin for Atom text editor too that does this.

Resources