Why does pattern matching on this string in Erlang result in a "string" for the tail and an ASCII value for the head?

I was trying to write a pattern matched function in erlang like:
to_end("A") -> "Z".
The whole idea is to transform a string such as "ABC" into something different such as "ZYX" using pattern matched functions. It looks like a string is represented as a list under the hood...
I was depending on the fact that pattern matching on a "string" in erlang would result in individual string characters. But I find this:
21> F="ABC".
22> F.
"ABC"
23> [H | T]=F.
"ABC"
24> H.
65
25> T.
"BC"
Why does the head of this type of pattern matching on list always result in an ASCII value and the tail result in letters? Is there a better way to pattern match against a "list of string"?

In Erlang, a string is just a list of character codes (integers). The shell also displays any list of integers, where every integer is a printable character code, as a string. So [48, 49] would print out "01", since 48 is the code for the character 0 and 49 is the code for 1. Since you have the string "ABC", this is the same as [65 | [66 | [67]]], so the head is 65 and the tail [66, 67] will display as "BC".
If you want to write a function to pattern match on characters, you should use the character literal syntax, which is $ followed by the character. So you would write
to_end($A) -> $Z;
to_end($B) -> $Y;
to_end($C) -> $X;
...
to_end($Z) -> $A.
instead of to_end("A") -> "Z" which is the same as to_end([65]) -> [90].
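If the goal is the full A-to-Z mirror, the 26 clauses could probably be collapsed with a guard and some arithmetic on the character codes. This is only a sketch: the module name to_end_sketch and the helper flip/1 are made up for illustration, and it assumes that only uppercase ASCII letters should be mirrored while every other character passes through unchanged.

```erlang
-module(to_end_sketch).
-export([to_end/1, flip/1]).

%% Mirror an uppercase letter within A..Z: $A -> $Z, $B -> $Y, and so on.
%% (Assumption: anything outside A..Z is returned unchanged.)
to_end(C) when C >= $A, C =< $Z -> $Z - (C - $A);
to_end(C) -> C.

%% Apply to_end/1 to every character code in the string.
flip(String) -> [to_end(C) || C <- String].
```

With this, to_end_sketch:flip("ABC") evaluates to "ZYX", because the result is the list [90,89,88].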

Why does the head of this type of pattern matching on list always
result in an ASCII value and the tail result in letters?
In erlang, the string "ABC" is a shorthand notation for the list [65,66,67]. The head of that list is 65, and the tail of that list is the list [66,67], which the shell happens to display as "BC". Whaa??!
The shell pretty much sucks when displaying strings/lists: sometimes the shell displays a list and sometimes the shell displays a double quoted string:
2> [0, 65, 66, 67].
[0,65,66,67]
3> [65, 66, 67].
"ABC"
4>
...which is just plain dumb. Every beginning and intermediate erlang programmer gets confused by that at some point.
Just remember: when the shell displays a double quoted string, it should really be displaying a list whose elements are the character codes of each character in the double quoted string. The fact that the shell displays a double quoted string is a TERRIBLE ??feature?? of erlang, and it makes it hard to decipher what is going on in a lot of situations. You have to mentally say to yourself, "That string I'm seeing in the shell is really the list ..."
The fact that the shell displays double quoted strings for some lists really sucks when you want to display, say, a list of a person's test scores: [88, 97, 92, 70] and the shell outputs: "Xa\\F". You can use the io:format() function to get around that:
6> io:format("~w~n", [[88,97,92,70]]).
[88,97,92,70]
ok
But, if you just want to momentarily see the actual list of integers that the shell is displaying as a string, a quick and dirty method is to add the integer 0 to the head of the list:
7> Scores = [88,97,92,70].
"Xa\\F"
Huh?!!
8> [0|Scores].
[0,88,97,92,70]
Oh, okay.
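If you would rather ask than guess, io_lib:printable_list/1 performs a similar "is every element a printable character code?" test. The sketch below just wraps it; the module name printable_sketch and the function name looks_like_string/1 are hypothetical, and the exact set of codes the shell treats as printable can depend on how the emulator was started.

```erlang
-module(printable_sketch).
-export([looks_like_string/1]).

%% true when every element is a printable character code. For a
%% non-empty list, that is roughly when the shell echoes it in quotes.
looks_like_string(List) when is_list(List) ->
    io_lib:printable_list(List).
```

For the scores above, printable_sketch:looks_like_string([88,97,92,70]) is true, while the same list with a leading 0 gives false, which is why the [0|Scores] trick works.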
The whole idea is to transform a string such as "ABC" into something
different such as "ZYX" using pattern matched functions.
Because a string is shorthand for a list of integers, you can change those integers by using addition:
-module(my).
-compile(export_all).
cipher([]) -> [];
cipher([H|T]) ->
    [H+10|cipher(T)].  %% Add 10 to each character code.
In the shell:
~/erlang_programs$ erl
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.3 (abort with ^G)
1> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:cipher("ABC").
"KLM"
3>
By the way, all functions are "pattern matched", so saying "a pattern matched function" is redundant; you can just say "a function".

Related

Erlang printing hex instead of integer

I have been trying to fix a problem for hours now, very new to erlang
lists:sublist([6,9,15,24,39,6,96],7,1).
I want this to print "100" instead of "d".
What am I doing wrong here?
The shell is going to try to print strings as strings whenever it would be legal. That means lists of integers that happen to all be valid characters will be printed as characters, and lists that contain other things will be printed as lists:
1> [65,66,67].
"ABC"
2> [3,65,66,67].
[3,65,66,67]
But notice that I did not actually call any output functions. That was just the shell's convenience operation of implicitly echoing whatever a returned value was so you, as a programmer, can inspect it.
If I want to explicitly call an output function, I should use a format string that specifies the nature of the values to be interpolated (in the calls below, assume List has already been bound to [65,66,67]):
3> io:format("This is a list: ~tw~n", [List]).
This is a list: [65,66,67]
ok
4> io:format("This is a list rendered as an implied string: ~tp~n", [List]).
This is a list rendered as an implied string: "ABC"
ok
5> io:format("This is a string: ~ts~n", [List]).
This is a string: ABC
ok
Note the additional atom ok after each print. That is because the return value from io:format/2 is ok. So we are getting the explicit output from format/2 and then seeing its return value.
The io module doc page has the gritty details: http://erlang.org/doc/man/io.html#format-1
Back to your example...
6> lists:sublist([6,9,15,24,39,6,96],7,1).
"`"
7> io:format("~tw~n", [lists:sublist([6,9,15,24,39,6,96],7,1)]).
[96]
ok
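When the digits are needed as a value rather than as printed output, io_lib:format/2 with ~w can build the text instead of writing it. A minimal sketch, with the module name ints_sketch and the function name to_text/1 invented for the example:

```erlang
-module(ints_sketch).
-export([to_text/1]).

%% Render a list of integers as digits, never as characters.
%% io_lib:format/2 returns a deep list, so flatten it into a plain string.
to_text(Ints) ->
    lists:flatten(io_lib:format("~w", [Ints])).
```

Here ints_sketch:to_text(lists:sublist([6,9,15,24,39,6,96],7,1)) returns the four-character string "[96]".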
Addendum
There is a function, shell:strings/1, that turns the shell's string formatting on and off (it returns the previous setting):
1> [65,66,67].
"ABC"
2> shell:strings(false).
true
3> [65,66,67].
[65,66,67]
4> <<65,66,67>>.
<<65,66,67>>
5> shell:strings(true).
false
6> <<65,66,67>>.
<<"ABC">>
But I don't mess with this setting ever anymore for a few reasons:
It is almost never worth the effort to remember this detail of the shell (convenience output from the shell is mostly useful for discovering return value structures, not specific values held by those structures -- and when you want that data you usually want strings printed as strings anyway).
It can cause surprising shell output in any case where you really are dealing with strings.
This is almost never the behavior you actually want.
When dealing with real programs you will need actual output functions using io or io_lib modules, and developing habits around format strings is much more useful than worrying over convenience output from the shell.

How do I remove a character from a list or string in Erlang?

How do I remove the character / from this list (or call it a string)
List = "/hi"
Since strings in Erlang are lists of characters, a general way to delete the first occurrence of a character from a string is to use lists:delete/2:
1> List = "/hi".
"/hi"
2> lists:delete($/, List).
"hi"
The construct $/ is the Erlang character literal for the / character.
Note that this approach works no matter where the character to be deleted is within the string:
3> List2 = "one/word".
"one/word"
4> lists:delete($/, List2).
"oneword"
Just remember that with this approach, only the first occurrence of the character is deleted. To delete all occurrences, first use string:tokens/2 to split the entire string on the given character:
5> List3 = "/this/looks/like/a/long/pathname".
"/this/looks/like/a/long/pathname"
6> Segments = string:tokens(List3, "/").
["this","looks","like","a","long","pathname"]
Note that string:tokens/2 takes its separator as a list, not just a single element, so this time our separator is "/" (or equivalently, [$/]). Our result Segments is a list of strings, which we now need to join back together. We can use either lists:flatten/1 or string:join/2 for that:
7> lists:flatten(Segments).
"thislookslikealongpathname"
8> string:join(Segments, "").
"thislookslikealongpathname"
The second argument to string:join/2 is a separator you can insert between segments, but here, we just use the empty string.
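Another way to drop every occurrence, without splitting and rejoining, is to filter the character codes with a list comprehension. This is just a sketch; the module name strip_sketch and the function name remove_all/2 are made up for illustration.

```erlang
-module(strip_sketch).
-export([remove_all/2]).

%% Keep every character code that is not equal to Char.
remove_all(Char, String) ->
    [C || C <- String, C =/= Char].
```

For example, strip_sketch:remove_all($/, "/this/looks/like/a/long/pathname") gives "thislookslikealongpathname".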
Because Erlang variables are immutable and
List = "/hi".
binds List to the value "/hi", you cannot simply remove anything from List; in fact, you cannot alter List in any way as long as it remains bound.
What you can do instead is bind another variable, called T below, to the tail of List, like so:
1> List = "/hi".
"/hi"
2> T = tl(List).
"hi"
3> T.
"hi"

Converting a tuple to a string in erlang language

Tuple={<<"jid">>,Member},
Tuple_in_string=lists:flatten(io_lib:format("~p", [Tuple])),
it gives output as:
"{<<\"jid\">>,\"sdfs\"}"
But i want this output without these slashes like
"{<<"jid">>,Member}"
Any pointers?
I have tried all the answers, but in the end with io:format("\"~s\"~n", [Tuple_in_string]). what I am getting is "{<<"jid">>,Member}", but it is not a string; it is an atom. I need a string on which I can apply a concat operation. Any pointers?
You can print it like this:
io:format("\"~s\"~n", [Tuple_in_string]).
It prints:
"{<<"jid">>,"sdfs"}"
The \ are there to denote that the following " is part of the string and not a string delimiter. They do not exist in the string itself. They appear because you use the pretty print format ~p. If you use the string format ~s, they won't appear in the display.
1> io:format("~p~n",["a \"string\""]).
"a \"string\""
ok
2> io:format("~s~n",["a \"string\""]).
a "string"
ok
3> length("a \"string\""). % is 10 and not 12
10
Firstly, you don't need to flatten the list here:
Tuple_in_string=lists:flatten(io_lib:format("~p", [Tuple])),
Erlang has the concept of iodata(), which means that printable things can be in nested lists and most functions can handle them, so you should leave only:
Tuple_in_string = io_lib:format("~p", [Tuple]),
Secondly, when you use ~p, you tell Erlang to print the term in such a way that it can be copied and pasted into the console. That is why all double quotes are escaped as \". Use ~s, which means "treat as string".
1> Tuple = {<<"jid">>,"asdf"}.
{<<"jid">>,"asdf"}
2> IODATA = io_lib:format("~p", [Tuple]).
[[123,[[60,60,"\"jid\"",62,62],44,"\"asdf\""],125]]
3> io:format("~s~n", [IODATA]).
{<<"jid">>,"asdf"}
ok
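If a flat string is what you are after, for example to use ++ or length/1 on it, flattening the result is still a reasonable choice. The sketch below assumes that is the goal; the module name term_text_sketch and the function name to_string/1 are hypothetical.

```erlang
-module(term_text_sketch).
-export([to_string/1]).

%% Format any term with ~p and flatten the deep list into a plain string.
%% The result contains real double quotes, not backslash characters.
to_string(Term) ->
    lists:flatten(io_lib:format("~p", [Term])).
```

With Tuple = {<<"jid">>,"sdfs"}, the result has length 18, which confirms that the backslashes are only part of how the shell displays the string, and "jid: " ++ term_text_sketch:to_string(Tuple) works as ordinary string concatenation.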
L = Packet_in_tuple_form={xmlel,<<"message">>,[{<<"id">>,<<"rkX6Q-8">>},{<<"to">>,<<"multicast.devlab">>}],[{xmlel,<<"body">>,[],[{xmlcdata,"Hello"}]},{xmlel,<<"addresses">>,[{<<"xmlns">>,<<"http://jabber.org/protocol/address">>}],[{xmlel,<<"address">>,[{<<"type">>,<<"to">>},"{<<\"jid\">>,\"sds\"}",{<<"desc">>,"Description"}],[]}]}]}.
Gives me:
{xmlel,<<"message">>,
[{<<"id">>,<<"rkX6Q-8">>},{<<"to">>,<<"multicast.devlab">>}],
[{xmlel,<<"body">>,[],[{xmlcdata,"Hello"}]},
{xmlel,<<"addresses">>,
[{<<"xmlns">>,<<"http://jabber.org/protocol/address">>}],
[{xmlel,<<"address">>,
[{<<"type">>,<<"to">>},
"{<<\"jid\">>,\"sds\"}",
{<<"desc">>,"Description"}],
[]}]}]}
The \ in the address field are escape characters.
You can verify the same by checking the length of string.

How to make program keep ASCII values in erlang?

I have a function where I enter from the erlang shell:
huffman:table([$H,$E,$L,$L,$O]).
I want to keep the ASCII values like that, but mine are changed into integers in the output. How do I make the program not interpret them into integers?
Erlang doesn't distinguish characters and integers. In particular, Erlang string-literals like "HELLO" result in a list [$H, $E, $L, $L, $O]. The shell decides by a heuristic (basically checking that all integers are printable unicode characters) whether it outputs [72, 69, 76, 76, 79] or "HELLO". Here's the output in my shell session:
Erlang R16B03 (erts-5.10.4) [64-bit] [smp:4:4] [async-threads:10]
Eshell V5.10.4 (abort with ^G)
1> [$H,$E,$L,$L,$O].
"HELLO"
2>
As $H is just another way of writing the integer 72, there is no way to print it as $H built-in to Erlang. You'd have to write your own function to output the values this way.
In the example you show, it looks like you need to keep small integers as integers, while printing alphabetic values as letters. Something like this might work:
maybe_char_to_string(N) when $A =< N, N =< $Z ->
    [$$, N];
maybe_char_to_string(N) ->
    integer_to_list(N).
This is what it outputs:
3> foo:maybe_char_to_string($H).
"$H"
4> foo:maybe_char_to_string(1).
"1"
If you want to print something as string, use:
io:format("~s", [String]).
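To show a whole list in the $H style, the maybe_char_to_string/1 idea above can be mapped over the elements and the pieces joined with commas. This is only a sketch: the module name huff_show_sketch and the function show/1 are invented here, and it assumes the list contains only integers.

```erlang
-module(huff_show_sketch).
-export([maybe_char_to_string/1, show/1]).

%% Uppercase letters become "$X"; every other integer becomes its digits.
maybe_char_to_string(N) when $A =< N, N =< $Z -> [$$, N];
maybe_char_to_string(N) -> integer_to_list(N).

%% Build text such as "[$H,$E,1,2]" from a list of integers.
show(List) ->
    Items = [maybe_char_to_string(N) || N <- List],
    lists:flatten(["[", lists:join(",", Items), "]"]).
```

For instance, huff_show_sketch:show([$H,$E,1,2]) returns "[$H,$E,1,2]", which can then be printed with io:format("~s~n", [...]).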

Why is [9] returned as "\t" in Erlang?

I am working through some Erlang tutorials and noticed when I enter
[8].
the VM returns "\b"
or if I enter
[9].
the VM returns "\t"
I am confused on why this is happening. Other numbers are returned as a list of that number:
[3].
is returned as [3]
[4].
is returned as [4], etc.
I guess the question is: why does the Erlang VM return it this way? Perhaps an explanation of the list [65] versus the list "A" would help.
Another related item is confusing as well:
Type conversion, converting a list to an integer is done as:
list_to_integer("3").
Not
list_to_integer([3]).
Which returns an error
In Erlang there are no real strings. Strings are lists of integers. So if you give a list of integers that all represent characters, it will be displayed as a string.
1> [72, 101, 108, 108, 111].
"Hello"
If you specify a list with at least one element that does not have a character counterpart, then the list will be displayed as a list of integers.
2> [72, 101, 108, 108, 111, 1].
[72,101,108,108,111,1]
In Erlang strings are lists and the notation is exactly the same.
[97,98,99].
returns "abc"
The following excerpt is taken directly from "Learn You Some Erlang for Great Good!", Fred Hébert, (C)2013, No Starch Press. p. 18.
This is one of the most disliked things in Erlang: strings. Strings are lists, and the notation is exactly the same. Why do people dislike it?
Because of this:
3> [97,98,99,4,5,6].
[97,98,99,4,5,6]
4> [233].
"é"
Erlang will print lists of numbers as numbers only when at least one of them could not also represent a letter. There is no such thing as a real string in Erlang!
"Learn You Some Erlang for Great Good!" is also available online at: http://learnyousomeerlang.com/
kadaj answered your first question. Regarding the second one about list_to_integer, if you look at the documentation, most list_to_XXX functions except binary, bitstring, and tuple consider their argument as a string. Calling them string_to_XXX could be clearer, but changing the name would break a lot of code.
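The difference shows up clearly if list_to_integer/1 is wrapped so that the error becomes a return value. In this sketch the module name parse_sketch and the function name parse_int/1 are hypothetical; the point is that the argument must be a list of digit characters, so [51] (the code for the character 3) works, while [3] does not.

```erlang
-module(parse_sketch).
-export([parse_int/1]).

%% Try to read a string of digits; return error instead of crashing
%% when the list does not consist of digit characters.
parse_int(Str) ->
    try list_to_integer(Str) of
        N -> {ok, N}
    catch
        error:badarg -> error
    end.
```

Here parse_sketch:parse_int("3") and parse_sketch:parse_int([51]) both return {ok,3}, while parse_sketch:parse_int([3]) returns error.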
