How to return a formatted string in Erlang?

In Go, you can do something like:
str := fmt.Sprintf("%d is bigger than %d", 6, 4)
How about Erlang?

The Erlang equivalent would be
Str = io_lib:format("~p is bigger than ~p", [6, 4])
Note that even though the result may not technically be a string, there is normally no need to convert it to one by calling lists:flatten. The result of the format function is usually a special case of an iolist, and most Erlang functions that write output (io:format, file:write, gen_tcp:send and so on) accept iolists as arguments as well.
"Usually" above means "if the Unicode modifier (t) is not used in the format string"; with it, the result may contain code points above 255 and is then no longer an iolist. In most cases there is no need to use the Unicode modifier, and the result of format can be used directly as described above.

io_lib:format/2 does the job, but note that it returns a possibly nested list of characters, not a flat string. For a proper string, you have to call lists:flatten/1 on it afterwards:
lists:flatten(io_lib:format("~p is bigger than ~p", [6, 4]))
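To put the two answers above side by side, here is a minimal sketch. The module name fmt_demo and its function names are made up for illustration; only io_lib:format/2, lists:flatten/1 and iolist_to_binary/1 are standard calls.

```erlang
-module(fmt_demo).
-export([as_iolist/2, as_string/2, as_binary/2]).

%% Raw io_lib:format/2 result: a possibly nested list of characters.
as_iolist(A, B) ->
    io_lib:format("~p is bigger than ~p", [A, B]).

%% Flat string (a plain list of characters), usable with ++ and length/1.
as_string(A, B) ->
    lists:flatten(as_iolist(A, B)).

%% Binary built directly from the nested result, no flattening needed.
as_binary(A, B) ->
    iolist_to_binary(as_iolist(A, B)).
```

If the value is only going to be printed or sent over a socket, as_iolist/2 is probably enough; as_string/2 is for the cases where a flat list is really needed.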

To use io_lib:format/2 with Unicode characters, use the ~ts control sequence instead of ~s:
50> X = io_lib:format("~s is greater than ~s", [[8364], [36]]).
** exception error: bad argument
in function io_lib:format/2
called as io_lib:format("~s is greater than ~s",[[8364],"$"])
51> X = io_lib:format("~ts is greater than ~s", [[8364], [36]]).
[[8364],
32,105,115,32,103,114,101,97,116,101,114,32,116,104,97,110,
32,"$"]
52> io:format("~s~n", [X]).
** exception error: bad argument
in function io:format/2
called as io:format("~s~n",
[[[8364],
32,105,115,32,103,114,101,97,116,101,114,32,116,
104,97,110,32,"$"]])
*** argument 1: failed to format string
53> io:format("~ts~n", [X]).
€ is greater than $
ok
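If a UTF-8 binary is wanted rather than a list of code points, something along these lines should work. This is only a sketch; the module and function names are hypothetical, and it assumes both arguments are strings given as lists of code points.

```erlang
-module(fmt_unicode_demo).
-export([compare/2]).

%% Formats with ~ts so code points above 255 are accepted, then
%% encodes the resulting character data as a UTF-8 binary.
compare(A, B) ->
    Chars = io_lib:format("~ts is greater than ~ts", [A, B]),
    unicode:characters_to_binary(Chars).
```

Note that iolist_to_binary/1 would fail here, because 8364 is not a byte; unicode:characters_to_binary/1 is the call that handles code points.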

Related

Erlang equivalent of javascript codePointAt?

Is there an erlang equivalent of codePointAt from js? One that gets the code point starting at a byte offset, without modifying the underlying string/binary?
You can use bit syntax pattern matching to skip the first N bytes and decode the first character from the remaining bytes as UTF-8:
1> CodePointAt = fun(Binary, Offset) ->
<<_:Offset/binary, Char/utf8, _/binary>> = Binary,
Char
end.
Test:
2> CodePointAt(<<"πr²"/utf8>>, 0).
960
3> CodePointAt(<<"πr²"/utf8>>, 1).
** exception error: no match of right hand side value <<207,128,114,194,178>>
4> CodePointAt(<<"πr²"/utf8>>, 2).
114
5> CodePointAt(<<"πr²"/utf8>>, 3).
178
6> CodePointAt(<<"πr²"/utf8>>, 4).
** exception error: no match of right hand side value <<207,128,114,194,178>>
7> CodePointAt(<<"πr²"/utf8>>, 5).
** exception error: no match of right hand side value <<207,128,114,194,178>>
As you can see, if the offset is not on a valid UTF-8 character boundary, or is at or past the end of the binary, the match fails and the function raises an error. You can handle that differently using a case expression if needed.
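The case-expression variant mentioned above could look roughly like this. The module name and the {ok, Char} | error return shape are my own choices, not anything standard.

```erlang
-module(codepoint_demo).
-export([code_point_at/2]).

%% Returns {ok, CodePoint} if a UTF-8 character starts at byte Offset,
%% or error if Offset is out of range or not on a character boundary.
code_point_at(Binary, Offset)
  when is_binary(Binary), is_integer(Offset), Offset >= 0 ->
    case Binary of
        <<_:Offset/binary, Char/utf8, _/binary>> -> {ok, Char};
        _ -> error
    end.
```

With the same test binary as above, offsets 0, 2 and 3 give {ok, 960}, {ok, 114} and {ok, 178}, and offsets 1, 4 and 5 give error.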
First, remember that only binary strings are using UTF-8 in Erlang. Plain double-quote strings are already just lists of code points (much like UTF-32). The unicode:chardata() type represents both of these kinds of strings, including mixed lists like ["Hello", $\s, [<<"Filip"/utf8>>, $!]]. You can use unicode:characters_to_list(Chardata) or unicode:characters_to_binary(Chardata) to get a flattened version to work with if needed.
Meanwhile, the JS codePointAt function works on UTF-16 encoded strings, which is what JavaScript uses. Note that the index in this case is not a byte position, but the index of the 16-bit units of the encoding. And UTF-16 is also a variable length encoding: code points that need more than 16 bits use a kind of escape sequence called "surrogate pairs" - for example emojis like 👍 - so if such characters can occur, the index is misleading: in "a👍z" (in JavaScript), the a is at 0, but the z is not at 2 but at 3.
What you want is probably what's called "grapheme clusters": the units that look like a single thing when printed (see the docs for Erlang's string module: https://www.erlang.org/doc/man/string.html). You can't really use numerical indexes to dig the grapheme clusters out of a string; you need to iterate over the string from the start, getting them out one at a time. This can be done with string:next_grapheme(Chardata) (see https://www.erlang.org/doc/man/string.html#next_grapheme-1) or, if you really need to index them numerically for some reason, you could store the individual clusters in an array (see https://www.erlang.org/doc/man/array.html). For example: array:from_list(string:to_graphemes(Chardata)).
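As a small sketch of the array idea (module and function names are invented here, and it assumes an OTP release with the Unicode-aware string module, i.e. OTP 20 or later):

```erlang
-module(grapheme_demo).
-export([grapheme_at/2]).

%% Returns the grapheme cluster at a zero-based Index. A cluster is either
%% a single code point or a list of code points. Indexes past the end give
%% the array default, undefined.
grapheme_at(Chardata, Index) ->
    Graphemes = array:from_list(string:to_graphemes(Chardata)),
    array:get(Index, Graphemes).
```

Unlike the JavaScript example, the z in [97, 128077, 122] ("a👍z") is at index 2 here. Building the array walks the whole string, so for repeated lookups it would make sense to build it once and keep it around.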

What does to_float in Erlang return, and what is the structure of the tuple it returns?

Recently I came across a code snippet which takes a string as input and returns a float value, but I'm confused by lines 3 and 4 inside the case construct. Could anyone please explain it?
as_number(S) ->
    case string:to_float(S) of
        {error, no_float} -> list_to_integer(S);
        {N, _} -> N
    end.
The function string:to_float takes a string (which is a list in Erlang) and tries to convert its beginning to a float. It expects valid text that represents a float, followed by the rest of the string. The return value is a tuple, either {Float, Rest} or {error, Reason}, where Rest is the remaining part of the string after the float. So line 3 handles the case where the string does not start with a float and falls back to list_to_integer(S), which raises badarg if the string is not an integer either; line 4 matches the successful case and returns the float N, ignoring the rest.
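If crashing on bad input is not wanted, a variant like the one below might be an option. This is a sketch, not the original author's code: the {ok, Number} | error return shape and the module name are assumptions of mine, and it uses string:to_integer/1 for the fallback instead of list_to_integer/1.

```erlang
-module(number_demo).
-export([as_number/1]).

%% Tries to read a float first, then an integer.
%% Returns {ok, Number} or error instead of raising badarg.
as_number(S) ->
    case string:to_float(S) of
        {error, _Reason} ->
            case string:to_integer(S) of
                {error, _} -> error;
                {Int, _Rest} -> {ok, Int}
            end;
        {Float, _Rest} ->
            {ok, Float}
    end.
```

Like the original, this ignores any trailing text after the number, so "3.14abc" is still read as 3.14.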

Converting a tuple to a string in erlang language

Tuple={<<"jid">>,Member},
Tuple_in_string=lists:flatten(io_lib:format("~p", [Tuple])),
it gives output as:
"{<<\"jid\">>,\"sdfs\"}"
But I want this output without the backslashes, like:
"{<<"jid">>,Member}"
Any pointers?
I have tried all the answers, but in the end, with io:format("\"~s\"~n", [Tuple_in_string]), what I am getting is "{<<"jid">>,Member}", and it is not a string, it is an atom. I need a string to which I can apply a concat operation. Any pointers?
You can print it like this:
io:format("\"~s\"~n", [Tuple_in_string]).
It prints:
"{<<"jid">>,"sdfs"}"
The \ characters denote that the following " is part of the string and not a string delimiter. They do not exist in the string itself. They appear because you use the pretty-print format ~p. If you use the string format ~s, they won't appear in the output.
1> io:format("~p~n",["a \"string\""]).
"a \"string\""
ok
2> io:format("~s~n",["a \"string\""]).
a "string"
ok
3> length("a \"string\""). % is 10 and not 12
10
Firstly, you don't need to flatten the list here:
Tuple_in_string=lists:flatten(io_lib:format("~p", [Tuple])),
Erlang has the concept of iodata(), which means that printable things can be in nested lists and most functions can handle them, so you should leave only:
Tuple_in_string = io_lib:format("~p", [Tuple]),
Secondly, when you use ~p, you tell Erlang to print the term in such a way that it can be copied and pasted into the console. That is why all double quotes are escaped as \". Use ~s, which means "treat as string".
1> Tuple = {<<"jid">>,"asdf"}.
{<<"jid">>,"asdf"}
2> IODATA = io_lib:format("~p", [Tuple]).
[[123,[[60,60,"\"jid\"",62,62],44,"\"asdf\""],125]]
3> io:format("~s~n", [IODATA]).
{<<"jid">>,"asdf"}
ok
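For the follow-up about needing a real string to concatenate, here is a minimal sketch (hypothetical module name; this is the one case where lists:flatten/1 is worth keeping, because ++ and length/1 want a flat list):

```erlang
-module(tuple_str_demo).
-export([to_string/1, labelled/1]).

%% Renders any term as a flat string. The backslashes only appear when
%% the shell prints the result; they are not part of the string.
to_string(Term) ->
    lists:flatten(io_lib:format("~p", [Term])).

%% The result is an ordinary list, so ++ works on it.
labelled(Term) ->
    "tuple: " ++ to_string(Term).
```

For {<<"jid">>, "sdfs"} the string has 18 characters, which is the count without any backslashes.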
L = Packet_in_tuple_form={xmlel,<<"message">>,[{<<"id">>,<<"rkX6Q-8">>},{<<"to">>,<<"multicast.devlab">>}],[{xmlel,<<"body">>,[],[{xmlcdata,"Hello"}]},{xmlel,<<"addresses">>,[{<<"xmlns">>,<<"http://jabber.org/protocol/address">>}],[{xmlel,<<"address">>,[{<<"type">>,<<"to">>},"{<<\"jid\">>,\"sds\"}",{<<"desc">>,"Description"}],[]}]}]}.
Gives me:
{xmlel,<<"message">>,
[{<<"id">>,<<"rkX6Q-8">>},{<<"to">>,<<"multicast.devlab">>}],
[{xmlel,<<"body">>,[],[{xmlcdata,"Hello"}]},
{xmlel,<<"addresses">>,
[{<<"xmlns">>,<<"http://jabber.org/protocol/address">>}],
[{xmlel,<<"address">>,
[{<<"type">>,<<"to">>},
"{<<\"jid\">>,\"sds\"}",
{<<"desc">>,"Description"}],
[]}]}]}
The \ in the address field are escape characters.
You can verify this by checking the length of the string.

Erlang - io:format's result (formatting with io_lib:format/2)

I'm trying to capture the output of io:format/1.
I know that there's a similar function in io_lib, io_lib:format/2, but the output is different. In fact, it doesn't do anything at all.
If I try to bind the result of io:format, ok is bound, and the formatted string is written out to the console.
So my question is, how can I get the same output with io_lib:format/2?
Or how can I bind the formatted string to a variable?
1> A = io:get_line('> ').
> "test".
"\"test\".\n"
2> io:format(A).
"test".
ok
3> B = io_lib:format(A, []).
"\"test\".\n"
4> B.
"\"test\".\n"
5> C = io:format(A).
"test".
ok
6> C.
ok
io_lib:format is not an output function the way io:format is. Instead, io_lib:format only returns the value and does not output it.
The result of io:format that you see as "test". is the rendered version as sent to the terminal (including the newline), after which it returns ok. Conversely, the return value of io_lib:format that you see as "\"test\".\n" is simply the Erlang shell's representation of the same string, with the quotes and newline escaped, and surrounded by its own quotes.
io_lib:format is more commonly used for inserting values into strings (similar to C's printf functions). For example, doing something like
NewString = io_lib:format("The string entered was ~s I hope you like it",[A])
The value of NewString would be
The string entered was "test".
I hope you like it
After flattening with lists:flatten/1, the Erlang shell's representation of it would be:
"The string entered was \"test\".\n I hope you like it"
If all you want to do is output the value you just entered, then io:format is sufficient for your needs.
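To bind the formatted text to a variable instead of printing it, something like the following sketch could be used. The module name is made up, and trimming the trailing newline that io:get_line leaves in the input is my own addition.

```erlang
-module(capture_demo).
-export([describe/1]).

%% Builds the message as a flat string instead of printing it.
%% io:get_line/1 keeps the trailing newline, so it is trimmed first.
describe(Line) ->
    Trimmed = string:trim(Line, trailing, "\n"),
    lists:flatten(io_lib:format("The string entered was ~s", [Trimmed])).
```

The returned value can then be printed later with io:format("~s~n", [Value]) or passed on to other code.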

splitting of binaries

I tried to split two fields from a binary string:
-define(S,<<"M\0\0\0522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,‌​0110,00,150,0,0,0\0">>).<<Message_length:4/binary,Msg/binary>> = S.
The first 4 bytes are the length of the following message, the other bytes are the message, and a null byte terminates the string.
The result is:
** exception error: no match of right hand side value
EDIT
Just before the given code, there is:
[Sequence|Reste] = binary:split(T,<<"\0">>),
Is "Reste" already bound?
Your code is OK, so either you don't have a binary string, or the length of Mystring does not fit the pattern (it must be at least 4 bytes). Here's a quick test:
1> Mystring = <<"abcde">>.
<<"abcde">>
2> <<Message_length:4/binary,Msg/binary>> = Mystring.
<<"abcde">>
3> Message_length.
<<"abcd">>
4> Msg.
<<"e">>
If you have a string (a list of integers) instead of a binary string (<<"string">>), as Vincenzo suggested, call erlang:list_to_binary/1 first.
Hope it helps
EDIT: I've checked the example string you left in a comment on Vincenzo's answer. I've tried it with your code and it still works. Is it possible that Message_length and/or Msg are already bound (to values that don't match Mystring) when that line of code is reached? That would make the pattern match fail.
EDIT2: Tested with the updated data in the question:
1> S = <<"M\0\0\0522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,\342\200\214\342\200\2130110,00,150,0,0,0\0">>.
<<77,0,0,42,50,48,51,57,51,53,53,48,57,57,44,48,49,48,49,
48,48,48,48,48,48,48,56,44,48,...>>
2> <<Message_length:4/binary,Msg/binary>> = S.
<<77,0,0,42,50,48,51,57,51,53,53,48,57,57,44,48,49,48,49,
48,48,48,48,48,48,48,56,44,48,...>>
3> Message_length.
<<77,0,0,42>>
4> Msg.
<<"2039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,\342"...>>
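The problem is how Erlang parses escape sequences in string literals. An octal escape takes up to three octal digits, so the \0 is read together with the following 52: the fourth byte is not "\0" but "\052", which is 42.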
1> Bin = <<"M\0\0\0522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,0110,00,150,0,0,0\0">>.
<<77,0,0,42,50,48,51,57,51,53,53,48,57,57,44,48,49,48,49,
48,48,48,48,48,48,48,56,44,48,...>>
So you have to write it in this manner.
2> f().
ok
3> Bin = <<"M\0\0\0","522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,0110,00,150,0,0,0\0">>.
<<77,0,0,0,53,50,50,48,51,57,51,53,53,48,57,57,44,48,49,
48,49,48,48,48,48,48,48,48,56,...>>
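The difference is easy to check in isolation. A tiny sketch (hypothetical module and function names):

```erlang
-module(escape_demo).
-export([fused/0, separate/0]).

%% "\052" is parsed as ONE octal escape: a single byte with value 42 ($*).
fused() ->
    <<"\052">>.

%% Splitting the literal keeps "\0" as a NUL byte, followed by "5" and "2".
separate() ->
    <<"\0", "52">>.
```

fused() is the 1-byte binary <<42>>, while separate() is the 3-byte binary <<0,53,50>>.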
The usual way to parse messages of this form is:
4> <<L:32/little,Rest/binary>> = Bin.
<<77,0,0,0,53,50,50,48,51,57,51,53,53,48,57,57,44,48,49,
48,49,48,48,48,48,48,48,48,56,...>>
5> L.
77
6> <<Msg:L/binary,R/binary>> = Rest.
<<"522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,0110,00,150,0,0,0"...>>
7> R.
<<0>>
8> Msg.
<<"522039355099,010100000008,0,010170000000,0,0,0,0,0,0,,,0,0,,0110,00,150,0,0,0">>
If the data arrives as a string (a list) rather than a binary, you have to call list_to_binary/1 on it before matching.
If you have further problems, please post an example string!
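The shell steps above can also be folded into one function clause. This is only a sketch under the same assumption as the answer, namely that the 4-byte length is a little-endian 32-bit integer; the module name and the return tuples are invented for illustration.

```erlang
-module(frame_demo).
-export([parse/1]).

%% Splits a frame into its payload and whatever follows it
%% (here, the terminating null byte).
%% Assumes a 32-bit little-endian length prefix.
parse(<<Len:32/little, Payload:Len/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
parse(Bin) when is_binary(Bin) ->
    %% Fewer bytes than the length prefix announces.
    {error, incomplete}.
```

Len is bound by the first segment and can be used as the size of the next one within the same pattern, so no intermediate variables are needed.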
