confusion regarding erlang maps, lists and ascii - erlang

This code is an excerpt from this book.
count_characters(Str) ->
count_characters(Str, #{}).
count_characters([H|T], #{ H => N }=X) ->
count_characters(T, X#{ H := N+1 });
count_characters([H|T], X) ->
count_characters(T, X#{ H => 1 });
count_characters([], X) ->
X.
So,
1> count_characters("hello").
#{101=>1,104=>1,108=>2,111=>1}
What I understand from this is that, count_characters() takes an argument hello, and place it to the first function, i.e count_characters(Str).
What I don't understand is, how the string characters are converted into ascii value without using $, and got incremented. I am very new to erlang, and would really appreciate if you could help me understand the above. Thank you.

In erlang the string literal "hello" is just a more convenient way of writing the list [104,101,108,108,111]. The string format is syntactic sugar and nothing erlang knows about internally. An ascii string is internally string is internally stored as a list of 32-bit integers.
This also becomes confusing when printing lists where the values happen to be within the ascii range:
io:format("~p~n", [[65,66]]).
will print
"AB"
even if you didn't expect a string as a result.

As said previously, there is no string data type in Erlang, it uses the internal representation of an integer list, so
"hello" == [$h,$e,$l,$l,$o] == [104|[101|[108|[108|[111|[]]]]]]
Which are each a valid representation of an integer list.
To make the count of characters, the function use a new Erlang data type: a map. (available only since R17)
A map is a collection of key/value pairs, in your case the keys will be the characters, and the values the occurrence of each characters.
The function is called with an empty map:count_characters(Str, #{}).
Then it goes recursively through the list, and for each head H, 2 cases are posible:
The character H was already found, then the current map X will match with the pattern #{ H => N } telling us that we already found N times H, so we continue the recursion with the rest of the list and a new map where the value associated to H is now N+1: count_characters(T, X#{ H := N+1 }.
The character H is found for the first time, then we continue the recursion with the rest of the list and a new map where the key/value pair H/1 is added: count_characters(T, X#{ H => 1 }).
When the end of the list is reached, simply return the map: count_characters([], X) -> X.

Related

What does (_,[]) mean?

I was given a question which was:
given a number N in the first argument selects only numbers greater than N in the list, so that
greater(2,[2,13,1,4,13]) = [13,4,13]
This was the solution provided:
member(_,[]) -> false;
member(H,[H|_]) -> true;
member(N,[_,T]) -> member(N,T).
I don't understand what "_" means. I understand it has something to do with pattern matching but I don't understand it completely. Could someone please explain this to me
This was the solution provided:
I think you are confused: the name of the solution function isn't even the same as the name of the function in the question. The member/2 function returns true when the first argument is an element of the list provided as the second argument, and it returns false otherwise.
I don't understand what "_" means. I understand it has something to do with pattern matching but I don't understand it completely. Could someone please explain this to me
_ is a variable name, and like any variable it will match anything. Here are some examples of pattern matching:
35> f(). %"Forget" or erase all variable bindings
ok
45> {X, Y} = {10, 20}.
{10,20}
46> X.
10
47> Y.
20
48> {X, Y} = {30, 20}.
** exception error: no match of right hand side value {30,
20}
Now why didn't line 48 match? X was already bound to 10 and Y to 20, so erlang replaces those variables with their values, which gives you:
48> {10, 20} = {30, 20}.
...and those tuples don't match.
Now lets try it with a variable named _:
49> f().
ok
50> {_, Y} = {10, 20}.
{10,20}
51> Y.
20
52> {_, Y} = {30, 20}.
{30,20}
53>
As you can see, the variable _ sort of works like the variable X, but notice that there is no error on line 52, like there was on line 48. That's because the _ variable works a little differently than X:
53> _.
* 1: variable '_' is unbound
In other words, _ is a variable name, so it will initially match anything, but unlike X, the variable _ is never bound/assigned a value, so you can use it over and over again without error to match anything.
The _ variable is also known as a don't care variable because you don't care what that variable matches because it's not important to your code, and you don't need to use its value.
Let's apply those lessons to your solution. This line:
member(N,[_,T]) -> member(N,T).
recursively calls the member function, namely member(N, T). And, the following function clause:
member(_,[]) -> false;
will match the function call member(N, T) whenever T is an empty list--no matter what the value of N is. In other words, once the given number N has not matched any element in the list, i.e. when the list is empty so there are no more elements to check, then the function clause:
member(_,[]) -> false;
will match and return false.
You could rewrite that function clause like this:
member(N, []) -> false;
but erlang will warn you that N is an unused variable in the body of the function, which is a way of saying: "Are you sure you didn't make a mistake in your function definition? You defined a variable named N, but then you didn't use it in the body of the function!" The way you tell erlang that the function definition is indeed correct is to change the variable name N to _ (or _N).
It means a variable you don't care to name. If you are never going to use a variable inside the function you can just use underscore.
% if the list is empty, it has no members
member(_, []) -> false.
% if the element I am searching for is the head of the list, it is a member
member(H,[H|_]) -> true.
% if the elem I am searching for is not the head of the list, and the list
% is not empty, lets recursively go look at the tail of the list to see if
% it is present there
member(H,[_|T]) -> member(H,T).
the above is pseudo code for what is happening. You can also have multiple '_' unnamed variables.
According to Documentation:
The anonymous variable is denoted by underscore (_) and can be used when a variable is required but its value can be ignored.
Example:
[H, _] = [1,2] % H will be 1
Also documentation says that:
Variables starting with underscore (_), for example, _Height, are normal variables, not anonymous. They are however ignored by the compiler in the sense that they do not generate any warnings for unused variables.
Sorry if this is repetitive...
What does (_,[]) mean?
That means (1) two parameters, (2) the first one matches anything and everything, yet I don't care about it (you're telling Erlang to just forget about its value via the underscore) and (3) the second parameter is an empty list.
Given that Erlang binds or matches values with variables (depending on the particular case), here you're basically looking to a match (like a conditional statement) of the second parameter with an empty list. If that match happens, the statement returns false. Otherwise, it tries to match the two parameters of the function call with one of the other two statements below it.

Performant, idiomatic way to concatenate two chars that are not in a list into a string

I've done most of my development in C# and am just learning F#. Here's what I want to do in C#:
string AddChars(char char1, char char2) => char1.ToString() + char2.ToString();
EDIT: added ToString() method to the C# example.
I want to write the same method in F# and I don't know how to do it other than this:
let addChars char1 char2 = Char.ToString(char1) + Char.ToString(char2)
Is there a way to add concatenate these chars into a string without converting both into strings first?
Sidenote:
I also have considered making a char array and converting that into a string, but that seems similarly wasteful.
let addChars (char1:char) (char2: char) = string([|char1; char2|])
As I said in my comment, your C# code is not going to do what you want ( i.e. concatenate the characters into a string). In C#, adding a char and a char will result in an int. The reason for this is because the char type doesn't define a + operator, so C# reverts to the nearest compatable type that does, which just happens to be int. (Source)
So to accomplish this behavior, you will need to do something similar to what you are already trying to do in F#:
char a = 'a';
char b = 'b';
// This is the wrong way to concatenate chars, because the
// chars will be treated as ints and the result will be 195.
Console.WriteLine(a + b);
// These are the correct ways to concatenate characters into
// a single string. The result of all of these will be "ab".
// The third way is the recommended way as it is concise and
// involves creating the fewest temporary objects.
Console.WriteLine(a.ToString() + b.ToString());
Console.WriteLine(Char.ToString(a) + Char.ToString(b));
Console.WriteLine(new String(new[] { a, b }));
(See https://dotnetfiddle.net/aEh1FI)
F# is the same way in that concatenating two or more chars doesn't result in a String. Unlike C#, it results instead in another char, but the process is the same - the char values are treated like int and added together, and the result is the char representation of the sum.
So really, the way to concatenate chars into a String in F# is what you already have, and is the direct translation of the C# equivalent:
let a = 'a'
let b = 'b'
// This is still the wrong way (prints 'Ã')
printfn "%O" (a + b)
// These are still the right ways (prints "ab")
printfn "%O" (a.ToString() + b.ToString())
printfn "%O" (Char.ToString(a) + Char.ToString(b))
printfn "%O" (String [| a;b |]) // This is still the best way
(See https://dotnetfiddle.net/ALwI3V)
The reason the "String from char array" approach is the best way is two-fold. First, it is the most concise, since you can see that that approach offers the shortest line of code in both languages (and the difference only increases as you add more and more chars together). And second, only one temporary object is created (the array) before the final String, whereas the other two methods involve making two separate temporary String objects to feed into the final result.
(Also, I'm not sure if it works this way as the String constructors are hidden in external sources, but I imagine that the array passed into the constructor would be used as the String's backing data, so it wouldn't end up getting wasted at all.) Strings are immutable, but using the passed array directly as the created String's backing data could result in a situation where a reference to the array could be held elsewhere in the program and jeopardize the String's immutability, so this speculation wouldn't fly in practice. (Credit: #CaringDev)
Another option you could do in F# that could be more idiomatic is to use the sprintf function to combine the two characters (Credit: #rmunn):
let a = 'a'
let b = 'b'
let s = sprintf "%c%c" a b
printfn "%O" s
// Prints "ab"
(See https://dotnetfiddle.net/Pp9Tee)
A note of warning about this method, however, is that it is almost certainly going to be much slower than any of the other three methods listed above. That's because instead of processing array or String data directly, sprintf is going to be performing more advanced formatting logic on the output. (I'm not in a position where I could benchmark this myself at the moment, but plugged into #TomasPetricek's benckmarking code below, I wouldn't be surprised if you got performance hits of 10x or more.)
This might not be a big deal as for a single conversion it will still be far faster than any end-user could possibly notice, but be careful if this is going to be used in any performance-critical code.
The answer by #Abion47 already lists all the possible sensible methods I can think of. If you are interested in performance, then you can run a quick experiment using the F# Interactive #time feature:
#time
open System
open System.Text
let a = 'a'
let b = 'b'
Comparing the three methods, the one with String [| a; b |] turns out to be about twice as fast as the methods involving ToString. In practice, that's probably not a big deal unless you are doing millions of such operations (as my experiment does), but it's an interesting fact to know:
// 432ms, 468ms, 472ms
for i in 0 .. 10000000 do
let s = a.ToString() + b.ToString()
ignore s
// 396ms 440ms, 458ms
for i in 0 .. 10000000 do
let s = Char.ToString(a) + Char.ToString(b)
ignore s
// 201ms, 171ms, 170ms
for i in 0 .. 10000000 do
let s = String [| a;b |]
ignore s

lua - table.concat with string keys

I'm having a problem with lua's table.concat, and suspect it is just my ignorance, but cannot find a verbose answer to why I'm getting this behavior.
> t1 = {"foo", "bar", "nod"}
> t2 = {["foo"]="one", ["bar"]="two", ["nod"]="yes"}
> table.concat(t1)
foobarnod
> table.concat(t2)
The table.concat run on t2 provides no results. I suspect this is because the keys are strings instead of integers (index values), but I'm not sure why that matters.
I'm looking for A) why table.concat doesn't accept string keys, and/or B) a workaround that would allow me to concatenate a variable number of table values in a handful of lines, without specifying the key names.
Because that's what table.concat is documented as doing.
Given an array where all elements are strings or numbers, returns table[i]..sep..table[i+1] ··· sep..table[j]. The default value for sep is the empty string, the default for i is 1, and the default for j is the length of the table. If i is greater than j, returns the empty string.
Non-array tables have no defined order so table.concat wouldn't be all that helpful then anyway.
You can write your own, inefficient, table concat function easily enough.
function pconcat(tab)
local ctab, n = {}, =1
for _, v in pairs(tab) do
ctab[n] = v
n = n + 1
end
return table.concat(ctab)
end
You could also use next manually to do the concat, etc. yourself if you wanted to construct the string yourself (though that's probably less efficient then the above version).

Binary to Integer -> Erlang

I have a binary M such that 34= will always be present and the rest may vary between any number of digits but will always be an integer.
M = [<<"34=21">>]
When I run this command I get an answer like
hd([X || <<"34=", X/binary >> <- M])
Answer -> <<"21">>
How can I get this to be an integer with the most care taken to make it as efficient as possible?
[<<"34=",X/binary>>] = M,
list_to_integer(binary_to_list(X)).
That yields the integer 21
As of R16B, the BIF binary_to_integer/1 can be used:
OTP-10300
Added four new bifs, erlang:binary_to_integer/1,2,
erlang:integer_to_binary/1, erlang:binary_to_float/1 and
erlang:float_to_binary/1,2. These bifs work similarly to how
their list counterparts work, except they operate on
binaries. In most cases converting from and to binaries is
faster than converting from and to lists.
These bifs are auto-imported into erlang source files and can
therefore be used without the erlang prefix.
So that would look like:
[<<"34=",X/binary>>] = M,
binary_to_integer(X).
A string representation of a number can be converted by N-48. For multi-digit numbers you can fold over the binary, multiplying by the power of the position of the digit:
-spec to_int(binary()) -> integer().
to_int(Bin) when is_binary(Bin) ->
to_int(Bin, {size(Bin), 0}).
to_int(_, {0, Acc}) ->
erlang:trunc(Acc);
to_int(<<N/integer, Tail/binary>>, {Pos, Acc}) when N >= 48, N =< 57 ->
to_int(Tail, {Pos-1, Acc + ((N-48) * math:pow(10, Pos-1))}).
The performance of this is around 100 times slower than using the list_to_integer(binary_to_list(X)) option.

Is it possible to use record name as a parameter?

Lets say I have a record:
-record(foo, {bar}).
What I would like to do is to be able to pass the record name to a function as a parameter, and get back a new record. The function should be generic so that it should be able to accept any record, something like this.
make_record(foo, [bar], ["xyz"])
When implementing such a function I've tried this:
make_record(RecordName, Fields, Values) ->
NewRecord = #RecordName{} %% this line gives me an error: syntax error before RecordName
Is it possible to use the record name as a parameter?
You can't use the record syntax if you don't have access to the record during compile time.
But because records are simply transformed into tuples during compilation it is really easy to construct them manually:
-record(some_rec, {a, b}).
make_record(Rec, Values) ->
list_to_tuple([Rec | Values]).
test() ->
R = make_record(some_rec, ["Hej", 5]), % Dynamically create record
#some_rec{a = A, b = B} = R, % Access it using record syntax
io:format("a = ~p, b = ~p~n", [A, B]).
Alternative solution
Or, if you at compile time make a list of all records that the function should be able to construct, you can use the field names also:
%% List of record info created with record_info macro during compile time
-define(recs,
[
{some_rec, record_info(fields, some_rec)},
{some_other_rec, record_info(fields, some_other_rec)}
]).
make_record_2(Rec, Fields, Values) ->
ValueDict = lists:zip(Fields, Values),
% Look up the record name and fields in record list
Body = lists:map(
fun(Field) -> proplists:get_value(Field, ValueDict, undefined) end,
proplists:get_value(Rec, ?recs)),
list_to_tuple([Rec | Body]).
test_2() ->
R = make_record_2(some_rec, [b, a], ["B value", "A value"]),
#some_rec{a = A, b = B} = R,
io:format("a = ~p, b = ~p~n", [A, B]).
With the second version you can also do some verification to make sure you are using the right fields etc.
Other tips
Other useful constructs to keep in mind when working with records dynamically is the #some_rec.a expression which evaluates to the index of the a field in some_recs, and the element(N, Tuple) function which given a tuple and an index returns the element in that index.
This is not possible, as records are compile-time only structures. At compilation they are converted into tuples. Thus the compiler needs to know the name of the record, so you cannot use a variable.
You could also use some parse-transform magic (see exprecs) to create record constructors and accessors, but this design seems to go in the wrong direction.
If you need to dynamically create record-like things, you can use some structures instead, like key-value lists, or dicts.
To cover all cases: If you have fields and values but don't necessarily have them in the correct order, you could make your function take in the result of record_info(fields, Record), with Record being the atom of the record you want to make. Then it'll have the ordered field names to work with. And a record is just a tuple with its atom name in the first slot, so you can build it that way. Here's how I build an arbitrary shallow record from a JSON string (not thoroughly tested and not optimized, but tested and working):
% Converts the given JSON string to a record
% WARNING: Only for shallow records. Won't work for nested ones!
%
% Record: The atom representing the type of record to be converted to
% RecordInfo: The result of calling record_info(fields, Record)
% JSON: The JSON string
jsonToRecord(Record, RecordInfo, JSON) ->
JiffyList = element(1, jiffy:decode(JSON)),
Struct = erlang:make_tuple(length(RecordInfo)+1, ""),
Struct2 = erlang:setelement(1, Struct, Record),
recordFromJsonList(RecordInfo, Struct2, JiffyList).
% private methods
recordFromJsonList(_RecordInfo, Struct, []) -> Struct;
recordFromJsonList(RecordInfo, Struct, [{Name, Val} | Rest]) ->
FieldNames = atomNames(RecordInfo),
Index = index_of(erlang:binary_to_list(Name), FieldNames),
recordFromJsonList(RecordInfo, erlang:setelement(Index+1, Struct, Val), Rest).
% Converts a list of atoms to a list of strings
%
% Atoms: The list of atoms
atomNames(Atoms) ->
F = fun(Field) ->
lists:flatten(io_lib:format("~p", [Field]))
end,
lists:map(F, Atoms).
% Gets the index of an item in a list (one-indexed)
%
% Item: The item to search for
% List: The list
index_of(Item, List) -> index_of(Item, List, 1).
% private helper
index_of(_, [], _) -> not_found;
index_of(Item, [Item|_], Index) -> Index;
index_of(Item, [_|Tl], Index) -> index_of(Item, Tl, Index+1).
Brief explanation: The JSON represents some key:value pairs corresponding to field:value pairs in the record we're trying to build. We might not get the key:value pairs in the correct order, so we need the list of record fields passed in so we can insert the values into their correct positions in the tuple.

Resources