Related
I'm trying to implement the a split method in Erlang that is supposed to split a string like "i am on the mountain top" into a list like ["i","am","on","the","mountain","top"].
Here is my code (exercise.erl):
-module(exercise).
-import(oi,[read/1]).
-export([split/4]).
split(Text,_,Result,_) when Text == [] -> Result;
split([Head|Tail],Separator,Result,WordSummer) when Head == Separator ->
split(Tail,Separator,[Result|lists:flatten(WordSummer)],[]);
split([Head|Tail],Separator,Result,WordSummer) ->
split(Tail,Separator,Result,[WordSummer|Head]).
The problem I'm having is that when calling my exported function I get the following error:
9> c(exercise).
{ok,exercise}
10> exercise:split("sdffdgfdg dgdfgfg dgdfg dgdfgd dfgdfgdfgtrty hghfgh",$ ,[],[]).
** exception error: no function clause matching lists:do_flatten(103,[]) (lists.erl, line 627)
in function lists:do_flatten/2 (lists.erl, line 628)
in call from exercise:split/4 (exercise.erl, line 9)
11>
How can I solve this?
Two things:
The [WordSummer|Head] in the last line is creating an improper list because Head is an integer (one character of the input string). This is causing the error you're seeing. You probably meant [WordSummer, Head].
[Result|lists:flatten(WordSummer)] is creating a nested list instead of a list of strings. To append one item to a list, use ++ and wrap the right side in a list: Result ++ [lists:flatten(WordSummer)]
Final code:
split(Text,_,Result,_) when Text == [] -> Result;
split([Head|Tail],Separator,Result,WordSummer) when Head == Separator ->
split(Tail,Separator,Result ++ [lists:flatten(WordSummer)],[]);
split([Head|Tail],Separator,Result,WordSummer) ->
split(Tail,Separator,Result,[WordSummer, Head]).
Test:
1> c(exercise).
{ok,exercise}
2> exercise:split("sdffdgfdg dgdfgfg dgdfg dgdfgd dfgdfgdfgtrty hghfgh",$ ,[],[]).
["sdffdgfdg","dgdfgfg","dgdfg","dgdfgd","dfgdfgdfgtrty"]
There's still a bug where the last segment is being ignored. I'll let you figure that out (hint: you need to consider WordSummer in the first clause of the function).
Why is the following saying variable unbound?
9> {<<A:Length/binary, Rest/binary>>, Length} = {<<1,2,3,4,5>>, 3}.
* 1: variable 'Length' is unbound
It's pretty clear that Length should be 3.
I am trying to have a function with similar pattern matching, ie.:
parse(<<Body:Length/binary, Rest/binary>>, Length) ->
But if fails with the same reason. How can I achieve the pattern matching I want?
What I am really trying to achieve is parse in incoming tcp stream packets as LTV(Length, Type, Value).
At some point after I parse the the Length and the Type, I want to ready only up to Length number of bytes as the value, as the rest will probably be for the next LTV.
So my parse_value function is like this:
parse_value(Value0, Left, Callback = {Module, Function},
{length, Length, type, Type, value, Value1}) when byte_size(Value0) >= Left ->
<<Value2:Left/binary, Rest/binary>> = Value0,
Module:Function({length, Length, type, Type, value, lists:reverse([Value2 | Value1])}),
if
Rest =:= <<>> ->
{?MODULE, parse, {}};
true ->
parse(Rest, Callback, {})
end;
parse_value(Value0, Left, _, {length, Length, type, Type, value, Value1}) ->
{?MODULE, parse_value, Left - byte_size(Value0), {length, Length, type, Type, value, [Value0 | Value1]}}.
If I could do the pattern matching, I could break it up to something more pleasant to the eye.
The rules for pattern matching are that if a variable X occurs in two subpatterns, as in {X, X}, or {X, [X]}, or similar, then they have to have the same value in both positions, but the matching of each subpattern is still done in the same input environment - bindings from one side do not carry over to the other. The equality check is conceptually done afterwards, as if you had matched on {X, X2} and added a guard X =:= X2. This means that your Length field in the tuple cannot be used as input to the binary pattern, not even if you make it the leftmost element.
However, within a binary pattern, variables bound in a field can be used in other fields following it, left-to-right. Therefore, the following works (using a leading 32-bit size field in the binary):
1> <<Length:32, A:Length/binary, Rest/binary>> = <<0,0,0,3,1,2,3,4,5>>.
<<0,0,0,3,1,2,3,4,5>>
2> A.
<<1,2,3>>
3> Rest.
<<4,5>>
I've run into this before. There is some weirdness between what is happening inside binary syntax and what happens during unification (matching). I suspect that it is just that binary syntax and matching occur at different times in the VM somewhere (we don't know which Length is failing to get assigned -- maybe binary matching is always first in evaluation, so Length is still meaningless). I was once going to dig in and find out, but then I realized that I never really needed to solve this problem -- which might be why it was never "solved".
Fortunately, this won't stop you with whatever you are doing.
Unfortunately, we can't really help further unless you explain the context in which you think this kind of a match is a good idea (you are having an X-Y problem).
In binary parsing you can always force the situation to be one of the following:
Have a fixed-sized header at the beginning of the binary message that tells you the next size element you need (and from there that can continue as a chain of associations endlessly)
Inspect the binary once on entry to determine the size you are looking for, pull that one value, and then begin the real parsing task
Have a set of fields, all of predetermined sizes that conform to some a binary schema standard
Convert the binary to a list and iterate through it with any arbitrary amount of look-ahead and backtracking you might need
Quick Solution
Without knowing anything else about your general problem, a typical solution would look like:
parse(Length, Bin) ->
<<Body:Length/binary, Rest/binary>> = Bin,
ok = do_something(Body),
do_other_stuff(Rest).
But I smell something funky here.
Having things like this in your code is almost always a sign that a more fundamental aspect of the code structure is not in agreement with the data that you are handling.
But deadlines.
Erlang is all about practical code that satisfies your goals in the real world. With that in mind, I suggest that you do something like the above for now, and then return to this problem domain and rethink it. Then refactor it. This will gain you three benefits:
Something will work right away.
You will later learn something fundamental about parsing in general.
Your code will almost certainly run faster if it fits your data better.
Example
Here is an example in the shell:
1> Parse =
1> fun
1> (Length, Bin) when Length =< byte_size(Bin) ->
1> <<Body:Length/binary, Rest/binary>> = Bin,
1> ok = io:format("Chopped off ~p bytes: ~p~n", [Length, Body]),
1> Rest;
1> (Length, Bin) ->
1> ok = io:format("Binary shorter than ~p~n", [Length]),
1> Bin
1> end.
#Fun<erl_eval.12.87737649>
2> Parse(3, <<1,2,3,4,5>>).
Chopped off 3 bytes: <<1,2,3>>
<<4,5>>
3> Parse(8, <<1,2,3,4,5>>).
Binary shorter than 8
<<1,2,3,4,5>>
Note that this version is a little safer, in that we avoid a crash in the case that Length is longer than the binary. This is yet another good reason why maybe we can't do that match in the function head.
Try with below code:
{<<A:Length/binary, Rest/binary>>, _} = {_, Length} = {<<1,2,3,4,5>>, 3}.
This question is mentioned a bit in EEP-52:
Any variables used in the expression must have been previously bound, or become bound in the same binary pattern as the expression. That is, the following example is illegal:
illegal_example2(N, <<X:N,T/binary>>) ->
{X,T}.
And explained a bit more in the following e-mail: http://erlang.org/pipermail/eeps/2020-January/000636.html
Illegal. With one exception, matching is not done in a left-to-right
order, but all variables in the pattern will be bound at the same
time. That means that the variables must be bound before the match
starts. For maps, that means that the variables referenced in key
expressions must be bound before the case (or receive) that matches
the map. In a function head, all map keys must be literals.
The exception to this general rule is that within a binary pattern,
the segments are matched from left to right, and a variable bound in a
previous segment can be used in the size expression for a segment
later in the binary pattern.
Also one of the members of OTP team mentioned that they made a prototype that can do that, but it was never finished http://erlang.org/pipermail/erlang-questions/2020-May/099538.html
We actually tried to make your example legal. The transformation of
the code that we did was not to rewrite to guards, but to match
arguments or parts of argument in the right order so that variables
that input variables would be bound before being used. (We would do a
topological sort to find the correct order.) For your example, the
transformation would look similar to this:
legal_example(Key, Map) ->
case Map of
#{Key := Value} -> Value;
_ -> error(function_clause, [Key, Map])
end.
In the prototype implementation, the compiler could compile the
following example:
convoluted(Ref,
#{ node(Ref) := NodeId, Loop := universal_answer},
[{NodeId, Size} | T],
<<Int:(Size*8+length(T)),Loop>>) when is_reference(Ref) ->
Int.
Things started to fall apart when variables are repeated. Repeated
variables in patterns already have a meaning in Erlang (they should be
the same), so it become tricky to understand to distinguish between
variables being bound or variables being used a binary size or map
key. Here is an example that the prototype couldn't handle:
foo(#{K := K}, K) -> ok.
A human can see that it should be transformed similar to this:
foo(Map, K) -> case Map of
{K := V} when K =:= V -> ok end.
Here are few other examples that should work but the prototype would
refuse to compile (often emitting an incomprehensible error message):
bin2(<<Sz:8,X:Sz>>, <<Y:Sz>>) -> {X,Y}.
repeated_vars(#{K := #{K := K}}, K) -> K.
match_map_bs(#{K1 := {bin,<<Int:Sz>>}, K2 := <<Sz:8>>}, {K1,K2}) ->
Int.
Another problem was when example was correctly rejected, the error
message would be confusing.
Because much more work would clearly be needed, we have shelved the
idea for now. Personally, I am not sure that the idea is sound in the
first place. But I am sure of one thing: the implementation would be
very complicated.
UPD: latest news from 2020-05-14
I have this code:
-module(info).
-export([map_functions/0]).
-author("me").
map_functions() ->
{Mod,_} = code:all_loaded(),
map_functions(Mod,#{});
map_functions([H|Tail],A) ->
B = H:mod_info(exports),
map_functions(Tail,A#{H => B});
map_functions([],A) -> A.
However whenever I compile it I get a head mismatch on line 10 which is the
map_funtions([H|Tail],A) ->
I'm sure this is a very basic error but I just cannot get my head around why this does not run. It is a correct pattern match syntax [H|Tail] and the three functions with the same name but different arities are separated by commas.
Your function definition should be
map_functions() ->
{Mod,_} = code:all_loaded(),
map_functions(Mod, #{}).
map_functions([], A) -> A;
map_functions([H|Tail], A) ->
B = H:mod_info(exports),
map_functions(Tail, A#{H => B}).
The name map_functions is the same, but the arity is not. In the Erlang world that means these are two entirely different functions: map_functions/0 and map_functions/2.
Also, note that I put the "base case" first in map_functions/2 (and made the first clause's return value stick out -- breaking that to two lines is more common, but whatever). This is for three reasons: clarity, getting in the habit of writing the base case first (so you don't accidentally write infinite loops), and very often it is necessary to do this so you don't accidentally mask your base case by matching every parameter in a higher-precedence clause.
Some extended discussion on this topic is here (addressing Elixir and Erlang): Specify arity using only or except when importing function on Elixir
Function with same name but different arity are different, they are separated by dots.
code:all_loaded returns a list, so the first function should be written:
map_functions() ->
Mods = code:all_loaded(),
map_functions(Mods, #{}).
The resulting list Mods is a list of tuples of the form {ModName,BeamLocation} so the second function should be written:
map_functions([], A) -> A;
map_functions([{ModName,_}|Tail], A) ->
B = ModName:module_info(exports),
map_functions(Tail, A#{ModName => B}).
Note that you should dig into erlang libraries and try to use more idiomatic forms of code, the whole function, using list comprehension, can be written:
map_functions() ->
maps:from_list([{X,X:module_info(exports)} || {X,_} <- code:all_loaded()]).
I'm trying to learn a little of the mindset of functional programming in F#, so any tips are appreciated. Right now I'm making a simple recursive function which takes a list and returns the i:th element.
let rec nth(list, i) =
match (list, i) with
| (x::xs, 0) -> x
| (x::xs, i) -> nth(xs, i-1)
The function itself seems to work, but it warns me about an incomplete pattern. I'm not sure what to return when I match the empty list in this case, since if I for example do the following:
| ([], _) -> ()
The whole function is treated like a function that takes a unit as argument. I want it to treat is as a polymorphic function.
While I'm at it, I may as well ask how far is reasonable to go to check for valid arguments when designing a function when developing seriously. Should I check for everything, so "misuse" of the function is prevented? In the above example I could for example specify the function to try to access an element in the list that is larger than its size. I hope my question isn't too confusing :)
You can learn a lot about the "usual" library design by looking at the standard F# libraries. There is already a function that does what you want called List.nth, but even if you're implementing this as an exercise, you can check how the function behaves:
> List.nth [ 1 .. 3 ] 10;;
System.ArgumentException: The index was outside the range
of elements in the list. Parameter name: index
The function throws System.ArgumentException with some additional information about the exception, so that users can easily find out what went wrong. To implement the same functionality, you can use the invalidArg function:
| _ -> invalidArg "index" "Index is out of range."
This is probably better than just using failwith which throws a more general exception. When using invalidArg, users can check for a specific type of exceptions.
As kvb noted, another option is to return option 'a. Many standard library functions provide both a version that returns option and a version that throws an exception. For example List.pick and List.tryPick. So, maybe a good design in your case would be to have two functions - nth and tryNth.
If you want your function to return a meaningful result and to have the same type as it has now, then you have no alternative but to throw an exception in the remaining case. A matching failure will throw an exception, so you don't need to change it, but you may find it preferable to throw an exception with more relevant information:
| _ -> failwith "Invalid list index"
If you expect invalid list indices to be rare, then this is probably good enough. However, another alternative would be to change your function so that it returns an 'a option:
let rec nth = function
| x::xs, 0 -> Some(x)
| [],_ -> None
| _::xs, i -> nth(xs, i-1)
This places an additional burden on the caller, who must now explicitly deal with the possibility of failure.
Presumably, if taking an empty list is invalid, you're best off just throwing an exception?
Generally the rules for how defensive you should be don't really change from language to language - I always go by the guideline that if it's public be paranoid about validating input, but if it's private code, you can be less strict. (Actually if it's a large project, and it's private code, be a little strict... basically strictness is proportional to the number of developers who might call your code.)
am josh in Uganda. i created a mnesia fragmented table (64 fragments), and managed to populate it upto 9948723 records. Each fragment was a disc_copies type, with two replicas.
Now, using qlc (query list comprehension), was too slow in searching for a record, and was returning inaccurate results.
I found out that this overhead is that qlc uses the select function of mnesia which traverses the entire table in order to match records. i tried something else below.
-define(ACCESS_MOD,mnesia_frag).
-define(DEFAULT_CONTEXT,transaction).
-define(NULL,'_').
-record(address,{tel,zip_code,email}).
-record(person,{name,sex,age,address = #address{}}).
match()-> Z = fun(Spec) -> mnesia:match_object(Spec) end,Z.
match_object(Pattern)->
Match = match(),
mnesia:activity(?DEFAULT_CONTEXT,Match,[Pattern],?ACCESS_MOD).
Trying this functionality gave me good results. But i found that i have to dynamically build patterns for every search that may be made in my stored procedures.
i decided to go through the havoc of doing this, so i wrote functions which will dynamically build wild patterns for my records depending on which parameter is to be searched.
%% This below gives me the default pattern for all searches ::= {person,'_','_','_'}
pattern(Record_name)->
N = length(my_record_info(Record_name)) + 1,
erlang:setelement(1,erlang:make_tuple(N,?NULL),Record_name).
%% this finds the position of the provided value and places it in that
%% position while keeping '_' in the other positions.
%% The caller function can use this function recursively until
%% it has built the full search pattern of interest
pattern({Field,Value},Pattern_sofar)->
N = position(Field,my_record_info(element(1,Pattern_sofar))),
case N of
-1 -> Pattern_sofar;
Int when Int >= 1 -> erlang:setelement(N + 1,Pattern_sofar,Value);
_ -> Pattern_sofar
end.
my_record_info(Record_name)->
case Record_name of
staff_dynamic -> record_info(fields,staff_dynamic);
person -> record_info(fields,person);
_ -> []
end.
%% These below,help locate the position of an element in a list
%% returned by "-record_info(fields,person)"
position(_,[]) -> -1;
position(Value,List)->
find(lists:member(Value,List),Value,List,1).
find(false,_,_,_) -> -1;
find(true,V,[V|_],N)-> N;
find(true,V,[_|X],N)->
find(V,X,N + 1).
find(V,[V|_],N)-> N;
find(V,[_|X],N) -> find(V,X,N + 1).
This was working very well though it was computationally intensive.
It could still work even after changing the record definition since at compile time, it gets the new record info
The problem is that when i initiate even 25 processes on a 3.0 GHz pentium 4 processor running WinXP, It hangs and takes a long time to return results.
If am to use qlc in these fragments, to get accurate results, i have to specify which fragment to search in like this.
find_person_by_tel(Tel)->
select(qlc:q([ X || X <- mnesia:table(Frag), (X#person.address)#address.tel == Tel])).
select(Q)->
case ?transact(fun() -> qlc:e(Q) end) of
{atomic,Val} -> Val;
{aborted,_} = Error -> report_mnesia_event(Error)
end.
Qlc was returning [], when i search for something yet when i use match_object/1 i get accurate results. I found that using match_expressions can help.
mnesia:table(Tab,Props).
where Props is a data structure that defines the match expression, the chunk size of return values e.t.c
I got a problem when i tried building match expressions dynamically.
Function mnesia:read/1 or mnesia:read/2 requires that you have the primary key
Now am asking myself, how can i efficiently use QLC to search for records in a large fragmented table? Please help.
I know that using tuple representation of records makes code hard to upgrade. This is why
i hate using mnesia:select/1, mnesia:match_object/1 and i want to stick to QLC. QLC is giving me wrong results in my queries from a mnesia table of 64 fragments even on the same node.
Has anyone ever used QLC to query a fragmented table?, please help
Do you invoke the qlc in the activity context?
tfn_match(Id) ->
Search = #person{address=#address{tel=Id, _ = '_'}, _ = '_'},
trans(fun() -> mnesia:match_object(Search) end).
tfn_qlc(Id) ->
Q = qlc:q([ X || X <- mnesia:table(person), (X#person.address)#address.tel == Id]),
trans(fun() -> qlc:e(Q) end).
trans(Fun) ->
try Res = mnesia:activity(transaction, Fun, mnesia_frag),
{atomic, Res}
catch exit:Error ->
{aborted, Error}
end.