Suppose I wanted to do something like:
dict
.values()
.map(fun scrub/1)
.flatMap(fun split/1)
.groupBy(fun keyFun/1, fun count/1)
.to_dict()
What is the most elegant way to achieve this in Erlang?
There is no direct, easy way of doing that. All the attempts I have seen looked even worse than straightforward composition. If you look at the majority of open-source Erlang projects, you will find that they use plain function composition. Reusing your example:
to_dict(
groupBy(fun keyFun/1, fun count/1,
flatMap(fun split/1,
map(fun scrub/1,
values(dict))))).
This isn't a construct that's natural in Erlang. If you have a couple functions, regular composition is what I'd use:
lists:flatten(lists:map(fun (A) ->
do_stuff(A)
end,
generate_list())).
For a longer series of operations, intermediary variables:
Dict = #{hello => world, ...},
Values = maps:values(Dict),
ScrubbedValues = lists:map(fun scrub/1, Values),
SplitValues = lists:flatten(lists:map(fun split/1, ScrubbedValues)),
GroupedValues = basil_lists:group_by(fun keyFun/1, fun count/1, SplitValues),
Dict2 = maps:from_list(GroupedValues).
That's how it'd look if you wanted all of those operations grouped in one shot together.
However, I'd more likely write this in a different way:
-spec remap_values(map()) -> map().
remap_values(Map) ->
map_values(maps:values(Map)).
-spec map_values(list()) -> map().
map_values(Values) ->
map_values(Values, [], []).
-spec map_values(list(), list(), list()) -> map().
map_values([], OutList, OutGroup) ->
%% Base case: transform into a map
Grouped = lists:zip(OutGroup, OutList),
lists:foldl(fun ({Group, Element}, Acc) ->
%% Prepend to the group's existing list, or start a new one-element list.
%% (A map key in a fun head cannot be a variable bound in that same head.)
maps:update_with(Group,
fun (Existing) -> [Element | Existing] end,
[Element],
Acc)
end,
#{},
Grouped);
map_values([First|Rest], OutList, OutGroup) ->
%% Recursive case: process the first element and categorize the result
Processed = split(scrub(First)),
Categories = lists:map(fun categorize/1, Processed),
map_values(Rest, OutList ++ Processed, OutGroup ++ Categories).
The correct implementation depends a lot on how the code is going to be run. What I've written here is pretty simple, but it might not perform well on large amounts of data. If you're actually looking to process an endless stream of data, you'll need to write that yourself (though you may find gen_server a very useful framework for doing so).
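If the left-to-right reading order of the question is the main goal, one option is to fold over a list of one-argument funs. This is only a minimal sketch: the module name pipe_sketch, the helper pipe/2, and the word-counting steps are made up for illustration and stand in for scrub/split/count from the question.

```erlang
-module(pipe_sketch).
-export([pipe/2, demo/0]).

%% Thread Value through each one-argument fun in Steps, left to right.
pipe(Value, Steps) ->
    lists:foldl(fun(Step, Acc) -> Step(Acc) end, Value, Steps).

%% Hypothetical pipeline: take the map values, trim them, split them
%% into words, and count how often each word occurs.
demo() ->
    Dict = #{a => " x y ", b => "y  z"},
    pipe(Dict,
         [fun maps:values/1,
          fun(Vs) -> lists:map(fun string:trim/1, Vs) end,
          fun(Vs) -> lists:flatmap(fun(V) -> string:lexemes(V, " ") end, Vs) end,
          fun count_words/1]).

%% Build a map from word to number of occurrences.
count_words(Words) ->
    lists:foldl(fun(W, Acc) ->
                    maps:update_with(W, fun(N) -> N + 1 end, 1, Acc)
                end,
                #{},
                Words).
```

Each step still has to be wrapped in a fun, so whether this reads better than intermediate variables is a matter of taste.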
I'm trying to retrieve the path between two functions from the call graph of a set of modules using xref.
Consider the following functions calling each other:
x:a/1 -> y:b/1 -> y:c/1
x:d/1 -> y:e/1
Using the query closure E | x:Mod || y:Mod gives me tuples of the start and end points of the paths of any direct or indirect call from module x to module y. Thus, for the above example:
[{{x,a,1}, {y,b,1}},
{{x,a,1}, {y,c,1}},
{{x,d,1}, {y,e,1}}]
This is the set of paths through the call graph that I am looking for, but I need the inner vertices as well. For the above example this would be:
[[{x,a,1}, {y,b,1}],
[{x,a,1}, {y,b,1}, {y,c,1}],
[{x,d,1}, {y,e,1}]]
I have tried several variations of the examples given in the xref documentation. I do understand that the query language operates on sets of vertices and edges, but I fail to grasp a number of the selection mechanisms.
I am using the xref command of rebar3 to run the queries; all the relevant code is in the project that I call rebar3 from. I am actually trying to show how the tests call the functions in the module.
Side question: Is there any more gentle introduction to the xref query language?
Hopefully you can use this code snippet as a starting point:
q(X) ->
{ok, L} = xref:q(X, "closure E | x:Mod || y:Mod"),
l(X, L).
l(_, []) -> [];
l(X, [{F,T}|L]) ->
Q = io_lib:format("{~p, ~p} of E", [F, T]),
{ok, Path} = xref:q(X, lists:flatten(Q)),
[Path | l(X, L)].
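A different technique, not the query-based one above: if you already have the call edges as a list of {From, To} tuples (for example from an edge query such as "E", which is an assumption about how you obtain them), the standard digraph module can recover the inner vertices. The module name call_paths and the function paths/2 are hypothetical, and digraph:get_path/3 returns just one path per pair, not all of them.

```erlang
-module(call_paths).
-export([paths/2]).

%% Edges: [{From, To}] call edges between {M, F, A} vertices.
%% Pairs: [{Start, End}] endpoints, e.g. the result of the closure query.
%% Returns one vertex path per pair for which a path exists.
paths(Edges, Pairs) ->
    G = digraph:new(),
    try
        lists:foreach(fun({From, To}) ->
                          digraph:add_vertex(G, From),
                          digraph:add_vertex(G, To),
                          digraph:add_edge(G, From, To)
                      end,
                      Edges),
        [Path || {From, To} <- Pairs,
                 Path <- [digraph:get_path(G, From, To)],
                 Path =/= false]
    after
        %% digraphs are backed by ETS tables and must be freed explicitly
        digraph:delete(G)
    end.
```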
Hi I'm a newbie in Erlang and I just started learning about processes. Here I have a typical process loop:
loop(X,Y,Z) ->
receive
{do} ->
NewX = X+1,
NewY = Y+1,
NewZ = Z+1,
Product = NewX * NewY * NewZ,
% do something
loop(NewX,NewY,NewZ)
end.
How do I get the latest value of Product from a function, let's say get_product()? I know that message passing is the logical option, but is there a more efficient way of extracting the value?
Here are the methods of communicating between Erlang processes that I am aware of, and my (possibly wrong) assessment of their relative performance.
Message passing. This method will suit most of your needs. I don't know how it is actually implemented, but from my point of view it should be as fast as putting a pointer into a queue and retrieving it back.
External methods, e.g. sockets, files, pipes. These methods might be faster for communicating between different nodes, depending on the problem you are solving, your solution, and the environment your program will run in. Inter-node communication in Erlang is done via TCP connections, so if you want to use self-written code to communicate via TCP sockets, you will have to try really hard to outperform Erlang's implementation.
ETS, Dets. These methods won't be faster than message passing (ETS) or file access (Dets), assuming the best possible implementation.
NIF. You can write one function to save a value in your NIF library and one to retrieve it. This has the potential to outperform message passing, since you can just save a value into a variable and return it when needed, and it has no overhead from pattern matching in receive.
Process dictionary. You can read another process's dictionary using an erlang:process_info(Pid, dictionary) call; in the Pid process, you can put a value into that dictionary using a put(Key, Value) call.
Also, if you want to speed up your Erlang application, take a look at HiPE; it might help.
Before switching from message passing to anything on this list to gain speed, you should measure it first!
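To make the ETS option from the list concrete, here is a rough sketch in which the loop process publishes the latest product into a named table that any other process can read without a message round trip. The module name product_ets, the table name product_tab, and the acknowledged {do, From} message are assumptions made for this example, and no claim is made that it is faster than message passing.

```erlang
-module(product_ets).
-export([start/0, step/1, get_product/0]).

-define(TAB, product_tab).   %% hypothetical table name

%% Start the loop process and wait until its table exists.
start() ->
    Parent = self(),
    Pid = spawn(fun() ->
                    %% The loop process owns the table; other processes
                    %% may read it (protected) but not write to it.
                    ets:new(?TAB, [named_table, protected, set]),
                    ets:insert(?TAB, {product, 1}),
                    Parent ! {ready, self()},
                    loop(1, 1, 1)
                end),
    receive {ready, Pid} -> Pid end.

loop(X, Y, Z) ->
    receive
        {do, From} ->
            NewX = X + 1,
            NewY = Y + 1,
            NewZ = Z + 1,
            %% Publish the latest product for readers.
            ets:insert(?TAB, {product, NewX * NewY * NewZ}),
            From ! done,
            loop(NewX, NewY, NewZ)
    end.

%% Trigger one update and wait for it to finish.
step(Pid) ->
    Pid ! {do, self()},
    receive done -> ok end.

%% Read the latest product directly from the table.
get_product() ->
    [{product, P}] = ets:lookup(?TAB, product),
    P.
```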
I assumed this is what you want:
-module(lab).
-compile(export_all).
start() ->
InitialState = {1,1,1},
Pid = spawn(?MODULE, loop, [InitialState]),
register(server, Pid).
loop(State) ->
{X, Y, Z} = State,
receive
tick ->
NewX = X+1,
NewY = Y+1,
NewZ = Z+1,
NewState = {NewX, NewY, NewZ},
loop(NewState);
{get_product, From} ->
Product = X * Y * Z,
From ! Product,
loop(State);
_ ->
io:format("Unknown message received.~n"),
loop(State)
end.
get_product() ->
server ! {get_product, self()},
receive
Product ->
Product
end.
tick() ->
server ! tick.
From within the Erlang shell:
1> c(lab).
{ok,lab}
2> lab:start().
true
3> lab:get_product().
1
4> lab:tick().
tick
5> lab:get_product().
8
6> lab:tick().
tick
7> lab:tick().
tick
8> lab:get_product().
64
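One caveat with the code above is that get_product/0 accepts whatever message arrives next as the product. A slightly more defensive variant, sketched here under the assumption that the caller passes the pid explicitly (the module name lab_ref is made up), tags each request with a unique reference so only the matching reply is consumed:

```erlang
-module(lab_ref).
-export([start/0, tick/1, get_product/1]).

start() ->
    spawn(fun() -> loop({1, 1, 1}) end).

loop({X, Y, Z} = State) ->
    receive
        tick ->
            loop({X + 1, Y + 1, Z + 1});
        {get_product, From, Ref} ->
            %% Echo the caller's reference so it can match this reply.
            From ! {Ref, X * Y * Z},
            loop(State)
    end.

get_product(Pid) ->
    Ref = make_ref(),
    Pid ! {get_product, self(), Ref},
    receive
        {Ref, Product} -> Product
    after 5000 ->
        %% Give up instead of blocking forever if the server is gone.
        exit(timeout)
    end.

tick(Pid) ->
    Pid ! tick,
    ok.
```

This is essentially what gen_server:call does for you, so for anything beyond an exercise a gen_server is probably the more natural choice.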
Disclaimer: I kept this because some things may be useful to others, however, it does not solve what I had initially tried to do.
Right now, I'm trying to solve the following:
Given something like {a, B, {c, D}} I want to scan through Erlang forms given to parse_transform/2 and find each use of the send operator (!). Then I want to check the message being sent and determine whether it would fit the pattern {a, B, {c, D}}.
Therefore, consider finding the following form:
{op,17,'!',
{var,17,'Pid'},
{tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}}
Since the message being sent is:
{tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}
which is an encoding of {a, 5, SomeVar}, this would match the original pattern of {a, B, {c, D}}.
I'm not exactly sure how I'm going to go about this but do you know of any API functions which could help?
Turning the given {a, B, {c, D}} into a form is possible by first substituting the variables with something, e.g. strings (and taking a note of this), else they'll be unbound, and then using:
> erl_syntax:revert(erl_syntax:abstract({a, "B", {c, "D"}})).
{tuple,0,
[{atom,0,a},
{string,0,"B"},
{tuple,0,[{atom,0,c},{string,0,"D"}]}]}
I was thinking that after getting them in the same format like this, I could analyze them together:
> erl_syntax:type({tuple,0,[{atom,0,a},{string,0,"B"},{tuple,0,[{atom,0,c},{string,0,"D"}]}]}).
tuple
%% check whether send argument is also a tuple.
%% then, since it's a tuple, use erl_syntax:tuple_elements/1 and keep comparing in this way, matching anything when you come across a string which was a variable...
I think I'll end up missing something out (and for example recognizing some things but not others ... even though they should have matched).
Are there any API functions which I could use to ease this task? And as for a pattern match test operator or something along those lines, that does not exist right? (i.e. only suggested here: http://erlang.org/pipermail/erlang-questions/2007-December/031449.html).
Edit: (Explaining things from the beginning this time)
Using erl_types as Daniel suggests below is probably doable if you play around with the erl_type() returned by t_from_term/1. Since t_from_term/1 takes a term with no free variables, you'd have to start by changing something like {a, B, {c, D}} into {a, '_', {c, '_'}} (i.e. fill the variables), call t_from_term/1, and then go through the returned data structure and change the '_' atoms to variables using the module's t_var/1 or something similar.
Before explaining how I ended up going about it, let me state the problem a bit better.
Problem
I'm working on a pet project (ErlAOP extension) which I'll be hosting on SourceForge when ready. Basically, another project already exists (ErlAOP) through which one can inject code before/after/around/etc... function calls (see doc if interested).
I wanted to extend this to support injection of code at the send/receive level (because of another project). I've already done this but before hosting the project, I'd like to make some improvements.
Currently, my implementation simply finds each use of the send operator or receive expression and injects a function before/after/around (receive expressions have a little gotcha because of tail recursion). Let's call this function dmfun (dynamic match function).
The user will be specifying that when a message of the form e.g. {a, B, {c, D}} is being sent, then the function do_something/1 should be evaluated before the sending takes place. Therefore, the current implementation injects dmfun before each use of the send op in the source code. Dmfun would then have something like:
case Arg of
{a, B, {c, D}} -> do_something(Arg);
_ -> continue
end
where Arg can simply be passed to dmfun/1 because you have access to the forms generated from the source code.
So the problem is that any send operator will have dmfun/1 injected before it (and the send op's message passed as a parameter). But when sending messages like 50, {a, b}, [6, 4, 3] etc... these messages will certainly not match {a, B, {c, D}}, so injecting dmfun/1 at sends with these messages is a waste.
I want to be able to pick out plausible send operations like e.g. Pid ! {a, 5, SomeVar}, or Pid ! {a, X, SomeVar}. In both of these cases, it makes sense to inject dmfun/1 because if at runtime, SomeVar = {c, 50}, then the user supplied do_something/1 should be evaluated (but if SomeVar = 50, then it should not, because we're interested in {a, B, {c, D}} and 50 does not match {c, D}).
I wrote the following prematurely. It doesn't solve the problem I had. I ended up not including this feature. I left the explanation anyway, but if it were up to me, I'd delete this post entirely... I was still experimenting and I don't think what there is here will be of any use to anyone.
Before the explanation, let:
msg_format = the user supplied message format which will determine which messages being sent/received are interesting (e.g. {a, B, {c, D}}).
msg = the actual message being sent in the source code (e.g. Pid ! {a, X, Y}).
I gave the explanation below in a previous edit, but later found out that it wouldn't match some things it should. E.g. when msg_format = {a, B, {c, D}}, msg = {a, 5, SomeVar} wouldn't match when it should (by "match" I mean that dmfun/1 should be injected).
Let's call the "algorithm" outlined below Alg. The approach I took was to execute Alg(msg_format, msg) and Alg(msg, msg_format). The explanation below only goes through one of these. By repeating the same thing only getting a different matching function (matching_fun(msg_format) instead of matching_fun(msg)), and injecting dmfun/1 only if at least one of Alg(msg_format, msg) or Alg(msg, msg_format) returns true, then the result should be the injection of dmfun/1 where the desired message can actually be generated at runtime.
1. Take the message form you find in the [Forms] given to parse_transform/2, e.g. let's say you find: {op,24,'!',{var,24,'Pid'},{tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}}
So you would take {tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]} which is the message being sent. (bind to Msg).
2. Do fill_vars(Msg), where:
-define(VARIABLE_FILLER, "_").
-spec fill_vars(erl_parse:abstract_form()) -> erl_parse:abstract_form().
%% @doc This function takes an abstract_form() and replaces all {var, LineNum, Variable} forms with
%% {string, LineNum, ?VARIABLE_FILLER}.
fill_vars(Form) ->
erl_syntax:revert(
erl_syntax_lib:map(
fun(DeltaTree) ->
case erl_syntax:type(DeltaTree) of
variable ->
erl_syntax:string(?VARIABLE_FILLER);
_ ->
DeltaTree
end
end,
Form)).
3. Do form_to_term/1 on 2's output, where:
form_to_term(Form) -> element(2, erl_eval:exprs([Form], [])).
4. Do term_to_str/1 on 3's output, where:
-define(inject_str(FormatStr, TermList), lists:flatten(io_lib:format(FormatStr, TermList))).
term_to_str(Term) -> ?inject_str("~p", [Term]).
5. Do gsub(v(4), "\"_\"", "_"), where v(4) is 4's output and gsub is (taken from here):
gsub(Str,Old,New) -> RegExp = "\\Q"++Old++"\\E", re:replace(Str,RegExp,New,[global, multiline, {return, list}]).
6. Bind a variable (e.g. M) to matching_fun(v(5)), where:
matching_fun(StrPattern) ->
form_to_term(
str_to_form(
?inject_str(
"fun(MsgFormat) ->
case MsgFormat of
~s ->
true;
_ ->
false
end
end.", [StrPattern])
)
).
str_to_form(MsgFStr) ->
{_, Tokens, _} = erl_scan:string(end_with_period(MsgFStr)),
{_, Exprs} = erl_parse:parse_exprs(Tokens),
hd(Exprs).
end_with_period(String) ->
case lists:last(String) of
$. -> String;
_ -> String ++ "."
end.
7. Finally, take the user-supplied message format (which is given as a string), e.g. MsgFormat = "{a, B, {c, D}}", and do: MsgFormatTerm = form_to_term(fill_vars(str_to_form(MsgFormat))). Then you can call M(MsgFormatTerm).
E.g. with the user-supplied message format {a, B, {c, D}}, and Pid ! {a, B, C} found in the code:
2> weaver_ext:fill_vars({tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}).
{tuple,24,[{atom,24,a},{string,0,"_"},{string,0,"_"}]}
3> weaver_ext:form_to_term(v(2)).
{a,"_","_"}
4> weaver_ext:term_to_str(v(3)).
"{a,\"_\",\"_\"}"
5> weaver_ext:gsub(v(4), "\"_\"", "_").
"{a,_,_}"
6> M = weaver_ext:matching_fun(v(5)).
#Fun<erl_eval.6.13229925>
7> MsgFormatTerm = weaver_ext:form_to_term(weaver_ext:fill_vars(weaver_ext:str_to_form("{a, B, {c, D}}"))).
{a,"_",{c,"_"}}
8> M(MsgFormatTerm).
true
9> M({a, 10, 20}).
true
10> M({b, "_", 20}).
false
There is functionality for this in erl_types (HiPE).
I'm not sure you have the data in the right form for using this module, though. I seem to remember that it takes Erlang terms as input. If you figure out the form issue, you should be able to do most of what you need with erl_types:t_from_term/1 and erl_types:t_is_subtype/2.
It was a long time ago that I last used these, and I only ever did my testing at runtime, as opposed to compile time. If you want to take a peek at the usage pattern in my old code (not working any more), you can find it on GitHub.
I don't think this is possible at compile time in the general case. Consider:
send_msg(Pid, Msg) ->
Pid ! Msg.
Msg will look like a var, which is a completely opaque type. You can't tell whether it is a tuple, a list, or an atom, since anyone could call this function with anything supplied for Msg.
This would be much easier to do at run time instead. Every time you use the ! operator, you'll need to call a wrapper function instead, which tries to match the message you are trying to send, and executes additional processing if the pattern is matched.
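Even so, a compile-time pass can still act as a conservative filter: treat every variable (and every expression it does not understand) as possibly matching, and skip only the sends that can never match. The following is a minimal sketch over raw abstract forms; the module name may_match and the decision to look only at tuples, atoms, and integers are assumptions for illustration, not part of erl_syntax or any other library.

```erlang
-module(may_match).
-export([may_match/2]).

%% Returns false only when the two abstract forms can never match.
%% Anything not understood is treated as "might match".
may_match(Pattern, Msg) ->
    case {kind(Pattern), kind(Msg)} of
        {unknown, _} -> true;
        {_, unknown} -> true;
        {tuple, tuple} ->
            {tuple, _, Ps} = Pattern,
            {tuple, _, Ms} = Msg,
            %% Same arity, and every element pair might match.
            length(Ps) =:= length(Ms) andalso
                lists:all(fun({P, M}) -> may_match(P, M) end,
                          lists:zip(Ps, Ms));
        {Kind, Kind} ->
            %% Two literals of the same kind: compare their values.
            element(3, Pattern) =:= element(3, Msg);
        _ ->
            false
    end.

%% Classify a form; variables, calls, operators etc. are all opaque.
kind({tuple, _, _}) -> tuple;
kind({atom, _, _}) -> atom;
kind({integer, _, _}) -> integer;
kind(_) -> unknown.
```

With the pattern {a, B, {c, D}}, a send of {a, 5, SomeVar} is kept as a candidate, while sends of 50 or {a, b} are ruled out. The runtime check inside the injected function is still needed for the candidates that remain.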
I am Josh in Uganda. I created a fragmented Mnesia table (64 fragments) and managed to populate it with up to 9948723 records. Each fragment was of type disc_copies, with two replicas.
Now, using QLC (query list comprehensions) to search for a record was too slow and was returning inaccurate results.
I found out that the overhead comes from QLC using Mnesia's select function, which traverses the entire table in order to match records. I tried something else, below.
-define(ACCESS_MOD,mnesia_frag).
-define(DEFAULT_CONTEXT,transaction).
-define(NULL,'_').
-record(address,{tel,zip_code,email}).
-record(person,{name,sex,age,address = #address{}}).
match()-> Z = fun(Spec) -> mnesia:match_object(Spec) end,Z.
match_object(Pattern)->
Match = match(),
mnesia:activity(?DEFAULT_CONTEXT,Match,[Pattern],?ACCESS_MOD).
Trying this functionality gave me good results. But I found that I have to dynamically build patterns for every search that may be made in my stored procedures.
I decided to go through the trouble of doing this, so I wrote functions which dynamically build wild patterns for my records depending on which parameter is to be searched.
%% This gives me the default pattern for all searches, e.g. {person,'_','_','_','_'}
pattern(Record_name)->
N = length(my_record_info(Record_name)) + 1,
erlang:setelement(1,erlang:make_tuple(N,?NULL),Record_name).
%% this finds the position of the provided value and places it in that
%% position while keeping '_' in the other positions.
%% The caller function can use this function recursively until
%% it has built the full search pattern of interest
pattern({Field,Value},Pattern_sofar)->
N = position(Field,my_record_info(element(1,Pattern_sofar))),
case N of
-1 -> Pattern_sofar;
Int when Int >= 1 -> erlang:setelement(N + 1,Pattern_sofar,Value);
_ -> Pattern_sofar
end.
my_record_info(Record_name)->
case Record_name of
staff_dynamic -> record_info(fields,staff_dynamic);
person -> record_info(fields,person);
_ -> []
end.
%% These below,help locate the position of an element in a list
%% returned by "-record_info(fields,person)"
position(_,[]) -> -1;
position(Value,List)->
find(lists:member(Value,List),Value,List,1).
find(false,_,_,_) -> -1;
find(true,V,[V|_],N)-> N;
find(true,V,[_|X],N)->
find(V,X,N + 1).
find(V,[V|_],N)-> N;
find(V,[_|X],N) -> find(V,X,N + 1).
This was working very well, though it was computationally intensive.
It could still work even after changing the record definition, since at compile time it gets the new record info.
The problem is that when I start even 25 processes on a 3.0 GHz Pentium 4 processor running WinXP, it hangs and takes a long time to return results.
If I am to use QLC on these fragments, then to get accurate results I have to specify which fragment to search in, like this:
find_person_by_tel(Tel)->
select(qlc:q([ X || X <- mnesia:table(Frag), (X#person.address)#address.tel == Tel])).
select(Q)->
case ?transact(fun() -> qlc:e(Q) end) of
{atomic,Val} -> Val;
{aborted,_} = Error -> report_mnesia_event(Error)
end.
QLC was returning [] when I searched for something, yet when I use match_object/1 I get accurate results. I found that using match expressions can help.
mnesia:table(Tab,Props).
where Props is a data structure that defines the match expression, the chunk size of the return values, etc.
I ran into a problem when I tried building match expressions dynamically.
The functions mnesia:read/1 and mnesia:read/2 require that you have the primary key.
Now I am asking myself: how can I efficiently use QLC to search for records in a large fragmented table? Please help.
I know that using the tuple representation of records makes code hard to upgrade. This is why I dislike using mnesia:select/1 and mnesia:match_object/1, and I want to stick to QLC. QLC is giving me wrong results in my queries against a Mnesia table of 64 fragments, even on the same node.
Has anyone ever used QLC to query a fragmented table? Please help.
Do you invoke QLC in the activity context (i.e. through mnesia:activity with the mnesia_frag access module)?
tfn_match(Id) ->
Search = #person{address=#address{tel=Id, _ = '_'}, _ = '_'},
trans(fun() -> mnesia:match_object(Search) end).
tfn_qlc(Id) ->
Q = qlc:q([ X || X <- mnesia:table(person), (X#person.address)#address.tel == Id]),
trans(fun() -> qlc:e(Q) end).
trans(Fun) ->
try Res = mnesia:activity(transaction, Fun, mnesia_frag),
{atomic, Res}
catch exit:Error ->
{aborted, Error}
end.