Erlang and run-time record limitations - erlang

I'm developing an Erlang system and keep running into problems with the fact that records are (almost) compile-time preprocessor macros and can't be manipulated at run time.
Basically, I'm working with a property pattern, where properties are added at run time to objects on the front end (AS3). Ideally, I would reflect this with a list on the Erlang side, since it's a fundamental data type, but then using records in QLC (to query ETS tables) would not be possible, since to use them I have to say specifically which record field I want to query over. I have at least 15 columns in the largest table, so listing them all in one huge switch statement (case X of) is just plain ugly.
Does anyone have any ideas how to solve this elegantly? Maybe some built-in functions for creating tuples with appropriate signatures for use in pattern matching (for QLC)?
Thanks

It sounds like you want to be able to do something like get_record_field(Field, SomeRecord), where Field is determined at runtime, say by user interface code.
You're right that you can't do this in standard Erlang, as records and the record_info function are expanded and eliminated at compile time.
There are a couple of solutions that I've used or looked at. My solution is as follows: (the example gives runtime access to the #dns_rec and #dns_rr records from inet_dns.hrl)
%% Retrieves the value stored in the record Rec in field Field.
info(Field, Rec) ->
    Fields = fields(Rec),
    info(Field, Fields, tl(tuple_to_list(Rec))).
info(_Field, _Fields, []) -> erlang:error(bad_record);
info(_Field, [], _Rec) -> erlang:error(bad_field);
info(Field, [Field | _], [Val | _]) -> Val;
info(Field, [_Other | Fields], [_Val | Values]) -> info(Field, Fields, Values).
%% The fields function provides the list of field positions
%% for all the kinds of record you want to be able to query
%% at runtime. You'll need to modify this to use your own records.
fields(#dns_rec{}) -> fields(dns_rec);
fields(dns_rec) -> record_info(fields, dns_rec);
fields(#dns_rr{}) -> fields(dns_rr);
fields(dns_rr) -> record_info(fields, dns_rr).
%% Turns a record into a proplist suitable for use with the proplists module.
to_proplist(R) ->
    Keys = fields(R),
    Values = tl(tuple_to_list(R)),
    lists:zip(Keys, Values).
A version of this that compiles is available here: rec_test.erl
You can also extend this dynamic field lookup to dynamic generation of matchspecs for use with ets:select/2 or mnesia:select/2 as shown below:
%% Each match/N function below returns a single match function; wrap it
%% in a list before use, e.g. mnesia:select(Tab, [match(Field, Kind)]).
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ V || #RecordKind{MatchField=V} <- mnesia:table(RecordKind) ]
match(MatchField, RecordKind) ->
    MatchTuple = match_tuple(MatchField, RecordKind),
    {MatchTuple, [], ['$1']}.
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ T || T <- mnesia:table(RecordKind),
%% T#RecordKind.Field =:= MatchValue]
%% ('$_' returns the whole matched record)
match(MatchField, MatchValue, RecordKind) ->
    MatchTuple = match_tuple(MatchField, RecordKind),
    {MatchTuple, [{'=:=', '$1', MatchValue}], ['$_']}.
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ T#RecordKind.ReturnField
%% || T <- mnesia:table(RecordKind),
%% T#RecordKind.MatchField =:= MatchValue]
match(MatchField, MatchValue, RecordKind, ReturnField)
  when MatchField =/= ReturnField ->
    MatchTuple = list_to_tuple([RecordKind
        | [if F =:= MatchField -> '$1'; F =:= ReturnField -> '$2'; true -> '_' end
           || F <- fields(RecordKind)]]),
    {MatchTuple, [{'=:=', '$1', MatchValue}], ['$2']}.
match_tuple(MatchField, RecordKind) ->
    list_to_tuple([RecordKind
        | [if F =:= MatchField -> '$1'; true -> '_' end
           || F <- fields(RecordKind)]]).
Ulf Wiger has also written a parse_transform, Exprecs, that more or less does this for you automagically. I've never tried it, but Ulf's code is usually very good.

I solve this problem (in development) by using the parse transform tools to read the .hrl files and generate helper functions.
I wrote a tutorial on it at Trap Exit.
We use it all the time to generate match specs. The beauty is that you don't need to know anything about the current state of the record at development time.
However once you are in production things change! If your record is the basis of a table (as opposed to the definition of a field in a table) then changing an underlying record is more difficult (to put it mildly!).

I'm not sure I fully understand your problem, but I have moved from records to proplists in most cases. They are much more flexible, but also much slower. Using (d)ets, I usually use a few record fields for coarse selection and then check the proplists on the remaining records for detailed selection.

Related

What is the Erlang way to do stream manipulations?

Suppose I wanted to do something like:
dict
.values()
.map(fun scrub/1)
.flatMap(fun split/1)
.groupBy(fun keyFun/1, fun count/1)
.to_dict()
What is the most elegant way to achieve this in Erlang?
There is no direct, easy way of doing that. All the attempts I have seen looked even worse than straightforward composition. If you look at the majority of open source projects in Erlang, you will find that they use generic composition. Reusing your example:
to_dict(
  groupBy(fun keyFun/1, fun count/1,
    flatMap(fun split/1,
      map(fun scrub/1,
        values(dict))))).
This isn't a construct that's natural in Erlang. If you have a couple functions, regular composition is what I'd use:
lists:flatten(lists:map(fun (A) ->
                            do_stuff(A)
                        end,
                        generate_list())).
For a longer series of operations, intermediary variables:
Dict = #{hello => world, ...},
Values = maps:values(Dict),
ScrubbedValues = lists:map(fun scrub/1, Values),
SplitValues = lists:flatten(lists:map(fun split/1, ScrubbedValues)),
GroupedValues = basil_lists:group_by(fun keyFun/1, fun count/1, SplitValues),
Dict2 = maps:from_list(GroupedValues).
That's how it'd look if you wanted all of those operations grouped in one shot together.
However, I'd more likely write this in a different way:
-spec remap_values(map()) -> map().
remap_values(Map) ->
    map_values(maps:values(Map)).
-spec map_values(list()) -> map().
map_values(Values) ->
    map_values(Values, [], []).
-spec map_values(list(), list(), list()) -> map().
map_values([], OutList, OutGroup) ->
    %% Base case: transform into a map
    Grouped = lists:zip(OutGroup, OutList),
    lists:foldl(fun({Group, Element}, Acc) ->
                        %% Prepend to the group's list, or start a new one.
                        %% (A map key in a fun head cannot use a variable
                        %% bound in that same head, hence update_with.)
                        maps:update_with(Group,
                                         fun(Existing) -> [Element | Existing] end,
                                         [Element],
                                         Acc)
                end,
                #{},
                Grouped);
map_values([First|Rest], OutList, OutGroup) ->
    %% Recursive case: process the first element and categorize the result
    Processed = split(scrub(First)),
    Categories = lists:map(fun categorize/1, Processed),
    map_values(Rest, OutList ++ Processed, OutGroup ++ Categories).
The actual correct implementation depends a lot on how the code's going to be run -- what I've written here is pretty simple, but might not perform well on large amounts of data. If you're actually looking to process an endless stream of data you'll need to write that yourself (though you may find Gen Servers to be a very useful framework for doing so).
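Since basil_lists:group_by above is not part of the standard library, here is a rough sketch of the same pipeline using only stdlib calls. The module and function names are invented, and string:trim/1 and string:lexemes/2 merely stand in for the question's scrub/1 and split/1, which are not shown in the thread.

```erlang
-module(stream_sketch).
-export([count_by/2, run/1]).

%% Groups List by KeyFun and counts the members of each group.
count_by(KeyFun, List) ->
    lists:foldl(fun(Elem, Acc) ->
                        maps:update_with(KeyFun(Elem),
                                         fun(N) -> N + 1 end,
                                         1,
                                         Acc)
                end,
                #{},
                List).

%% values -> map -> flatMap -> groupBy/count, with intermediate variables.
run(Map) ->
    Values = maps:values(Map),
    %% Stand-in for scrub/1 (assumption): strip surrounding whitespace.
    Scrubbed = lists:map(fun string:trim/1, Values),
    %% Stand-in for split/1 (assumption): split on spaces.
    Split = lists:flatmap(fun(S) -> string:lexemes(S, " ") end, Scrubbed),
    count_by(fun(Word) -> Word end, Split).
```

Note that lists:flatmap/2 flattens exactly one level, whereas lists:flatten/1 flattens deeply; with strings as elements the difference matters, because a deep flatten would merge all the words into a single string.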

is_proplist in erlang?

How can I get the type of a list? I want to execute some code only if the list is a proplist.
Let us say L = [a,1,b,2,c,3, ...]. I'm converting the list L to a proplist like
L = [{a,1}, {b,2}, {c,3}, ...].
How can I determine whether the list is a proplist? erlang:is_list/1 is not useful for me.
You can use something like:
is_proplist([]) -> true;
is_proplist([{K,_}|L]) when is_atom(K) -> is_proplist(L);
is_proplist(_) -> false.
but keep in mind that this function cannot be used in guards.
You'd need to check whether every element of the list is a tuple of two elements. That can be done with lists:all/2:
is_proplist(List) ->
    is_list(List) andalso
        lists:all(fun({_, _}) -> true;
                     (_) -> false
                  end,
                  List).
This depends on which definition of "proplist" you use, of course. The above is what is usually meant by "proplist", but the documentation for the proplists module says:
Property lists are ordinary lists containing entries in the form of either tuples, whose first elements are keys used for lookup and insertion, or atoms, which work as shorthand for tuples {Atom, true}.
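If you want the check to follow that documented definition more closely, a variant that also accepts bare atoms could look like the sketch below. The module name is made up, and restricting tuples to two elements is a simplifying assumption, not something the proplists documentation requires.

```erlang
-module(proplist_check).
-export([is_proplist/1]).

%% True if every entry is either a {Key, Value} pair or a bare atom
%% (shorthand for {Atom, true}). Like any user-defined function, this
%% cannot be called from a guard, and it crashes on improper lists.
is_proplist(List) when is_list(List) ->
    lists:all(fun is_property/1, List);
is_proplist(_) ->
    false.

is_property({_Key, _Value}) -> true;
is_property(Atom) when is_atom(Atom) -> true;
is_property(_) -> false.
```

With this definition the flat list [a,1,b,2] from the question is still rejected, because the integers are neither pairs nor atoms.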

How to provide value and get a Key back

So I have made 2 databases:
Db1 that contains: [{james,london}]
Db2 that contains: [{james,london},{fredrik,berlin},{fred,berlin}]
I have a match function that looks like this:
match(Element, Db) -> proplists:lookup_all(Element, Db).
When I do: match(berlin, Db2) I get: [ ]
What I am trying to get is a way to input the value and get back the keys in this way: [fredrik,fred]
According to the documentation, proplists:lookup_all works the other way round:
Returns the list of all entries associated with Key in List.
So, you can lookup only by keys:
(kilter@127.0.0.1)1> Db = [{james,london},{fredrik,berlin},{fred,berlin}].
[{james,london},{fredrik,berlin},{fred,berlin}]
(kilter@127.0.0.1)2> proplists:lookup_all(berlin, Db).
[]
(kilter@127.0.0.1)3> proplists:lookup_all(fredrik, Db).
[{fredrik,berlin}]
You can use lists:filter and lists:map instead:
(kilter@127.0.0.1)7> lists:filter(fun ({K, V}) -> V =:= berlin end, Db).
[{fredrik,berlin},{fred,berlin}]
(kilter@127.0.0.1)8> lists:map(fun ({K,V}) -> K end, lists:filter(fun ({K, V}) -> V =:= berlin end, Db)).
[fredrik,fred]
So, finally
match(Element, Db) -> lists:map(
    fun ({K, _V}) -> K end,
    lists:filter(fun ({_K, V}) -> V =:= Element end, Db)
).
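The same filter-then-map can also be written as a single list comprehension, which many people find easier to read. This is just a sketch; the module and function names are made up.

```erlang
-module(reverse_lookup).
-export([keys_for/2]).

%% Returns every key whose value is exactly equal to Value,
%% in the order the entries appear in Db.
keys_for(Value, Db) ->
    [K || {K, V} <- Db, V =:= Value].
```

Entries that are not two-element tuples are silently skipped by the generator pattern, whereas the lists:filter version would crash on them with a function_clause error.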
proplists:lookup_all/2 takes as a first argument a key; in your example, berlin is a value and it's not a key therefore an empty list is returned.
Naturally, you can use recursion and find all the elements (meaning that you will use it like an ordinary list and not a proplist).
Another solution is to change the encoding scheme:
[{london,james},{berlin,fredrik},{berlin,fred}]
and then use proplists:lookup_all/2
The correct way to encode it depends on the way you will access the data (what kind of "queries" you will perform most); but unless you manipulate large amounts of data (in which case you might want to use some other datastructure) it isn't really worth analyzing.

Creating a valid function declaration from a complex tuple/list structure

Is there a generic way, given a complex object in Erlang, to come up with a valid function declaration for it besides eyeballing it? I'm maintaining some code previously written by someone who was a big fan of giant structures, and it's proving to be error prone doing it manually.
I don't need to iterate the whole thing, just grab the top level, per se.
For example, I'm working on this right now -
[[["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"],
[{'Via',
[{'via-parm',
{'sent-protocol',"SIP","2.0","UDP"},
{'sent-by',"172.20.10.5","5060"},
[{'via-branch',"z9hG4bKb561e4f03a40c4439ba375b2ac3c9f91.0"}]}]},
{'Via',
[{'via-parm',
{'sent-protocol',"SIP","2.0","UDP"},
{'sent-by',"172.20.10.15","5060"},
[{'via-branch',"12dee0b2f48309f40b7857b9c73be9ac"}]}]},
{'From',
{'from-spec',
{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"003018CFE4EF"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[{tag,"b7226ffa86c46af7bf6e32969ad16940"}]}},
{'To',
{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"3966"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[{tag,"a830c764"}]},
{'Call-ID',"90df0e4968c9a4545a009b1adf268605@172.20.10.15"},
{'CSeq',1358286,"SUBSCRIBE"},
["date",'HCOLON',
["Mon",44,32,["13",32,"Jun",32,"2011"],32,["17",58,"03",58,"55"],32,"GMT"]],
{'Contact',
[[{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"3ComCallProcessor"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[]],
[]]},
["expires",'HCOLON',3600],
["user-agent",'HCOLON',
["3Com",[]],
[['LWS',["VCX",[]]],
['LWS',["7210",[]]],
['LWS',["IP",[]]],
['LWS',["CallProcessor",[['SLASH',"v10.0.8"]]]]]],
["proxy-authenticate",'HCOLON',
["Digest",'LWS',
["realm",'EQUAL',['SWS',34,"3Com",34]],
[['COMMA',["domain",'EQUAL',['SWS',34,"3Com",34]]],
['COMMA',
["nonce",'EQUAL',
['SWS',34,"btbvbsbzbBbAbwbybvbxbCbtbzbubqbubsbqbtbsbqbtbxbCbxbsbybs",
34]]],
['COMMA',["stale",'EQUAL',"FALSE"]],
['COMMA',["algorithm",'EQUAL',"MD5"]]]]],
{'Content-Length',0}],
"\r\n",
["\n"]]
Maybe https://github.com/etrepum/kvc
I noticed your clarifying comment. I'd prefer to add a comment myself, but don't have enough karma. Anyway, the trick I use for that is to experiment in the shell. I'll iterate a pattern against a sample data structure until I've found the simplest form. You can use the _ match-all variable. I use an erlang shell inside an emacs shell window.
First, bind a sample to a variable:
A = [{a,b},[{c,d}, {e,f}]].
Now set the original structure against the variable:
[{a,b},[{c,d},{e,f}]] = A.
If you hit enter, you'll see they match. Hit alt-p (forget what emacs calls alt, but it's alt on my keyboard) to bring back the previous line. Replace some tuple or list item with an underscore:
[_,[{c,d},{e,f}]] = A.
Hit enter to make sure you did it right and they still match. This example is trivial, but for deeply nested, multiline structures it's trickier, so it's handy to be able to just quickly match to test. Sometimes you'll want to try to guess at whole huge swaths, like using an underscore to match a tuple list inside a tuple that's the third element of a list. If you place it right, you can match the whole thing at once, but it's easy to misread it.
Anyway, repeat to explore the essential shape of the structure and place real variables where you want to pull out values:
[_, [_, _]] = A.
[_, _] = A.
[_, MyTupleList] = A. %% let's grab this tuple list
[{MyAtom,b}, [{c,d}, MyTuple]] = A. %% or maybe we want this atom and tuple
That's how I efficiently dissect and pattern match complex data structures.
However, I don't know what you're doing. I'd be inclined to have a wrapper function that uses KVC to pull out exactly what you need and then distributes to helper functions from there for each type of structure.
If I understand you correctly you want to pattern match some large datastructures of unknown formatting.
Example:
Input: {a, b} {a,b,c,d} {a,[],{},{b,c}}
function({A, B}) -> do_something;
function({A, B, C, D}) when is_atom(B) -> do_something_else;
function({A, B, C, D}) when is_list(B) -> more_doing.
The generic answer is of course that it is undecidable from just data to know how to categorize that data.
First you should probably be aware of iolists. They are created by functions such as io_lib:format/2 and in many other places in the code.
One example is that
[["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"]
will print as
SIP/2.0 407 Proxy Authentication Required
So, I'd start with flattening all those lists, using a function such as
flatten_io(List) when is_list(List) ->
    Flat = lists:map(fun flatten_io/1, List),
    maybe_flatten(Flat);
flatten_io(Tuple) when is_tuple(Tuple) ->
    list_to_tuple([flatten_io(Element) || Element <- tuple_to_list(Tuple)]);
flatten_io(Other) -> Other.
%% Flattens L only if it consists solely of byte-sized characters
%% and (one level of) lists of such characters.
maybe_flatten(L) when is_list(L) ->
    case lists:all(fun(Ch) when is_integer(Ch), Ch > 0, Ch < 256 -> true;
                      (List) when is_list(List) ->
                           lists:all(fun(X) -> is_integer(X) andalso X > 0 andalso X < 256 end, List);
                      (_) -> false
                   end, L) of
        true -> lists:flatten(L);
        false -> L
    end.
(Caveat: completely untested and quite inefficient. Will also crash for improper lists, but you shouldn't have those in your data structures anyway.)
On second thought, I can't help you. Any data structure that uses the atom 'COMMA' for a comma in a string should be taken out and shot.
You should be able to flatten those things as well and start to get a view of what you are looking at.
I know that this is not a complete answer. Hope it helps.
It's hard to recommend something for handling this.
Transforming all the structures into a saner and more minimal format looks like it's worth it. This depends mainly on the similarities in these structures.
Rather than having a special function for each of the 100, there must be some automatic reformatting that can be done, maybe even putting the parts into records.
Once you have records, it's much easier to write functions for them, since you don't need to know the actual number of elements in the record. More important: your code won't break when the number of elements changes.
To summarize: put a barrier between your code and the insanity of these structures by sanitizing them with the most generic code possible. It will probably be a mix of generic reformatting and structure-specific code.
As an example already visible in this struct: the 'name-addr' tuples look like they have a uniform structure. So you can recurse over your structures (over all elements of tuples and lists) and match for "things" that have a common structure like 'name-addr' and replace these with nice records.
To help with the eyeballing, you can write yourself helper functions along the lines of this example:
eyeball(List) when is_list(List) ->
    io:format("List with length ~b\n", [length(List)]);
eyeball(Tuple) when is_tuple(Tuple) ->
    io:format("Tuple with ~b elements\n", [tuple_size(Tuple)]).
So you would get output like this:
2> eyeball({a,b,c}).
Tuple with 3 elements
ok
3> eyeball([a,b,c]).
List with length 3
ok
Expanding this into a tool that is useful for your case is left as an exercise. You could handle multiple levels by recursing over the elements and indenting the output.
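As a starting point for that exercise, here is one possible sketch of the recursive, indenting variant. The module name, the describe/2 helper and the output wording are my own choices, and treating printable lists as strings is an assumption about what is useful for this kind of data.

```erlang
-module(eyeball_sketch).
-export([eyeball/1, describe/2]).

%% Prints an indented outline of a nested term.
eyeball(Term) ->
    io:put_chars(describe(Term, 0)).

%% Returns the outline as an iolist so it can also be inspected or tested.
describe([], Depth) ->
    [indent(Depth), "Empty list\n"];
describe(List, Depth) when is_list(List) ->
    case io_lib:printable_list(List) of
        true ->
            %% Looks like a string: show it instead of descending.
            [indent(Depth), io_lib:format("String ~p~n", [List])];
        false ->
            %% length/1 fails on improper lists; assumed absent here.
            [indent(Depth),
             io_lib:format("List with length ~b~n", [length(List)])
             | [describe(E, Depth + 1) || E <- List]]
    end;
describe(Tuple, Depth) when is_tuple(Tuple) ->
    [indent(Depth),
     io_lib:format("Tuple with ~b elements~n", [tuple_size(Tuple)])
     | [describe(E, Depth + 1) || E <- tuple_to_list(Tuple)]];
describe(Other, Depth) ->
    [indent(Depth), io_lib:format("~p~n", [Other])].

%% Two spaces per nesting level.
indent(Depth) ->
    lists:duplicate(2 * Depth, $\s).
```

For a structure as large as the SIP message above you would probably want to add a depth limit so that only the top levels are printed.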
Use pattern matching and functions that work on lists to extract only what you need.
Look at http://www.erlang.org/doc/man/lists.html:
keyfind, keyreplace, L = [H|T], ...

Querying mnesia Fragmented Tables using QLC returns wrong results

I am Josh from Uganda. I created a fragmented Mnesia table (64 fragments) and managed to populate it with up to 9948723 records. Each fragment was of type disc_copies, with two replicas.
Now, using QLC (query list comprehension) was too slow in searching for a record, and was returning inaccurate results.
I found out that the overhead comes from QLC using Mnesia's select function, which traverses the entire table in order to match records. I tried something else, shown below.
-define(ACCESS_MOD,mnesia_frag).
-define(DEFAULT_CONTEXT,transaction).
-define(NULL,'_').
-record(address,{tel,zip_code,email}).
-record(person,{name,sex,age,address = #address{}}).
match()-> Z = fun(Spec) -> mnesia:match_object(Spec) end,Z.
match_object(Pattern)->
Match = match(),
mnesia:activity(?DEFAULT_CONTEXT,Match,[Pattern],?ACCESS_MOD).
Trying this functionality gave me good results. But I found that I have to dynamically build patterns for every search that may be made in my stored procedures.
I decided to go through the hassle of doing this, so I wrote functions which dynamically build wild patterns for my records depending on which parameter is to be searched.
%% This below gives me the default pattern for all searches ::= {person,'_','_','_','_'}
pattern(Record_name)->
N = length(my_record_info(Record_name)) + 1,
erlang:setelement(1,erlang:make_tuple(N,?NULL),Record_name).
%% this finds the position of the provided value and places it in that
%% position while keeping '_' in the other positions.
%% The caller function can use this function recursively until
%% it has built the full search pattern of interest
pattern({Field,Value},Pattern_sofar)->
N = position(Field,my_record_info(element(1,Pattern_sofar))),
case N of
-1 -> Pattern_sofar;
Int when Int >= 1 -> erlang:setelement(N + 1,Pattern_sofar,Value);
_ -> Pattern_sofar
end.
my_record_info(Record_name)->
case Record_name of
staff_dynamic -> record_info(fields,staff_dynamic);
person -> record_info(fields,person);
_ -> []
end.
%% These below,help locate the position of an element in a list
%% returned by "-record_info(fields,person)"
position(_,[]) -> -1;
position(Value,List)->
find(lists:member(Value,List),Value,List,1).
find(false,_,_,_) -> -1;
find(true,V,[V|_],N)-> N;
find(true,V,[_|X],N)->
find(V,X,N + 1).
find(V,[V|_],N)-> N;
find(V,[_|X],N) -> find(V,X,N + 1).
This was working very well, though it was computationally intensive.
It could still work even after changing the record definition, since it picks up the new record info at compile time.
The problem is that when I start even 25 processes on a 3.0 GHz Pentium 4 processor running WinXP, it hangs and takes a long time to return results.
If I am to use QLC on these fragments, then to get accurate results I have to specify which fragment to search in, like this:
find_person_by_tel(Tel)->
select(qlc:q([ X || X <- mnesia:table(Frag), (X#person.address)#address.tel == Tel])).
select(Q)->
case ?transact(fun() -> qlc:e(Q) end) of
{atomic,Val} -> Val;
{aborted,_} = Error -> report_mnesia_event(Error)
end.
QLC was returning [] when I searched for something, yet when I use match_object/1 I get accurate results. I found that using match expressions can help:
mnesia:table(Tab,Props).
where Props is a data structure that defines the match expression, the chunk size of return values, etc.
I ran into a problem when I tried building match expressions dynamically.
The function mnesia:read/1 or mnesia:read/2 requires that you have the primary key.
Now I am asking myself: how can I efficiently use QLC to search for records in a large fragmented table? Please help.
I know that using the tuple representation of records makes code hard to upgrade. This is why I hate using mnesia:select/1 and mnesia:match_object/1, and I want to stick to QLC. QLC is giving me wrong results in my queries from a Mnesia table of 64 fragments, even on the same node.
Has anyone ever used QLC to query a fragmented table? Please help.
Do you invoke the qlc in the activity context?
tfn_match(Id) ->
    Search = #person{address=#address{tel=Id, _ = '_'}, _ = '_'},
    trans(fun() -> mnesia:match_object(Search) end).
tfn_qlc(Id) ->
    Q = qlc:q([ X || X <- mnesia:table(person), (X#person.address)#address.tel == Id]),
    trans(fun() -> qlc:e(Q) end).
trans(Fun) ->
    try Res = mnesia:activity(transaction, Fun, mnesia_frag),
        {atomic, Res}
    catch exit:Error ->
        {aborted, Error}
    end.
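If you do need the dynamically built patterns from the question (for example when the field to search on is only known at runtime), the pattern/1, pattern/2 and position/2 helpers can be folded into something shorter. This is only a sketch: the module name, build_pattern/2 and the flat #person record are illustrative, and it does not touch Mnesia at all.

```erlang
-module(pattern_sketch).
-export([build_pattern/2]).

%% Simplified stand-in for the question's #person record.
-record(person, {name, sex, age, address}).

%% One clause per record that should be searchable at runtime.
fields(person) -> record_info(fields, person);
fields(_) -> [].

%% Builds {RecordName, '_', ...} and fills in each {Field, Value} pair.
%% Unknown fields are ignored, as in the question's pattern/2.
build_pattern(RecordName, Constraints) ->
    Fields = fields(RecordName),
    Wild = erlang:make_tuple(length(Fields) + 1, '_', [{1, RecordName}]),
    lists:foldl(fun({Field, Value}, Pattern) ->
                        case index_of(Field, Fields, 1) of
                            not_found -> Pattern;
                            N -> setelement(N + 1, Pattern, Value)
                        end
                end,
                Wild,
                Constraints).

%% 1-based position of Field in the field list, or not_found.
index_of(_Field, [], _N) -> not_found;
index_of(Field, [Field | _], N) -> N;
index_of(Field, [_ | Rest], N) -> index_of(Field, Rest, N + 1).
```

The resulting tuple can be passed to mnesia:match_object/1 inside the trans/1 wrapper above, in the same way as the Search record in tfn_match/1.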

Resources