What is the Erlang way to do stream manipulations? - stream

Suppose I wanted to do something like:
dict
.values()
.map(fun scrub/1)
.flatMap(fun split/1)
.groupBy(fun keyFun/1, fun count/1)
.to_dict()
What is the most elegant way to achieve this in Erlang?

There is no direct, easy way to do that. Every attempt I have seen looked worse than straightforward composition. If you look at the majority of open source projects in Erlang, you will find that they use plain function composition. Reusing your example:
to_dict(
groupBy(fun keyFun/1, fun count/1,
flatMap(fun split/1,
map(fun scrub/1,
values(dict))))).
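If the nesting gets hard to read, one common workaround is to fold the input through a list of funs. The following is only a minimal sketch: pipe/2, scrub/1, split/1 and count_by_key/1 are hypothetical stand-ins made up for this example, not functions from OTP or from the question.

```erlang
-module(pipe_sketch).
-export([pipe/2, demo/0]).

%% Thread Input through each fun in Funs, left to right.
pipe(Input, Funs) ->
    lists:foldl(fun(F, Acc) -> F(Acc) end, Input, Funs).

%% Hypothetical stand-ins for the question's scrub/1 and split/1.
scrub(S) -> string:trim(S).
split(S) -> string:lexemes(S, " ").

%% Hypothetical grouping step: count how often each item occurs.
count_by_key(Items) ->
    lists:foldl(fun(K, Acc) ->
                        maps:update_with(K, fun(N) -> N + 1 end, 1, Acc)
                end, #{}, Items).

demo() ->
    Dict = #{a => " x y ", b => "y z"},
    pipe(Dict, [fun maps:values/1,
                fun(Vs) -> lists:map(fun scrub/1, Vs) end,
                fun(Vs) -> lists:flatmap(fun split/1, Vs) end,
                fun count_by_key/1]).
```

Calling pipe_sketch:demo() should give a map with "x" and "z" counted once and "y" counted twice. Whether this reads better than plain nesting is a matter of taste.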

This isn't a construct that's natural in Erlang. If you have a couple functions, regular composition is what I'd use:
lists:flatten(lists:map(fun (A) ->
do_stuff(A)
end,
generate_list())).
For a longer series of operations, intermediary variables:
Dict = #{hello => world, ...},
Values = maps:values(Dict),
ScrubbedValues = lists:map(fun scrub/1, Values),
SplitValues = lists:flatten(lists:map(fun split/1, ScrubbedValues)),
GroupedValues = basil_lists:group_by(fun keyFun/1, fun count/1, SplitValues),
Dict2 = maps:from_list(GroupedValues).
That's how it'd look if you wanted all of those operations grouped in one shot together.
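Note that basil_lists:group_by/3 above is not part of OTP. A minimal sketch of what such a helper could look like, assuming it takes a key fun, a fun applied to each group, and a list, and returns a list of {Key, Result} pairs (the module name and the exact contract are assumptions for illustration):

```erlang
-module(group_sketch).
-export([group_by/3, demo/0]).

%% Group List by KeyFun, then apply ValueFun to each group's items.
%% Returns [{Key, ValueFun(ItemsWithThatKey)}].
group_by(KeyFun, ValueFun, List) ->
    %% foldr keeps the items of each group in their original order
    Groups = lists:foldr(
               fun(Item, Acc) ->
                       maps:update_with(KeyFun(Item),
                                        fun(Items) -> [Item | Items] end,
                                        [Item],
                                        Acc)
               end, #{}, List),
    [{Key, ValueFun(Items)} || {Key, Items} <- maps:to_list(Groups)].

demo() ->
    %% Group strings by length and count the members of each group.
    Grouped = group_by(fun erlang:length/1, fun erlang:length/1,
                       ["a", "bb", "cc", "d"]),
    maps:from_list(Grouped).
```

Here group_sketch:demo() groups four strings by length, so both the length-1 and the length-2 group end up with a count of 2.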
However, I'd more likely write this in a different way:
-spec remap_values(map()) -> map().
remap_values(Map) ->
    map_values(maps:values(Map)).
-spec map_values(list()) -> map().
map_values(Values) ->
    map_values(Values, [], []).
-spec map_values(list(), list(), list()) -> map().
map_values([], OutList, OutGroup) ->
    %% Base case: transform into a map
    Grouped = lists:zip(OutGroup, OutList),
    lists:foldl(fun ({Group, Element}, Acc) ->
                        %% Prepend to the group's list, or start a new one.
                        %% (A map key in a fun head must already be bound,
                        %% so we cannot match on #{Group := Existing} there.)
                        maps:update_with(Group,
                                         fun (Existing) -> [Element | Existing] end,
                                         [Element],
                                         Acc)
                end,
                #{},
                Grouped);
map_values([First|Rest], OutList, OutGroup) ->
    %% Recursive case: process the first element and categorize the result
    Processed = split(scrub(First)),
    Categories = lists:map(fun categorize/1, Processed),
    map_values(Rest, OutList ++ Processed, OutGroup ++ Categories).
The correct implementation depends a lot on how the code is going to be run. What I've written here is pretty simple, but might not perform well on large amounts of data (appending with ++ at every step copies the accumulated lists). If you're actually looking to process an endless stream of data you'll need to write that yourself (though you may find gen_server to be a very useful framework for doing so).

Related

Extract values from list of tuples continuously in Erlang

I am learning Erlang from a Ruby background and having some difficulty grasping the thought process. The problem I am trying to solve is the following:
I need to make the same request to an API repeatedly. Each response contains a unique ID that I need to pass into the next request, until no ID is returned. From each response I also need to extract certain data and use it for other things.
First get the iterator:
ShardIteratorResponse = kinetic:get_shard_iterator(GetShardIteratorPayload).
{ok,[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89D"...>>}]}
Parse out the shard_iterator..
{_, [{_, ShardIterator}]} = ShardIteratorResponse.
Make the request to kinesis for the streams records...
GetRecordsPayload = [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89DETABlwVV"...>>}]
14> RecordsResponse = kinetic:get_records(GetRecordsPayload).
{ok,[{<<"NextShardIterator">>,
<<"AAAAAAAAAAFy3dnTJYkWr3gq0CGo3hkj1t47ccUS10f5nADQXWkBZaJvVgTMcY+nZ9p4AZCdUYVmr3dmygWjcMdugHLQEg6x"...>>},
{<<"Records">>,
[{[{<<"Data">>,<<"Zmlyc3QgcmVjb3JkISEh">>},
{<<"PartitionKey">>,<<"BlanePartitionKey">>},
{<<"SequenceNumber">>,
<<"49545722516689138064543799042897648239478878787235479554">>}]}]}]}
What I am struggling with is how to write a loop that keeps hitting the Kinesis endpoint for that stream until there are no more shard iterators (that is, I want all records), since I can't re-assign the variables as I would in Ruby.
WARNING: My code might be buggy, but it's "close". I've never run it, and I don't know what the last iterator looks like.
I see you are trying to do the whole job in the shell. That's possible but hard. You can use a named fun and recursion (named funs are available since release 17.0, which makes this easier), for example:
F = fun (ShardIteratorPayload) ->
        {_, [{_, FirstIterator}]} = kinetic:get_shard_iterator(ShardIteratorPayload),
        FunLoop =
            fun Loop(<<>>, Accumulator) -> % a guess: I don't know what the last iterator looks like
                    lists:reverse(Accumulator);
                Loop(Iterator, Accumulator) ->
                    {ok, [{_, NextShardIterator}, {<<"Records">>, Records}]} =
                        kinetic:get_records([{<<"ShardIterator">>, <<Iterator/binary>>}]),
                    Loop(NextShardIterator, [Records | Accumulator])
            end,
        FunLoop(FirstIterator, [])
    end.
AllRecords = F(GetShardIteratorPayload).
But that's too complicated to type in the shell...
It's much easier to write it in modules.
A common pattern in Erlang is to spawn another process (or several) to fetch your data. To keep it simple, you can spawn another process by calling spawn or spawn_link; don't bother with links for now and just use spawn/3.
Let's compile a simple consumer module:
-module(kinetic_simple_consumer).
-export([start/1]).
start(GetShardIteratorPayload) ->
Pid = spawn(kinetic_simple_fetcher, start, [self(), GetShardIteratorPayload]),
consumer_loop(Pid).
consumer_loop(FetcherPid) ->
receive
{FetcherPid, finished} ->
ok;
{FetcherPid, {records, Records}} ->
consume(Records),
consumer_loop(FetcherPid);
UnexpectedMsg ->
io:format("DROPPING:~n~p~n", [UnexpectedMsg]),
consumer_loop(FetcherPid)
end.
consume(Records) ->
io:format("RECEIVED:~n~p~n",[Records]).
And fetcher:
-module(kinetic_simple_fetcher).
-export([start/2]).
start(ConsumerPid, GetShardIteratorPayload) ->
{ok, [ShardIterator]} = kinetic:get_shard_iterator(GetShardIteratorPayload),
fetcher_loop(ConsumerPid, ShardIterator).
fetcher_loop(ConsumerPid, {_, <<>>}) -> % no clue how last iterator can look like
ConsumerPid ! {self(), finished};
fetcher_loop(ConsumerPid, ShardIterator) ->
{ok, [NextShardIterator, {<<"Records">>, Records}]} =
kinetic:get_records(shard_iterator(ShardIterator)),
ConsumerPid ! {self(), {records, Records}},
fetcher_loop(ConsumerPid, NextShardIterator).
shard_iterator({_, ShardIterator}) ->
[{<<"ShardIterator">>, <<ShardIterator/binary>>}].
As you can see both processes can do their job concurrently.
Try from your shell:
kinetic_simple_consumer:start(GetShardIteratorPayload).
Now you will see that your shell process turns into the consumer, and you will get your shell back after the fetcher sends {ItsPid, finished}.
Next time instead of
kinetic_simple_consumer:start(GetShardIteratorPayload).
run:
spawn(kinetic_simple_consumer, start, [GetShardIteratorPayload]).
You should play with spawning processes: it's Erlang's main strength.
In Erlang, you can write loops using tail-recursive functions. I don't know the kinetic API, so for simplicity I just assume that kinetic:next_iterator/1 returns {ok, NextIterator}, or {error, Reason} when there are no more shards.
loop({error, _Reason}) ->
ok;
loop({ok, Iterator}) ->
do_something_with(Iterator),
Result = kinetic:next_iterator(Iterator),
loop(Result).
You are replacing iteration with recursion. The first clause deals with the case where there are no more shards left (always start a recursion with the end condition). The second clause deals with the case where we got an iterator: we do something with it and fetch the next one.
The recursive call is the last instruction in the function body, which is called tail recursion. Erlang optimizes such calls: they don't grow the call stack, so they can run indefinitely in constant memory (you will not get anything like "Stack level too deep").
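To also collect the records along the way, the same tail-recursive shape can carry an accumulator. This is a sketch against a made-up paging interface, not the kinetic API: FetchFun and the atom done as the end marker are assumptions for illustration.

```erlang
-module(paging_sketch).
-export([fetch_all/2, demo/0]).

%% FetchFun(Iterator) must return {Records, NextIterator}, where
%% NextIterator is the atom 'done' once there is nothing left (assumed).
fetch_all(FetchFun, Iterator) ->
    fetch_all(FetchFun, Iterator, []).

%% End condition first: put the collected pages back in order.
fetch_all(_FetchFun, done, Acc) ->
    lists:append(lists:reverse(Acc));
%% Otherwise fetch one page and recurse in tail position.
fetch_all(FetchFun, Iterator, Acc) ->
    {Records, Next} = FetchFun(Iterator),
    fetch_all(FetchFun, Next, [Records | Acc]).

demo() ->
    %% A fake three-page API backed by a map.
    Pages = #{1 => {[a, b], 2}, 2 => {[c], 3}, 3 => {[], done}},
    fetch_all(fun(I) -> maps:get(I, Pages) end, 1).
```

With the fake pages above, paging_sketch:demo() walks iterators 1, 2 and 3 and returns the records of all pages as one list. Pages are prepended to the accumulator and reversed once at the end, which avoids repeated list copying.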

Creating a valid function declaration from a complex tuple/list structure

Is there a generic way, given a complex object in Erlang, to come up with a valid function declaration for it besides eyeballing it? I'm maintaining some code previously written by someone who was a big fan of giant structures, and it's proving to be error prone doing it manually.
I don't need to iterate the whole thing, just grab the top level, per se.
For example, I'm working on this right now -
[[["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"],
[{'Via',
[{'via-parm',
{'sent-protocol',"SIP","2.0","UDP"},
{'sent-by',"172.20.10.5","5060"},
[{'via-branch',"z9hG4bKb561e4f03a40c4439ba375b2ac3c9f91.0"}]}]},
{'Via',
[{'via-parm',
{'sent-protocol',"SIP","2.0","UDP"},
{'sent-by',"172.20.10.15","5060"},
[{'via-branch',"12dee0b2f48309f40b7857b9c73be9ac"}]}]},
{'From',
{'from-spec',
{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"003018CFE4EF"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[{tag,"b7226ffa86c46af7bf6e32969ad16940"}]}},
{'To',
{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"3966"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[{tag,"a830c764"}]},
{'Call-ID',"90df0e4968c9a4545a009b1adf268605#172.20.10.15"},
{'CSeq',1358286,"SUBSCRIBE"},
["date",'HCOLON',
["Mon",44,32,["13",32,"Jun",32,"2011"],32,["17",58,"03",58,"55"],32,"GMT"]],
{'Contact',
[[{'name-addr',
[[]],
{'SIP-URI',
[{userinfo,{user,"3ComCallProcessor"},[]}],
{hostport,"172.20.10.11",[]},
{'uri-parameters',[]},
[]}},
[]],
[]]},
["expires",'HCOLON',3600],
["user-agent",'HCOLON',
["3Com",[]],
[['LWS',["VCX",[]]],
['LWS',["7210",[]]],
['LWS',["IP",[]]],
['LWS',["CallProcessor",[['SLASH',"v10.0.8"]]]]]],
["proxy-authenticate",'HCOLON',
["Digest",'LWS',
["realm",'EQUAL',['SWS',34,"3Com",34]],
[['COMMA',["domain",'EQUAL',['SWS',34,"3Com",34]]],
['COMMA',
["nonce",'EQUAL',
['SWS',34,"btbvbsbzbBbAbwbybvbxbCbtbzbubqbubsbqbtbsbqbtbxbCbxbsbybs",
34]]],
['COMMA',["stale",'EQUAL',"FALSE"]],
['COMMA',["algorithm",'EQUAL',"MD5"]]]]],
{'Content-Length',0}],
"\r\n",
["\n"]]
Maybe https://github.com/etrepum/kvc
I noticed your clarifying comment. I'd prefer to add a comment myself, but don't have enough karma. Anyway, the trick I use for that is to experiment in the shell. I'll iterate a pattern against a sample data structure until I've found the simplest form. You can use the _ match-all variable. I use an erlang shell inside an emacs shell window.
First, bind a sample to a variable:
A = [{a,b},[{c,d}, {e,f}]].
Now set the original structure against the variable:
[{a,b},[{c,d},{e,f}]] = A.
If you hit enter, you'll see they match. Hit alt-p (forget what emacs calls alt, but it's alt on my keyboard) to bring back the previous line. Replace some tuple or list item with an underscore:
[_,[{c,d},{e,f}]] = A.
Hit enter to make sure you did it right and they still match. This example is trivial, but for deeply nested, multiline structures it's trickier, so it's handy to be able to just quickly match to test. Sometimes you'll want to try to guess at whole huge swaths, like using an underscore to match a tuple list inside a tuple that's the third element of a list. If you place it right, you can match the whole thing at once, but it's easy to misread it.
Anyway, repeat to explore the essential shape of the structure and place real variables where you want to pull out values:
[_, [_, _]] = A.
[_, _] = A.
[_, MyTupleList] = A. %% let's grab this tuple list
[{MyAtom,b}, [{c,d}, MyTuple]] = A. %% or maybe we want this atom and tuple
That's how I efficiently dissect and pattern match complex data structures.
However, I don't know what you're doing. I'd be inclined to have a wrapper function that uses KVC to pull out exactly what you need and then distributes to helper functions from there for each type of structure.
If I understand you correctly you want to pattern match some large datastructures of unknown formatting.
Example:
Input: {a, b} {a,b,c,d} {a,[],{},{b,c}}
function({A, B}) -> do_something;
function({A, B, C, D}) when is_atom(B) -> do_something_else;
function({A, B, C, D}) when is_list(B) -> more_doing.
The generic answer is of course that it is undecidable from just data to know how to categorize that data.
First you should probably be aware of iolists. They are created by functions such as io_lib:format/2 and in many other places in the code.
One example is that
[["SIP",47,"2",46,"0"],32,"407",32,"Proxy Authentication Required","\r\n"]
will print as
SIP/2.0 407 Proxy Authentication Required
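This is easy to check in a small module. The sketch below only uses lists:flatten/1 and iolist_to_binary/1 on the status line from the question; the module name is made up for the example.

```erlang
-module(iolist_sketch).
-export([demo/0]).

demo() ->
    %% The first element of the structure from the question: an iolist.
    StatusLine = [["SIP",47,"2",46,"0"],32,"407",32,
                  "Proxy Authentication Required","\r\n"],
    %% Either flatten it to a string or turn it into a binary.
    {lists:flatten(StatusLine), iolist_to_binary(StatusLine)}.
```

Both elements of the returned tuple hold the text "SIP/2.0 407 Proxy Authentication Required" followed by CRLF, once as a string and once as a binary.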
So, I'd start with flattening all those lists, using a function such as
flatten_io(List) when is_list(List) ->
    Flat = lists:map(fun flatten_io/1, List),
    maybe_flatten(Flat);
flatten_io(Tuple) when is_tuple(Tuple) ->
    list_to_tuple([flatten_io(Element) || Element <- tuple_to_list(Tuple)]);
flatten_io(Other) -> Other.
%% Flatten only when every element is a byte or a list of bytes
maybe_flatten(L) when is_list(L) ->
    case lists:all(fun(Ch) when is_integer(Ch), Ch > 0, Ch < 256 -> true;
                      (List) when is_list(List) ->
                           lists:all(fun(X) -> is_integer(X) andalso X > 0 andalso X < 256 end, List);
                      (_) -> false
                   end, L) of
        true -> lists:flatten(L);
        false -> L
    end.
(Caveat: completely untested and quite inefficient. It will also crash on improper lists, but you shouldn't have those in your data structures anyway.)
On second thought, I can't help you. Any data structure that uses the atom 'COMMA' for a comma in a string should be taken out and shot.
You should be able to flatten those things as well and start to get a view of what you are looking at.
I know that this is not a complete answer. Hope it helps.
It's hard to recommend something for handling this.
Transforming all the structures into a saner and more minimal format looks like it's worth it. How well this works depends mainly on the similarities between these structures.
Rather than having a special function for each of the 100, there must be some automatic reformatting that can be done, maybe even putting the parts in records.
Once you have records, it's much easier to write functions for them, since you don't need to know the actual number of elements in the record. More important: your code won't break when the number of elements changes.
To summarize: put a barrier between your code and the insanity of these structures by sanitizing them with the most generic code possible. It will probably be a mix of generic reformatting and structure-specific code.
As an example already visible in this structure: the 'name-addr' tuples look like they have a uniform shape. So you can recurse over your structures (over all elements of tuples and lists), match "things" that have a common shape like 'name-addr', and replace these with nice records.
To help with the eyeballing, you can write yourself helper functions along the lines of this example:
eyeball(List) when is_list(List) ->
io:format("List with length ~b\n", [length(List)]);
eyeball(Tuple) when is_tuple(Tuple) ->
io:format("Tuple with ~b elements\n", [tuple_size(Tuple)]).
So you would get output like this:
2> eyeball({a,b,c}).
Tuple with 3 elements
ok
3> eyeball([a,b,c]).
List with length 3
ok
Expanding this into a useful tool for your use case is left as an exercise. You could handle multiple levels by recursing over the elements and indenting the output.
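One possible take on that exercise, as a sketch only: instead of printing, return a skeleton of the term down to a chosen depth, with everything deeper replaced by the atom '_'. The module and function names are invented here, strings are collapsed to the atom string, and proper lists are assumed.

```erlang
-module(eyeball_sketch).
-export([shape/2, demo/0]).

%% Below the requested depth, everything becomes '_'.
shape(_Term, 0) -> '_';
shape(Term, Depth) when is_tuple(Term) ->
    list_to_tuple([shape(E, Depth - 1) || E <- tuple_to_list(Term)]);
shape([], _Depth) -> [];
shape(Term, Depth) when is_list(Term) ->
    case io_lib:printable_list(Term) of
        true  -> string;                              % collapse strings
        false -> [shape(E, Depth - 1) || E <- Term]
    end;
shape(Term, _Depth) when is_atom(Term) -> Term;       % keep tags visible
shape(_Term, _Depth) -> '_'.

demo() ->
    A = [{a,b},[{c,d},{e,f}]],
    {shape(A, 2), shape(A, 3)}.
```

For the small sample, depth 2 shows only that the term is a list of a pair and a two-element list, while depth 3 also reveals the atoms a and b. Printing the result with ~p and removing the quotes around '_' gives a pattern that can be pasted into a function head.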
Use pattern matching and functions that work on lists to extract only what you need.
Look at http://www.erlang.org/doc/man/lists.html:
keyfind, keyreplace, L = [H|T], ...
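As a sketch of that advice applied to a cut-down version of the structure in the question: match only the top level (status line, header list, rest) and pick a single header out of the header list. The module name and the shortened sample data are made up for illustration.

```erlang
-module(sip_peek).
-export([call_id/1, demo/0]).

%% Match only the top level and ignore everything that is not needed.
call_id([_StatusLine, Headers | _Rest]) ->
    %% The generator pattern skips elements that are not {'Call-ID', _},
    %% including the list-shaped headers mixed in with the tuples.
    case [V || {'Call-ID', V} <- Headers] of
        [CallId | _] -> {ok, CallId};
        []           -> not_found
    end.

demo() ->
    Msg = [["SIP/2.0 407"],
           [{'CSeq', 1358286, "SUBSCRIBE"},
            ["expires", 'HCOLON', 3600],
            {'Call-ID', "90df0e49"}],
           "\r\n"],
    call_id(Msg).
```

sip_peek:demo() returns {ok, "90df0e49"}; a message without a Call-ID header gives not_found.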

Querying Mnesia Fragmented Tables using QLC returns wrong results

I'm Josh in Uganda. I created a Mnesia fragmented table (64 fragments) and managed to populate it with up to 9948723 records. Each fragment was of type disc_copies, with two replicas.
Using QLC (query list comprehensions) to search for a record was too slow, and it was returning inaccurate results.
I found out that the overhead comes from QLC using Mnesia's select function, which traverses the entire table in order to match records. I tried something else, shown below.
-define(ACCESS_MOD,mnesia_frag).
-define(DEFAULT_CONTEXT,transaction).
-define(NULL,'_').
-record(address,{tel,zip_code,email}).
-record(person,{name,sex,age,address = #address{}}).
match()-> Z = fun(Spec) -> mnesia:match_object(Spec) end,Z.
match_object(Pattern)->
Match = match(),
mnesia:activity(?DEFAULT_CONTEXT,Match,[Pattern],?ACCESS_MOD).
Trying this gave me good results. But I found that I have to dynamically build patterns for every search that may be made in my stored procedures.
I decided to go through the trouble of doing this, so I wrote functions that dynamically build wildcard patterns for my records depending on which parameter is to be searched.
%% This below gives me the default pattern for all searches ::= {person,'_','_','_','_'}
pattern(Record_name)->
N = length(my_record_info(Record_name)) + 1,
erlang:setelement(1,erlang:make_tuple(N,?NULL),Record_name).
%% this finds the position of the provided value and places it in that
%% position while keeping '_' in the other positions.
%% The caller function can use this function recursively until
%% it has built the full search pattern of interest
pattern({Field,Value},Pattern_sofar)->
N = position(Field,my_record_info(element(1,Pattern_sofar))),
case N of
-1 -> Pattern_sofar;
Int when Int >= 1 -> erlang:setelement(N + 1,Pattern_sofar,Value);
_ -> Pattern_sofar
end.
my_record_info(Record_name)->
case Record_name of
staff_dynamic -> record_info(fields,staff_dynamic);
person -> record_info(fields,person);
_ -> []
end.
%% These below,help locate the position of an element in a list
%% returned by "-record_info(fields,person)"
position(_,[]) -> -1;
position(Value,List)->
find(lists:member(Value,List),Value,List,1).
find(false,_,_,_) -> -1;
find(true,V,[V|_],N)-> N;
find(true,V,[_|X],N)->
find(V,X,N + 1).
find(V,[V|_],N)-> N;
find(V,[_|X],N) -> find(V,X,N + 1).
This was working very well, though it was computationally intensive.
It would still work after changing the record definition, since it picks up the new record info at compile time.
The problem is that when I start even 25 processes on a 3.0 GHz Pentium 4 processor running WinXP, it hangs and takes a long time to return results.
If I am to use QLC on these fragments and get accurate results, I have to specify which fragment to search in, like this:
find_person_by_tel(Tel)->
select(qlc:q([ X || X <- mnesia:table(Frag), (X#person.address)#address.tel == Tel])).
select(Q)->
case ?transact(fun() -> qlc:e(Q) end) of
{atomic,Val} -> Val;
{aborted,_} = Error -> report_mnesia_event(Error)
end.
QLC was returning [] when I searched for something, yet when I use match_object/1 I get accurate results. I found that using match expressions can help.
mnesia:table(Tab,Props).
where Props is a data structure that defines the match expression, the chunk size of return values, etc.
I ran into a problem when I tried building match expressions dynamically.
The functions mnesia:read/1 and mnesia:read/2 require that you have the primary key.
Now I'm asking myself: how can I efficiently use QLC to search for records in a large fragmented table? Please help.
I know that using the tuple representation of records makes code hard to upgrade. This is why I hate using mnesia:select/1 and mnesia:match_object/1, and I want to stick to QLC. QLC is giving me wrong results in my queries on a Mnesia table of 64 fragments, even on the same node.
Has anyone ever used QLC to query a fragmented table? Please help.
Do you invoke the qlc in the activity context?
tfn_match(Id) ->
Search = #person{address=#address{tel=Id, _ = '_'}, _ = '_'},
trans(fun() -> mnesia:match_object(Search) end).
tfn_qlc(Id) ->
Q = qlc:q([ X || X <- mnesia:table(person), (X#person.address)#address.tel == Id]),
trans(fun() -> qlc:e(Q) end).
trans(Fun) ->
try Res = mnesia:activity(transaction, Fun, mnesia_frag),
{atomic, Res}
catch exit:Error ->
{aborted, Error}
end.

How to turn a string with a valid Erlang expression into an abstract syntax tree (AST)?

I would like to convert a string containing a valid Erlang expression to its abstract syntax tree representation, without any success so far.
Below is an example of what I would like to do. After compiling, calling z:z(). generates the module zed, and calling zed:zed(). then returns the result of applying lists:reverse to the given list.
-module(z).
-export([z/0]).
z() ->
ModuleAST = erl_syntax:attribute(erl_syntax:atom(module),
[erl_syntax:atom("zed")]),
ExportAST = erl_syntax:attribute(erl_syntax:atom(export),
[erl_syntax:list(
[erl_syntax:arity_qualifier(
erl_syntax:atom("zed"),
erl_syntax:integer(0))])]),
%ListAST = ?(String), % This is where I would put my AST
ListAST = erl_syntax:list([erl_syntax:integer(1), erl_syntax:integer(2)]),
FunctionAST = erl_syntax:function(erl_syntax:atom("zed"),
[erl_syntax:clause(
[], none,
[erl_syntax:application(
erl_syntax:atom(lists),
erl_syntax:atom(reverse),
[ListAST]
)])]),
Forms = [erl_syntax:revert(AST) || AST <- [ModuleAST, ExportAST, FunctionAST]],
case compile:forms(Forms) of
{ok,ModuleName,Binary} -> code:load_binary(ModuleName, "z", Binary);
{ok,ModuleName,Binary,_Warnings} -> code:load_binary(ModuleName, "z", Binary)
end.
String could be "[1,2,3].", or "begin A=4, B=2+3, [A,B] end.", or anything alike.
(Note that this is just an example of what I would like to do, so evaluating String is not an option for me.)
EDIT:
Specifying ListAST as below generates a huge dict-digraph-error-monster, and says "internal error in lint_module".
String = "[1,2,3].",
{ok, Ts, _} = erl_scan:string(String),
{ok, ListAST} = erl_parse:parse_exprs(Ts),
EDIT2:
This solution works for simple terms:
{ok, Ts, _} = erl_scan:string(String),
{ok, Term} = erl_parse:parse_term(Ts),
ListAST = erl_syntax:abstract(Term),
In your EDIT example:
String = "[1,2,3].",
{ok, Ts, _} = erl_scan:string(String),
{ok, ListAST} = erl_parse:parse_exprs(Ts),
the ListAST is actually a list of ASTs, because parse_exprs, as the name indicates, parses multiple expressions (a comma-separated sequence terminated by a period). Since your string contained a single expression, you got a list of one element. All you need to do is match that out:
{ok, [ListAST]} = erl_parse:parse_exprs(Ts),
so it has nothing to do with erl_syntax (which accepts all erl_parse trees); it's just that you had an extra list wrapper around the ListAST, which caused the compiler to puke.
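To cover strings with several comma-separated expressions as well, one option is to wrap the resulting list in a block expression so that a single AST node comes back. This is a sketch under that assumption; expr_ast/1 is a name invented for the example, and erl_eval is used in demo/0 only to check the tree, not because evaluation is part of the approach.

```erlang
-module(str_ast).
-export([expr_ast/1, demo/0]).

%% Turn a string holding one or more expressions into a single AST node.
expr_ast(String) ->
    {ok, Tokens, _End} = erl_scan:string(String),
    case erl_parse:parse_exprs(Tokens) of
        {ok, [Single]} ->
            Single;
        {ok, Exprs} ->
            %% Several expressions: wrap them like begin ... end
            {block, erl_anno:new(1), Exprs}
    end.

demo() ->
    Ast = expr_ast("A=4, B=2+3, [A,B]."),
    %% Evaluate only to confirm that the tree is well formed.
    {value, Value, _Bindings} = erl_eval:expr(Ast, erl_eval:new_bindings()),
    Value.
```

str_ast:demo() evaluates the wrapped block and returns [4,5]. In the module from the question, the result of expr_ast/1 could be used where ListAST is expected.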
Some comments off the top of my head.
I have not really used the erl_syntax libraries but I do think they make it difficult to read and "see" what you are trying to build. I would probably import the functions or define my own API to make it shorter and more legible. But then I generally tend to prefer shorter function and variable names.
The AST created by erl_syntax and the "standard" one created by erl_parse and used in the compiler are different and cannot be mixed. So you have to choose one of them and stick with it.
The example in your second EDIT will work for terms but not in the more general case:
{ok, Ts, _} = erl_scan:string(String),
{ok, Term} = erl_parse:parse_term(Ts),
ListAST = erl_syntax:abstract(Term),
This is because erl_parse:parse_term/1 returns the actual term represented by the tokens, while the other erl_parse functions, parse_form and parse_exprs, return ASTs. Putting those into erl_syntax:abstract will do funny things.
Depending on what you are trying to do, it might be easier to write out an Erlang file and compile it rather than working directly with the abstract forms. This goes against my ingrained feelings, but generating Erlang ASTs is not trivial. What type of code do you intend to produce?
<shameless_plug>
If you are not scared of lists you might try using LFE (Lisp Flavoured Erlang) to generate code. As with all Lisps, there is no special abstract form; it's all homoiconic and much easier to work with.
</shameless_plug>
Zoltan
This is how we get the AST:
11> String = "fun() -> io:format(\"blah~n\") end.".
"fun() -> io:format(\"blah~n\") end."
12> {ok, Tokens, _} = erl_scan:string(String).
{ok,[{'fun',1},
{'(',1},
{')',1},
{'->',1},
{atom,1,io},
{':',1},
{atom,1,format},
{'(',1},
{string,1,"blah~n"},
{')',1},
{'end',1},
{dot,1}],
1}
13> {ok, AbsForm} = erl_parse:parse_exprs(Tokens).
{ok,[{'fun',1,
{clauses,[{clause,1,[],[],
[{call,1,
{remote,1,{atom,1,io},{atom,1,format}},
[{string,1,"blah~n"}]}]}]}}]}
14>

Erlang and run-time record limitations

I'm developing an Erlang system and having recurring problems with the fact that records are (almost) compile-time pre-processor macros and can't be manipulated at runtime...
Basically, I'm working with a property pattern, where properties are added at run-time to objects on the front-end (AS3). Ideally, I would reflect this with a list on the Erlang side, since it's a fundamental data type, but then using records in QLC (to query ETS tables) would not be possible, since to use them I have to say specifically which record property I want to query over... I have at least 15 columns in the largest table, so listing them all in one huge switch statement (case X of) is just plain ugly.
Does anyone have any ideas on how to solve this elegantly? Maybe some built-in functions for creating tuples with appropriate signatures for use in pattern matching (for QLC)?
thanks
It sounds like you want to be able to do something like get_record_field(Field, SomeRecord) where Field is determined at runtime by user interface code say.
You're right in that you can't do this in standard erlang as records and the record_info function are expanded and eliminated at compile time.
There are a couple of solutions that I've used or looked at. My solution is as follows: (the example gives runtime access to the #dns_rec and #dns_rr records from inet_dns.hrl)
%% Retrieves the value stored in the record Rec in field Field.
info(Field, Rec) ->
Fields = fields(Rec),
info(Field, Fields, tl(tuple_to_list(Rec))).
info(_Field, _Fields, []) -> erlang:error(bad_record);
info(_Field, [], _Rec) -> erlang:error(bad_field);
info(Field, [Field | _], [Val | _]) -> Val;
info(Field, [_Other | Fields], [_Val | Values]) -> info(Field, Fields, Values).
%% The fields function provides the list of field positions
%% for all the kinds of record you want to be able to query
%% at runtime. You'll need to modify this to use your own records.
fields(#dns_rec{}) -> fields(dns_rec);
fields(dns_rec) -> record_info(fields, dns_rec);
fields(#dns_rr{}) -> fields(dns_rr);
fields(dns_rr) -> record_info(fields, dns_rr).
%% Turns a record into a proplist suitable for use with the proplists module.
to_proplist(R) ->
Keys = fields(R),
Values = tl(tuple_to_list(R)),
lists:zip(Keys,Values).
A version of this that compiles is available here: rec_test.erl
You can also extend this dynamic field lookup to dynamic generation of matchspecs for use with ets:select/2 or mnesia:select/2 as shown below:
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ V || #RecordKind{MatchField=V} <- mnesia:table(RecordKind) ]
match(MatchField, RecordKind) ->
    MatchTuple = match_tuple(MatchField, RecordKind),
    {MatchTuple, [], ['$1']}.
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ T || T <- mnesia:table(RecordKind),
%% T#RecordKind.Field =:= MatchValue]
match(MatchField, MatchValue, RecordKind) ->
    MatchTuple = match_tuple(MatchField, RecordKind),
    %% '$_' returns the whole matching record T
    {MatchTuple, [{'=:=', '$1', MatchValue}], ['$_']}.
%% Generates a matchspec that does something like this
%% QLC pseudocode: [ T#RecordKind.ReturnField
%% || T <- mnesia:table(RecordKind),
%% T#RecordKind.MatchField =:= MatchValue]
match(MatchField, MatchValue, RecordKind, ReturnField)
when MatchField =/= ReturnField ->
MatchTuple = list_to_tuple([RecordKind
| [if F =:= MatchField -> '$1'; F =:= ReturnField -> '$2'; true -> '_' end
|| F <- fields(RecordKind)]]),
{MatchTuple, [{'=:=', '$1', MatchValue}], ['$2']}.
match_tuple(MatchField, RecordKind) ->
list_to_tuple([RecordKind
| [if F =:= MatchField -> '$1'; true -> '_' end
|| F <- fields(RecordKind)]]).
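A self-contained sketch of how such a generated match tuple might be used with ets:select/2. The person record, the table and the age filter are made up for illustration. Note that ets:select/2 expects a list of {Head, Guards, Result} tuples, so the single tuple is wrapped in a list here.

```erlang
-module(rec_sketch).
-export([demo/0]).

%% Hypothetical record, standing in for your own definitions.
-record(person, {name, sex, age}).

fields(person) -> record_info(fields, person).

%% Same idea as match_tuple/2 above: '$1' in the field to match on,
%% '_' everywhere else.
match_tuple(MatchField, RecordKind) ->
    list_to_tuple([RecordKind
                   | [if F =:= MatchField -> '$1'; true -> '_' end
                      || F <- fields(RecordKind)]]).

demo() ->
    Tab = ets:new(people, [{keypos, #person.name}]),
    true = ets:insert(Tab, [#person{name = "ann", sex = f, age = 30},
                            #person{name = "bob", sex = m, age = 41}]),
    %% Select whole records whose age field is above 35.
    Spec = [{match_tuple(age, person), [{'>', '$1', 35}], ['$_']}],
    Result = ets:select(Tab, Spec),
    true = ets:delete(Tab),
    Result.
```

rec_sketch:demo() returns a list holding only the "bob" record, which as a plain tuple is {person,"bob",m,41}.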
Ulf Wiger has also written a parse_transform, Exprecs, that more or less does this for you automagically. I've never tried it, but Ulf's code is usually very good.
I solve this problem (in development) by using the parse transform tools to read the .hrl files and generate helper functions.
I wrote a tutorial on it at Trap Exit.
We use it all the time to generate match specs. The beauty is that you don't need to know anything about the current state of the record at development time.
However once you are in production things change! If your record is the basis of a table (as opposed to the definition of a field in a table) then changing an underlying record is more difficult (to put it mildly!).
I'm not sure I fully understand your problem, but I have moved from records to proplists in most cases. They are much more flexible, but also much slower. Using (d)ets, I usually use a few record fields for coarse selection and then check the proplists on the remaining records for detailed selection.

Resources