Related
I'm trying to save offline ejabberd messages in Riak. I earlier had problems connecting to Riak but those are resolved now with the help of this forum. But now, with my limited Erlang / ejabberd understanding, I'm failing to get the ejabberd packet saved as a string and then put on Riak. Essentially, when the offline_message_hook is latched, I take the Packet argument and then want to put a backslash for each double quote, so that I can then take this revised string and save as a string value on Riak. However I seem to be struggling with modifying the incoming packet to replace the " chars with \".
Is this the right approach? Or am I missing something here in my design? My application relies on the xml format, so should I instead parse the packet using the p1_xml module and reconstruct the xml using the extracted data elements before storing it on Riak.
Apologies for the very basic and multiple questions but appreciate if someone can throw some light here!
The code that I use to try and replace the " with \" in the incoming packet is: (it doesnt quite work):
NewPacket = re:replace(Packet, "\"", "\\\"", [{return, list}, global]),
So essentially, I would be passing the NewPacket as a value to my Riak calls.
ejabberd is compliant with Riak and it is already storing packets in Riak. For example, mod_offline does that.
You can look directly in ejabberd code to find how to do that. For example, in mod_offline, here is how ejabberd store the offline message:
store_offline_msg(Host, {User, _}, Msgs, Len, MaxOfflineMsgs,
riak) ->
Count = if MaxOfflineMsgs =/= infinity ->
Len + count_offline_messages(User, Host);
true -> 0
end,
if
Count > MaxOfflineMsgs ->
discard_warn_sender(Msgs);
true ->
lists:foreach(
fun(#offline_msg{us = US,
timestamp = TS} = M) ->
ejabberd_riak:put(M, offline_msg_schema(),
[{i, TS}, {'2i', [{<<"us">>, US}]}])
end, Msgs)
end.
The code of ejabberd_riak:put/3 is:
put(Rec, RecSchema, IndexInfo) ->
Key = encode_key(proplists:get_value(i, IndexInfo, element(2, Rec))),
SecIdxs = [encode_index_key(K, V) ||
{K, V} <- proplists:get_value('2i', IndexInfo, [])],
Table = element(1, Rec),
Value = encode_record(Rec, RecSchema),
case put_raw(Table, Key, Value, SecIdxs) of
ok ->
ok;
{error, _} = Error ->
log_error(Error, put, [{record, Rec},
{index_info, IndexInfo}]),
Error
end.
put_raw(Table, Key, Value, Indexes) ->
Bucket = make_bucket(Table),
Obj = riakc_obj:new(Bucket, Key, Value, "application/x-erlang-term"),
Obj1 = if Indexes /= [] ->
MetaData = dict:store(<<"index">>, Indexes, dict:new()),
riakc_obj:update_metadata(Obj, MetaData);
true ->
Obj
end,
catch riakc_pb_socket:put(get_random_pid(), Obj1).
You should have already the proper API to do what you want in ejabberd regarding Riak packet storage.
I am learning Erlang from a Ruby background and having some difficulty grasping the thought process. The problem I am trying to solve is the following:
I need to make the same request to an api, each time I receive a unique ID in the response which I need to pass into the next request until there is not ID returned. From each response I need to extract certain data and use it for other things as well.
First get the iterator:
ShardIteratorResponse = kinetic:get_shard_iterator(GetShardIteratorPayload).
{ok,[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89D"...>>}]}
Parse out the shard_iterator..
{_, [{_, ShardIterator}]} = ShardIteratorResponse.
Make the request to kinesis for the streams records...
GetRecordsPayload = [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89DETABlwVV"...>>}]
14> RecordsResponse = kinetic:get_records(GetRecordsPayload).
{ok,[{<<"NextShardIterator">>,
<<"AAAAAAAAAAFy3dnTJYkWr3gq0CGo3hkj1t47ccUS10f5nADQXWkBZaJvVgTMcY+nZ9p4AZCdUYVmr3dmygWjcMdugHLQEg6x"...>>},
{<<"Records">>,
[{[{<<"Data">>,<<"Zmlyc3QgcmVjb3JkISEh">>},
{<<"PartitionKey">>,<<"BlanePartitionKey">>},
{<<"SequenceNumber">>,
<<"49545722516689138064543799042897648239478878787235479554">>}]}]}]}
What I am struggling with is how do I write a loop that keeps hitting the kinesis endpoint for that stream until there are no more shard iterators, aka I want all records. Since I can't re-assign the variables as I would in Ruby.
WARNING: My code might be bugged but it's "close". I've never ran it and don't see how last iterator can look like.
I see you are trying to do your job entirely in shell. It's possible but hard. You can use named function and recursion (since release 17.0 it's easier), for example:
F = fun (ShardIteratorPayload) ->
{_, [{_, ShardIterator}]} = kinetic:get_shard_iterator(ShardIteratorPayload),
FunLoop =
fun Loop(<<>>, Accumulator) -> % no clue how last iterator can look like
lists:reverse(Accumulator);
Loop(ShardIterator, Accumulator) ->
{ok, [{_, NextShardIterator}, {<<"Records">>, Records}]} =
kinetic:get_records([{<<"ShardIterator">>, <<ShardIterator/binary>>}]),
Loop(NextShardIterator, [Records | Accumulator])
end,
FunLoop(ShardIterator, [])
end.
AllRecords = F(GetShardIteratorPayload).
But it's too complicated to type in shell...
It's much easier to code it in modules.
A common pattern in erlang is to spawn another process or processes to fetch your data. To keep it simple you can spawn another process by calling spawn or spawn_link but don't bother with links now and use just spawn/3.
Let's compile simple consumer module:
-module(kinetic_simple_consumer).
-export([start/1]).
start(GetShardIteratorPayload) ->
Pid = spawn(kinetic_simple_fetcher, start, [self(), GetShardIteratorPayload]),
consumer_loop(Pid).
consumer_loop(FetcherPid) ->
receive
{FetcherPid, finished} ->
ok;
{FetcherPid, {records, Records}} ->
consume(Records),
consumer_loop(FetcherPid);
UnexpectedMsg ->
io:format("DROPPING:~n~p~n", [UnexpectedMsg]),
consumer_loop(FetcherPid)
end.
consume(Records) ->
io:format("RECEIVED:~n~p~n",[Records]).
And fetcher:
-module(kinetic_simple_fetcher).
-export([start/2]).
start(ConsumerPid, GetShardIteratorPayload) ->
{ok, [ShardIterator]} = kinetic:get_shard_iterator(GetShardIteratorPayload),
fetcher_loop(ConsumerPid, ShardIterator).
fetcher_loop(ConsumerPid, {_, <<>>}) -> % no clue how last iterator can look like
ConsumerPid ! {self(), finished};
fetcher_loop(ConsumerPid, ShardIterator) ->
{ok, [NextShardIterator, {<<"Records">>, Records}]} =
kinetic:get_records(shard_iterator(ShardIterator)),
ConsumerPid ! {self(), {records, Records}},
fetcher_loop(ConsumerPid, NextShardIterator).
shard_iterator({_, ShardIterator}) ->
[{<<"ShardIterator">>, <<ShardIterator/binary>>}].
As you can see both processes can do their job concurrently.
Try from your shell:
kinetic_simple_consumer:start(GetShardIteratorPayload).
Now your see that your shell process turns to consumer and you will have your shell back after fetcher will send {ItsPid, finished}.
Next time instead of
kinetic_simple_consumer:start(GetShardIteratorPayload).
run:
spawn(kinetic_simple_consumer, start, [GetShardIteratorPayload]).
You should play with spawning processes - it's erlang main strength.
In Erlang, you can write loop using tail recursive functions. I don't know the kinetic API, so for simplicity, I just assume, that kinetic:next_iterator/1 return {ok, NextIterator} or {error, Reason} when there are no more shards.
loop({error, Reason}) ->
ok;
loop({ok, Iterator}) ->
do_something_with(Iterator),
Result = kinetic:next_iterator(Iterator),
loop(Result).
You are replacing loop with iteration. First clause deals with case, where there are no more shards left (always start recursion with the end condition). Second clause deals with case, where we got some iterator, we do something with it and call next.
The recursive call is last instruction in the function body, which is called tail recursion. Erlang optimizes such calls - they don't use call stack, so they can run infinitely in constant memory (you will not get anything like "Stack level too deep")
In most applications, its hard to avoid the need to query large amounts of information which a user wants to browse through. This is what led me to cursors. With mnesia, cursors are implemented using qlc:cursor/1 or qlc:cursor/2. After working with them for a while and facing this problem many times,
11> qlc:next_answers(QC,3).
** exception error: {qlc_cursor_pid_no_longer_exists,<0.59.0>}
in function qlc:next_loop/3 (qlc.erl, line 1359)
12>
It has occured to me that the whole cursor thing has to be within one mnesia transaction: executes as a whole once. like this below
E:\>erl
Eshell V5.9 (abort with ^G)
1> mnesia:start().
ok
2> rd(obj,{key,value}).
obj
3> mnesia:create_table(obj,[{attributes,record_info(fields,obj)}]).
{atomic,ok}
4> Write = fun(Obj) -> mnesia:transaction(fun() -> mnesia:write(Obj) end) end.
#Fun<erl_eval.6.111823515>
5> [Write(#obj{key = N,value = N * 2}) || N <- lists:seq(1,100)],ok.
ok
6> mnesia:transaction(fun() ->
QC = cursor_server:cursor(qlc:q([XX || XX <- mnesia:table(obj)])),
Ans = qlc:next_answers(QC,3),
io:format("\n\tAns: ~p~n",[Ans])
end).
Ans: [{obj,20,40},{obj,21,42},{obj,86,172}]
{atomic,ok}
7>
When you attempt to call say: qlc:next_answers/2 outside a mnesia transaction, you will get an exception. Not only just out of the transaction, but even if that method is executed by a DIFFERENT process than the one which created the cursor, problems are bound to happen.
Another intresting finding is that, as soon as you get out of a mnesia transaction, one of the processes which are involved in a mnesia cursor (apparently mnesia spawns a process in the background), exits, causing the cursor to be invalid. Look at this below:
-module(cursor_server).
-compile(export_all).
cursor(Q)->
case mnesia:is_transaction() of
false ->
F = fun(QH)-> qlc:cursor(QH,[]) end,
mnesia:activity(transaction,F,[Q],mnesia_frag);
true -> qlc:cursor(Q,[])
end.
%% --- End of module -------------------------------------------
Then in shell, i use that method:
7> QC = cursor_server:cursor(qlc:q([XX || XX <- mnesia:table(obj)])).
{qlc_cursor,{<0.59.0>,<0.30.0>}}
8> erlang:is_process_alive(list_to_pid("<0.59.0>")).
false
9> erlang:is_process_alive(list_to_pid("<0.30.0>")).
true
10> self().
<0.30.0>
11> qlc:next_answers(QC,3).
** exception error: {qlc_cursor_pid_no_longer_exists,<0.59.0>}
in function qlc:next_loop/3 (qlc.erl, line 1359)
12>
So, this makes it very Extremely hard to build a web application in which a user needs to browse a particular set of results, group by group say: give him/her the first 20, then next 20 e.t.c. This involves, getting the first results, send them to the web page, then wait for the user to click NEXT then ask qlc:cursor/2 for the next 20 and so on. These operations cannot be done, while hanging inside a mnesia transaction !!! The only possible way, is by spawning a process which will hang there, receiving and sending back next answers as messages and receiving the next_answers requests as messages like this:
-define(CURSOR_TIMEOUT,timer:hours(1)).
%% initial request is made here below
request(PageSize)->
Me = self(),
CursorPid = spawn(?MODULE,cursor_pid,[Me,PageSize]),
receive
{initial_answers,Ans} ->
%% find a way of hidding the Cursor Pid
%% in the page so that the subsequent requests
%% come along with it
{Ans,pid_to_list(CursorPid)}
after ?CURSOR_TIMEOUT -> timedout
end.
cursor_pid(ParentPid,PageSize)->
F = fun(Pid,N)->
QC = cursor_server:cursor(qlc:q([XX || XX <- mnesia:table(obj)])),
Ans = qlc:next_answers(QC,N),
Pid ! {initial_answers,Ans},
receive
{From,{next_answers,Num}} ->
From ! {next_answers,qlc:next_answers(QC,Num)},
%% Problem here ! how to loop back
%% check: Erlang Y-Combinator
delete ->
%% it could have died already, so we be careful here !
try qlc:delete_cursor(QC) of
_ -> ok
catch
_:_ -> ok
end,
exit(normal)
after ?CURSOR_TIMEOUT -> exit(normal)
end
end,
mnesia:activity(transaction,F,[ParentPid,PageSize],mnesia_frag).
next_answers(CursorPid,PageSize)->
list_to_pid(CursorPid) ! {self(),{next_answers,PageSize}},
receive
{next_answers,Ans} ->
{Ans,pid_to_list(CursorPid)}
after ?CURSOR_TIMEOUT -> timedout
end.
That would create a more complex problem of managing process exits, tracking / monitoring e.t.c. I wonder why the mnesia implementers didnot see this !
Now, that brings me to my questions. I have been walking around the web for solutions and you could please check out these links from which the questions arise: mnemosyne, Ulf Wiger's Solution to Cursor Problems, AMNESIA - an RDBMS implementation of mnesia.
1. Does anyone have an idea on how to handle mnesia query cursors in a different way from what is documented, and is worth sharing ?
2. What are the reasons why mnesia implemeters decided to force the cursors within a single transaction: even the calls for the next_answers ?
3. Is there anything, from what i have presented, that i do not understand clearly (other than my bad buggy illustration code - please ignore those) ?
4. AMNESIA (on section 4.7 of the link i gave above), has a good implementation of cursors, because the subsequent calls for the next_answers, do not need to be in the same transaction, NOR by the same process. Would you advise anyone to switch from mnesia to amnesia due to this and also, is this library still supported ?
5. Ulf Wiger , (the author of many erlang libraries esp. GPROC), suggests the use of mnesia:select/4. How would i use it to solve cursor problems in a web application ? NOTE: Please do not advise me to leave mnesia and use something else, because i want to use mnesia for this specific problem. I appreciate your time to read through all this question.
The motivation is that transaction grabs (in your case) read locks.
Locks can not be kept outside of transactions.
If you want, you can run it in a dirty_context, but you loose the
transactional properties, i.e. the table may change between invocations.
make_cursor() ->
QD = qlc:sort(mnesia:table(person, [{traverse, select}])),
mnesia:activity(async_dirty, fun() -> qlc:cursor(QD) end, mnesia_frag).
get_next(Cursor) ->
Get = fun() -> qlc:next_answers(Cursor,5) end,
mnesia:activity(async_dirty, Get, mnesia_frag).
del_cursor(Cursor) ->
qlc:delete_cursor(Cursor).
I think this may help you :
use async_dirty instead of transaction
{Record,Cont}=mnesia:activity(async_dirty, fun mnesia:select/4,[md,[{Match_head,[Guard],[Result]}],Limit,read])
then read next Limit number of records:
mnesia:activity(async_dirty, fun mnesia:select/1,[Cont])
full code:
-record(md,{id,name}).
batch_delete(Id,Limit) ->
Match_head = #md{id='$1',name='$2'},
Guard = {'<','$1',Id},
Result = '$_',
{Record,Cont} = mnesia:activity(async_dirty, fun mnesia:select/4,[md,[{Match_head,[Guard],[Result]}],Limit,read]),
delete_next({Record,Cont}).
delete_next('$end_of_table') ->
over;
delete_next({Record,Cont}) ->
delete(Record),
delete_next(mnesia:activity(async_dirty, fun mnesia:select/1,[Cont])).
delete(Records) ->
io:format("delete(~p)~n",[Records]),
F = fun() ->
[ mnesia:delete_object(O) || O <- Records]
end,
mnesia:transaction(F).
remember you can not use cursor out of one transaction
My scenario is as follows -
I have a client C with function foo() which performs some computation.
I'd like a server S, which doesn't know about foo(), to execute this function instead, and send the result back to the client.
I am trying to determine the best way to perform this in Erlang. I am considering:
Hot code swapping - i.e. "upgrade" code in S such that it has the function foo(). Execute and send back to the client.
In a distributed manner where nodes are all appropriately registered, do something along the lines of S ! C:foo() - for the purpose of "sending" the function to process/node S
Are there other methods (or features of the language) that I am not thinking of?
Thanks for the help!
If the computation function is self contained i.e. does not depend on any other modules or functions on the client C, then what you need to do is a fun (Functional Objects). A fun can be sent across the network and applied by a remote machine and in side the fun, the sender has embedded their address and a way of getting the answer back. So the executor may only see a fun to which they may or may not give an argument, yet inside the fun, the sender has forced a method where by the answer will automatically be sent back. The fun is an abstraction of very many tasks within one thing, and it can be moved around as arguments.
At the client, you can have code like this:
%% somewhere in the client
%% client runs on node() == 'client#domain.com'
-module(client).
-compile(export_all).
-define(SERVER,{server,'server#domain.com'}).
give_a_server_a_job(Number)-> ?SERVER ! {build_fun(),Number}.
build_fun()->
FunObject = fun(Param)->
Answer = Param * 20/1000, %% computation here
rpc:call('client#domain.com',client,answer_ready,[Answer])
end,
FunObject.
answer_ready(Answer)->
%%% use Answer for all sorts of funny things....
io:format("\n\tAnswer is here: ~p~n",[Answer]).
The server then has code like this:
%%% somewhere on the server
%%% server runs on node() == 'server#domain.com'
-module(server).
-compile(export_all).
start()-> register(server,spawn(?MODULE,loop,[])).
loop()->
receive
{Fun,Arg} ->
Fun(Arg), %% server executes job
%% job automatically sends answer back
%% to client
loop();
stop -> exit(normal);
_ -> loop()
end.
In this way, the job executor needs not know about how to send back the reply, The job itself comes knowing how it will send back the answer to however sent the job!. I have used this method of sending functional objects across the network in several project, its so cool !!!
#### EDIT #####
If you have a recursive problem, You manipulate recursion using funs. However, you will need at least one library function at the client and/or the server to assist in recursive manipulations. Create a function which should be in the code path of the client as well as the server.
Another option is to dynamically send code from the server to the client and then using the library: Dynamic Compile erlang to load and execute erlang code at the server from the client. Using dynamic compile, here is an example:
1> String = "-module(add).\n -export([add/2]). \n add(A,B) -> A + B. \n".
"-module(add).\n -export([add/2]). \n add(A,B) -> A + B. \n"
2> dynamic_compile:load_from_string(String).
{module,add}
3> add:add(2,5).
7
4>
What we see above is a piece of module code that is compiled and loaded dynamically from a string. If the library enabling this is available at the server and client , then each entity can send code as a string and its loaded and executed dynamically at the other. This code can be unloaded after use. Lets look at the Fibonacci function and how it can be sent and executed at the server:
%% This is the normal Fibonacci code which we are to convert into a string:
-module(fib).
-export([fib/1]).
fib(N) when N == 0 -> 0;
fib(N) when (N < 3) and (N > 0) -> 1;
fib(N) when N > 0 -> fib(N-1) + fib(N-2).
%% In String format, this would now become this piece of code
StringCode = " -module(fib).\n -export([fib/1]). \nfib(N) when N == 0 -> 0;\n fib(N) when (N < 3) and (N > 0) -> 1;\n fib(N) when N > 0 -> fib(N-1) + fib(N-2). \n".
%% Then the client would send this string above to the server and the server would %% dynamically load the code and execute it
send_fib_code(Arg)->
{ServerRegName,ServerNode} ! {string,StringCode,fib,Arg},
ok.
get_answer({fib,of,This,is,That}) ->
io:format("Fibonacci (from server) of ~p is: ~p~n",[This,That]).
%%% At Server
loop(ServerState)->
receive
{string,StringCode,Fib,Arg} when Fib == fib ->
try dynamic_compile:load_from_string(StringCode) of
{module,AnyMod} ->
Answer = AnyMod:fib(Arg),
%%% send answer back to client
%%% should be asynchronously
%%% as the channels are different & not make
%% client wait
rpc:call('client#domain.com',client,get_answer,[{fib,of,Arg,is,Answer}])
catch
_:_ -> error_logger:error_report(["Failed to Dynamic Compile & Load Module from client"])
end,
loop(ServerState);
_ -> loop(ServerState)
end.
That piece of rough code can show you what am trying to say. However, you remember to unload all un-usable dynamic modules. Also you can a have a way in which the server tries to check wether such a module was loaded already before loading it again. I advise that you donot copy and paste the above code. Look at it and understand it and then write your own version that can do the job. success !!!
If you do S ! C:foo() it will compute on client side function foo/1 from module C and send its result to process S. It doesn't seem like what you want to do. You should do something like:
% In client
call(S, M, F, A) ->
S ! {do, {M, F, A}, self()},
receive
{ok, V} -> V
end.
% In server
loop() ->
receive
{do, {M, F, A}, C} ->
C ! {ok, apply(M, F, A)},
loop()
end.
But in real scenario you would have to do a lot more work e.g. mark your client message to perform selective receive (make_ref/0), catch error in server and send it back to client, monitor server from client to catch server down, add some timeout and so. Look how are gen_server:call/2 and rpc:call/4,5 implemented and it is reason why there is OTP to save you from most of gotcha.
So I've been using Erlang for the last eight hours, and I've spent two of those banging my head against the keyboard trying to figure out the exception error my console keeps returning.
I'm writing a dice program to learn erlang. I want it to be able to call from the console through the erlang interpreter. The program accepts a number of dice, and is supposed to generate a list of values. Each value is supposed to be between one and six.
I won't bore you with the dozens of individual micro-changes I made to try and fix the problem (random engineering) but I'll post my code and the error.
The Source:
-module(dice2).
-export([d6/1]).
d6(1) ->
random:uniform(6);
d6(Numdice) ->
Result = [],
d6(Numdice, [Result]).
d6(0, [Finalresult]) ->
{ok, [Finalresult]};
d6(Numdice, [Result]) ->
d6(Numdice - 1, [random:uniform(6) | Result]).
When I run the program from my console like so...
dice2:d6(1).
...I get a random number between one and six like expected.
However when I run the same function with any number higher than one as an argument I get the following exception...
**exception error: no function clause matching dice2:d6(1, [4|3])
... I know I I don't have a function with matching arguments but I don't know how to write a function with variable arguments, and a variable number of arguments.
I tried modifying the function in question like so....
d6(Numdice, [Result]) ->
Newresult = [random:uniform(6) | Result],
d6(Numdice - 1, Newresult).
... but I got essentially the same error. Anyone know what is going on here?
This is basically a type error. When Result is a list, [Result] is a list with one element. E.g., if your function worked, it would always return a list with one element: Finalresult.
This is what happens (using ==> for "reduces to"):
d6(2) ==> %% Result == []
d6(2, [[]]) ==> %% Result == [], let's say random:uniform(6) gives us 3
d6(1, [3]) ==> %% Result == 3, let's say random:uniform(6) gives us 4
d6(0, [4|3]) ==> %% fails, since [Result] can only match one-element lists
Presumably, you don't want [[]] in the first call, and you don't want Result to be 3 in the third call. So this should fix it:
d6(Numdice) -> Result = [], d6(Numdice, Result). %% or just d6(Numdice, []).
d6(0, Finalresult) -> {ok, Finalresult};
d6(Numdice, Result) -> d6(Numdice - 1, [random:uniform(6) | Result]).
Lesson: if a language is dynamically typed, this doesn't mean you can avoid getting the types correct. On the contrary, it means that the compiler won't help you in doing this as much as it could.