How to pass extra arguments to RabbitMQ connection in Erlang client - erlang

I have written some extension modules for eJabberd, most of which pass pieces of information to RabbitMQ for various reasons. All has been fine until we brought the server up in staging, where we have a Rabbit cluster rather than a single box.
In order to utilize the cluster you need to pass the "x-ha-policy" parameter to Rabbit with either the "all" or "nodes" value. This works fine for the Java and Python producers and consumers, but eJabberd (using the Erlang AMQP client, of course) has me a bit stumped. The x-ha-policy parameter needs to be passed into the "client_properties" parameter, which is just the "catchall" for extra parameters.
In Python with pika I can do:
client_params = {"x-ha-policy": "all"}
queue.declare(host, vhost, username, password, arguments=client_params)
and that works. However the doc for the Erlang client says the arguments should be passed in as a list per:
[{binary(), atom(), binary()}]
If it were just [{binary(), binary()}] I could see the relationship with key/value, but I'm not sure what the atom would be there.
Just to be clear, I am a novice Erlang programmer, so this may be a common construct that I am simply not familiar with; no answer would be too obvious.

I found this in amqp_network_connection.erl, which looks like a wrapper to set some default values:
client_properties(UserProperties) ->
    {ok, Vsn} = application:get_key(amqp_client, vsn),
    Default = [{<<"product">>, longstr, <<"RabbitMQ">>},
               {<<"version">>, longstr, list_to_binary(Vsn)},
               {<<"platform">>, longstr, <<"Erlang">>},
               {<<"copyright">>, longstr,
                <<"Copyright (c) 2007-2012 VMware, Inc.">>},
               {<<"information">>, longstr,
                <<"Licensed under the MPL. "
                  "See http://www.rabbitmq.com/">>},
               {<<"capabilities">>, table, ?CLIENT_CAPABILITIES}],
    lists:foldl(fun({K, _, _} = Tuple, Acc) ->
                        lists:keystore(K, 1, Acc, Tuple)
                end, Default, UserProperties).
Apparently the atom describes the value's type in the AMQP field table (longstr, table, bool, and so on). I don't know all of the available types, but there's a good chance that longstr will work in your case.
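For illustration, here is a minimal sketch of passing such a {Key, Type, Value} tuple as a client property when opening the connection; the host is a placeholder and the include assumes the amqp_client headers are on the include path:
%% Sketch only, not taken from the question.
-include_lib("amqp_client/include/amqp_client.hrl").

connect() ->
    Params = #amqp_params_network{
                host              = "rabbit.example.com",  %% placeholder host
                client_properties = [{<<"x-ha-policy">>, longstr, <<"all">>}]},
    {ok, Connection} = amqp_connection:start(Params),
    Connection.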

Related

Erlang: Functions work in shell but not in YAWS

My sole method of debugging (io:format/2) is not working in YAWS. I'm at a loss. My supervisor starts three processes: ETS Manager, YAWS Init, and Ratelimiter. This is successful. I can play around with the rate limiter in the shell, calling the same functions YAWS should be calling. The difference is that the shell behaves as I would expect, while I have no idea what is happening in YAWS.
I do know that if I spam ratelimiter:limit(IP) in the shell it will eventually return true. I can execute the following and they will also return true: ratelimiter:lockout(IP), ratelimiter:blacklist(IP). The limiter is a gen_server.
The functions do the following:
limit/1: Check the ETS table to see whether the counter exceeds the threshold; update the counter. If the counter exceeds the blacklist threshold, make an entry in the mnesia table.
blacklist/1: Check whether an entry exists in the mnesia table; if yes, reset the timer.
lockout/1: Immediately enter the ID into the mnesia table.
In my arg_rewrite_mod module I'm doing some checks to ensure I'm getting the HTTP requests I expect, namely GET, POST, and HEAD. I thought this would be a nice place to also do the rate limiting. Do it as soon as possible in the web server's chain of events.
All the changes I've made to the arg_rewrite module seem to work except using "printf"s and the limiter. I'm new to the language, so I'm not sure whether my mistake is obvious or not.
Skeleton of my arg_rewrite_mod:
-module(arg_preproc).
-export([arg_rewrite/1]).

-include("limiter_def.hrl").
-include_lib("/usr/lib/yaws/include/yaws_api.hrl").

is_blacklisted(ID) ->
    case ratelimiter:blacklist(ID) of
        false -> continue;
        true  -> throw(blacklist)
    end.

is_limited(ID) ->
    case ratelimiter:limit(ID) of
        false -> continue;
        true  -> throw(limit)
    end.

arg_rewrite(A) ->
    Allow = ['GET', 'POST', 'HEAD'],
    try
        {IP, _} = A#arg.client_ip_port,
        ID = IP,
        is_blacklisted(ID),
        io:format("~p ~p ~n", [ID, is_blacklisted(ID)]),
        %% === Allow expected HTTP requests
        HttpReq = (A#arg.req)#http_request.method,
        case lists:member(HttpReq, Allow) of
            true ->
                {_, ReqTgt} = (A#arg.req)#http_request.path,
                PassThru = [".css", ".jpg", ".jpeg", ".png", ".js"],
                %% ... much more ...
            false ->
                is_limited(ID),
                throw(http_method_denied)
        end
    catch
        throw:blacklist -> %% Send back a 429;
        throw:limit -> %% Same but no Retry-After;
        throw:http_method_denied ->
            %% Only thrown experienced
            AllowedReq = string:join([atom_to_list(M) || M <- Allow], ","),
            A#arg{state = #rewrite_response{status = 405,
                      headers = [{header, {"Allow", AllowedReq}},
                                 {header, {connection, "close"}}]}};
        Type:Reason -> {error, {unhandled, {Type, Reason}}}
    end.
I can spam curl -I -X HEAD <<any page>> as fast as I can in a bash shell and all I get is HTTP 200. The ETS table has zero entries as well. Using PUT I get an HTTP 405 as intended. I can run ratelimiter:lockout({MY_IP}) and still get the web page to load in my browser, and an HTTP 200 with curl.
I'm confused. Is it the way I started YAWS?
start() ->
    os:putenv("YAWSHOME", ?HOMEPATH_YAWS),
    code:add_patha(?MODPATH_YAWS),
    ok = case (R = application:start(yaws)) of
             {error, {already_started, _}} -> ok;
             _ -> R
         end,
    {ok, self()}. %% Tell supervisor everything okay in a manner it expects.
I did this because I thought it would be "easier."
When starting Yaws as part of another application, it's important to use its embedding support. One important thing the Yaws embedding startup code does is set the application environment variable embedded to true:
application:set_env(yaws, embedded, true),
Yaws checks this variable in several of its code paths, especially during initialization, in order to avoid assuming that it's running as a stand-alone daemon process.
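As a rough sketch of that embedding path (the doc root, port, supervisor name, and Yaws id below are placeholders), the usual sequence is yaws_api:embedded_start_conf/4 followed by grafting the returned child specs into your own supervision tree and activating the configuration with yaws_api:setconf/2:
start_embedded() ->
    {ok, SCList, GC, ChildSpecs} =
        yaws_api:embedded_start_conf("/var/www/docroot",               %% placeholder doc root
                                     [{port, 8080}, {listen, {127, 0, 0, 1}}],
                                     [],                               %% global conf defaults
                                     "embedded_yaws"),                 %% placeholder Yaws id
    %% Start the returned children under your own supervisor, then activate the config.
    [supervisor:start_child(my_sup, CS) || CS <- ChildSpecs],
    yaws_api:setconf(GC, SCList).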
Regarding rate limiting, rather than using an arg rewriter, you might consider using a shaper. The yaws_shaper module provides a behavior that expects its callback module to implement two functions:
check/1: yaws_shaper calls this to allow the callback module to decide whether to allow the request from the client. It passes client host information as the callback argument. Your shaper callback module returns either the atom allow to allow the request to proceed, or the tuple {deny, Status, Message} where Status is an HTTP status code to return to the client, such as 429 to indicate the client is making too many requests, and Message is any extra HTML to be returned to the client. (It might be nice if Message could include a reply header such as Retry-After as well; this is something I'll consider adding to Yaws.)
update/3: yaws_shaper calls this when the response for a client is ready to be returned. The first argument is the client host information, the second argument is the number of "hits" (the value 1 for each request), and the third argument is the number of bytes being delivered in response to the client's request. Your shaper callback module can return ok from update/3 (Yaws does not use the return value).
A shaper can use this framework to track how many requests each client is making and how much data Yaws is delivering to each client, and use that information to limit or deny particular clients.
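A minimal shaper callback sketch following that description (the module name and the rate-tracking helper are assumptions, not part of Yaws):
-module(my_shaper).
-behaviour(yaws_shaper).
-export([check/1, update/3]).

check(Client) ->
    %% my_rate_table:over_limit/1 is a hypothetical helper that counts hits per client.
    case my_rate_table:over_limit(Client) of
        true  -> {deny, 429, "<p>Too many requests.</p>"};
        false -> allow
    end.

update(_Client, _Hits, _Bytes) ->
    %% Record the hit/byte counts if you need them; Yaws ignores the return value.
    ok.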
And finally, while "printf debugging" works, it's less than ideal, especially in Erlang, which has built-in tracing. You should consider learning the dbg module so you can trace any function you want, see who called it, see what arguments are being passed to it, see what it returns, and so on.
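For example, a quick dbg session to check whether ratelimiter:limit/1 is being called from inside Yaws at all, and with what arguments (assuming the ratelimiter module is loaded on the node):
dbg:tracer(),                    %% start a tracer that prints to the shell
dbg:p(all, c),                   %% trace function calls in all processes
dbg:tpl(ratelimiter, limit, x).  %% show calls to ratelimiter:limit plus return values/exceptions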

Multiple gen_esme sessions

Trust you all are doing well.
I am trying to make multiple sessions to an SMSC using the OSERL application.
Since making an SMPP client means implementing the gen_esme behaviour,
I was wondering whether it is possible to make multiple connections towards the SMSC without writing multiple gen_esme modules?
There are two main strategies for starting multiple processes using the same gen_esme based module:
gen_esme:start_link/4 - named or reference based server
gen_esme:start_link/3 - pid based server
I'm going to be referencing the sample_esme file found under the examples for oserl.
Named Server
Most of the examples from oserl show usage of gen_esme:start_link/4 which in turn is calling gen_server:start_link/4. The ServerName variable for gen_server:start_link/4 has a typespec of {local, Name::atom()} | {global, GlobalName::term()} | {via, Module::atom(), ViaName::term()}.
That means if we change the sample_esme:start_link/0,1,2 functions to look like this:
start_link() ->
    start_link(?MODULE).

start_link(SrvName) ->
    start_link(SrvName, true).

start_link(SrvName, Silent) ->
    Opts = [{rps, 1}, {queue_file, "./sample_esme.dqueue"}],
    gen_esme:start_link({local, SrvName}, ?MODULE, [Silent], Opts).
We can start multiple servers using:
sample_esme:start_link(). %% SrvName = 'sample_esme'
sample_esme:start_link(my_client1). %% SrvName = 'my_client1'
sample_esme:start_link(my_client2). %% SrvName = 'my_client2'
To make our sample_esme module work properly with this named server strategy, most of its calling functions will need to be modified. Let's use sample_esme:rps/0,1 as an example:
rps() ->
    rps(?MODULE).

rps(SrvName) ->
    gen_esme:rps(SrvName).
Now we can call the gen_esme:rps/1 function on any of our running servers:
sample_esme:rps(). %% calls 'sample_esme'
sample_esme:rps(my_client1). %% 'my_client1'
sample_esme:rps(my_client2). %% 'my_client2'
This is similar to how projects like pooler reference members of the pools they create.
pid Server
This is essentially the same as the Named Server strategy, but we're just going to pass the pid of the server around instead of a registered atom.
That means if we change the sample_esme:start_link/0,1 functions to look like this:
start_link() ->
    start_link(true).

start_link(Silent) ->
    Opts = [{rps, 1}, {queue_file, "./sample_esme.dqueue"}],
    gen_esme:start_link(?MODULE, [Silent], Opts).
Notice that all we did was drop the {local, SrvName} argument so it won't register the SrvName atom with the server's pid.
That means we need to capture the pid of each created server:
{ok, Pid0} = sample_esme:start_link().
{ok, Pid1} = sample_esme:start_link().
{ok, Pid2} = sample_esme:start_link().
Using the same sample_esme:rps/0,1 example from Named Server, we will need to remove sample_esme:rps/0 and add a sample_esme:rps/1 function which takes a pid:
rps(SrvPid) ->
    gen_esme:rps(SrvPid).
Now we can call the gen_esme:rps/1 function on any of our running servers:
sample_esme:rps(Pid0).
sample_esme:rps(Pid1).
sample_esme:rps(Pid2).
This is similar to how projects like poolboy reference members of the pools they create.
Recommendations
If you are simply trying to pool connections, I would recommend using a library like pooler or poolboy.
If you have a finite number of specifically named connections that you want to reference by name, I would recommend just having a supervisor with a child spec like the following for each connection:
{my_client1,
 {sample_esme, start_link, [my_client1]},
 permanent, 5000, worker, [sample_esme]}
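For completeness, a sketch of a supervisor init/1 carrying one such child spec per named connection (the second client name and the restart intensity values are assumptions):
init([]) ->
    Children = [{my_client1,
                 {sample_esme, start_link, [my_client1]},
                 permanent, 5000, worker, [sample_esme]},
                {my_client2,
                 {sample_esme, start_link, [my_client2]},
                 permanent, 5000, worker, [sample_esme]}],
    {ok, {{one_for_one, 5, 10}, Children}}.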

Register process under variable username in erlang

I'm trying to write a function in Erlang that will wait in a receive loop and then spawn other processes. It needs to be able to create processes with a given username. Also, if not given a username, it needs to name them "Anonymous1, Anonymous2, ... etc."
Here is what I have so far:
-module(masterNode).

%% ====================================================================
%% API functions
%% ====================================================================
-export([listen/0]).

%% ====================================================================
%% Internal functions
%% ====================================================================
listen() ->
    receive
        {UserNodeName, createNode} ->
            Pid = spawn(userNode, listen, []),
            register(UserNodeName, Pid),
            io:format("User Node Created!~n"),
            listen();
        {createNode} ->
            Pid = spawn(userNode, listen, []),
            register(anonymous, Pid),
            io:format("Anonymous User Node Created!~n"),
            listen();
        _ ->
            io:format("Invalid syntax!.~n")
    end.
I'm running into two problems:
I'm not sure how to register a process using a variable username that the user will provide. Since this name could be different each time, it has to be a variable they pass in, but the register/2 function requires the name to be an atom.
I've found a way to pattern match and create an anonymous user, but I'm not sure how to increment the name each time. Right now it's hard-coded to the atom "anonymous". It seems like in most languages you could create a global variable, increment it, then concatenate it onto the name, but I'm not sure I can do that here.
Any advice on these two problems?
You should also ask yourself the question: why do I have to register these processes? Registering a process with the standard library requires an atom as the name, and it is not a good idea to create atoms at run time, since their number is limited and they are not garbage collected. Maybe you could store the process Pids in a server (in ets, a list, mnesia...) along with the identification of each user. The server would be in charge of "registering" new users, deleting entries when a process dies, and giving back the Pid on a get_user_pid(User) request.
If you really need to register, or simply want to avoid developing this piece of code, the gproc library will do the job for you.
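With gproc the registration key can be any term, so no atoms need to be created at run time. A tiny sketch (the key shape {user, UserNodeName} is just an example):
%% Inside the newly spawned process:
gproc:reg({n, l, {user, UserNodeName}}),       %% n = unique name, l = local scope
%% Later, from any process on the same node:
Pid = gproc:where({n, l, {user, UserNodeName}}).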
For problem one: you can use list_to_atom/1 or a similar function to convert UserNodeName to an atom. Please read this link:
http://www.erlang.org/doc/man/erlang.html#list_to_atom-1
For problem two: I think you can use a macro in Erlang, like this:
-define(NODENAME, anonymous).
register(?NODENAME, Pid),
You can read this link:
http://www1.erlang.org/documentation/doc-4.8.2/doc/extensions/macros.html
Just ask for a string and use list_to_atom/1.
1> Input="user provided string".
"user provided string"
2> list_to_atom(Input).
'user provided string'
You need to preserve a counter in your 'state'. You may add a variable in listen(), or an entry in the process dictionary (e.g. I = get(anonymous_counter), put(anonymous_counter,I+1)) etc. You can then convert this number to a string (e.g. integer_to_list/1), append it to the string "anonymous" and convert to an atom, as before.
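Putting those pieces together, a minimal sketch of the anonymous case, carrying the counter as a loop argument (the naming scheme is an assumption):
listen(N) ->
    receive
        {createNode} ->
            Pid = spawn(userNode, listen, []),
            %% Build "anonymous1", "anonymous2", ... and turn it into an atom.
            Name = list_to_atom("anonymous" ++ integer_to_list(N)),
            register(Name, Pid),
            io:format("~p created!~n", [Name]),
            listen(N + 1);
        _ ->
            io:format("Invalid syntax!~n"),
            listen(N)
    end.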

Pass some arguments to supervisor init function when app starts

I want to pass some arguments to the supervisor's init/1 function, and it is desirable that the application's interface look like this:
redis_pool:start() % start all instances
redis_pool:start(Names) % start only given instances
Here is the application:
-module(redis_pool).
-behaviour(application).
...
start() -> % start without params
    application:ensure_started(?APP_NAME, transient).

start(Names) -> % start with some params
    % I want to pass Names to supervisor init function
    % in order to do that I have to bypass application:ensure_started
    % which is not GOOD :(
    application:load(?APP_NAME),
    case start(normal, [Names]) of
        {ok, _Pid} -> ok;
        {error, {already_started, _Pid}} -> ok
    end.

start(_StartType, StartArgs) ->
    redis_pool_sup:start_link(StartArgs).
Here is the supervisor:
init([]) ->
    {ok, Config} = get_config(),
    Names = proplists:get_keys(Config),
    init([Names]);
init([Names]) ->
    {ok, Config} = get_config(),
    PoolSpecs = lists:map(fun(Name) ->
        PoolName = pool_utils:name_for(Name),
        {[Host, Port, Db], PoolSize} = proplists:get_value(Name, Config),
        PoolArgs = [{name, {local, PoolName}},
                    {worker_module, eredis},
                    {size, PoolSize},
                    {max_overflow, 0}],
        poolboy:child_spec(PoolName, PoolArgs, [Host, Port, Db])
    end, Names),
    {ok, {{one_for_one, 10000, 1}, PoolSpecs}}.
As you can see, the current implementation is ugly and may be buggy. The question is how I can pass some arguments and start the application and supervisor (with the params that were given to start/1)?
One option is to start the application and run the redis pools in two separate phases.
redis_pool:start(),
redis_pool:run([] | Names).
But what if I want to run the supervisor children (the redis pools) when my app starts?
Thank you.
The application callback Module:start/2 is not an API to call in order to start the application. It is called when the application is started by application:start/1,2. This means that overloading it to provide differing parameters is probably the wrong thing to do.
In particular, application:start will be called directly if someone adds your application as a dependency of theirs (in the foo.app file). At this point, they have no control over the parameters, since they come from your .app file, in the {mod, {Mod, Args}} term.
Some possible solutions:
Application Configuration File
Require that the parameters be in the application configuration file; you can retrieve them with application:get_env/2,3.
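For instance, assuming a hypothetical pool_names key in the application environment (set in the .app file env section or in sys.config), the application callback could read it itself:
start(_StartType, _StartArgs) ->
    %% pool_names is a made-up env key; [] is the default when it is unset.
    Names = application:get_env(redis_pool, pool_names, []),
    redis_pool_sup:start_link([Names]).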
Don't start a supervisor
This means one of two things: becoming a library application (removing the {mod, Mod} term from your .app file) -- you don't need an application behaviour; or starting a dummy supervisor that does nothing.
Then, when someone wants to use your library, they can call an API to create the pool supervisor, and graft it into their supervision tree. This is what poolboy does with poolboy:child_spec.
Or, your application-level supervisor can be a normal supervisor, with no children by default, and you can provide an API to start children of that, via supervisor:start_child. This is (more or less) what cowboy does.
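A sketch of that second variant, with an initially empty supervisor plus a small API wrapper around supervisor:start_child/2 (the worker module name is an assumption, and the supervisor is assumed to be registered under its module name):
init([]) ->
    %% No static children; pools are added on demand.
    {ok, {{one_for_one, 10, 3600}, []}}.

start_pool(Name) ->
    Spec = {Name,
            {redis_pool_worker, start_link, [Name]},      %% hypothetical worker module
            permanent, 5000, worker, [redis_pool_worker]},
    supervisor:start_child(?MODULE, Spec).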
You can pass arguments in the AppDescr argument to application:load/1 (though it's a mighty big tuple already...) as {mod, {Module, StartArgs}} according to the docs ("according to the docs" as in, I don't recall doing it this way myself, ever: http://www.erlang.org/doc/apps/kernel/application.html#load-1).
application:load({application, some_app, [{mod, {Module, [Stuff]}}]})
Without knowing anything about the internals of the application you're starting, it's hard to say which way is best, but a common way to do this is to start up the application and then send it a message containing the data you want it to know.
You could make receipt of that message tell the application to go through a configuration assertion procedure, so that the same message you send on startup is also the same sort of thing you would send to reconfigure it on the fly. I find this more useful than one-shotting arguments on startup.
In any case, it is usually better to think in terms of starting something, then asking it to do something for you, than to try telling it everything in init parameters. This can be as simple as having it start up and wait for some message that will tell the listener to then spin up the supervisor the way you're trying to here -- isolated one step from the application inclusion issues RL mentioned in his answer.
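One way to sketch that "start it, then tell it what to do" pattern is a small gen_server owning the configuration step (the server name, message shape, and helper function are all assumptions):
configure(Names) ->
    gen_server:call(redis_pool_mgr, {configure, Names}).  %% redis_pool_mgr is hypothetical

handle_call({configure, Names}, _From, _State) ->
    ok = redis_pool_sup:start_pools(Names),               %% hypothetical helper that adds pool children
    {reply, ok, Names};                                   %% keep the configured names as the new state
handle_call(_Other, _From, State) ->
    {reply, {error, unknown_request}, State}.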

What kind of types can be sent on an Erlang message?

Mainly I want to know if I can send a function in a message in a distributed Erlang setup.
On Machine 1:
F1 = fun() -> hey end,
gen_server:call(on_other_machine, F1)
On Machine 2:
handle_call(Function, _From, State) ->
    {reply, Function(), State}.
Does it make sense?
Here's an interesting article about "passing fun's to other Erlang nodes". To summarize it briefly:
[...] As you might know, Erlang distribution works by sending the binary encoding of terms; and so sending a fun is also essentially done by encoding it using erlang:term_to_binary/1; passing the resulting binary to another node, and then decoding it again using erlang:binary_to_term/1. [...] This is pretty obvious for most data types; but how does it work for function objects?
When you encode a fun, what is encoded is just a reference to the function, not the function implementation. [...]
[...] the definition of the function is not passed along; just exactly enough information to recreate the fun at an other node if the module is there.
[...] If the module containing the fun has not yet been loaded, and the target node is running in interactive mode; then the module is attempted loaded using the regular module loading mechanism (contained in the module error_handler); and then it tries to see if a fun with the given id is available in said module. However, this only happens lazily when you try to apply the function.
[...] If you never attempt to apply the function, then nothing bad happens. The fun can be passed to another node (which has the module/fun in question) and then everybody is happy.
Maybe the target node has a module loaded of said name, but perhaps in a different version; which would then be very likely to have a different MD5 checksum, then you get the error badfun if you try to apply it.
I would suggest you read the whole article, because it's extremely interesting.
You can send any valid Erlang term, although you have to be careful when sending funs: any fun referencing a function inside a module needs that module to exist on the target node in order to work:
(first@host)9> rpc:call(second@host, erlang, apply,
                        [fun io:format/1, ["Hey!~n"]]).
Hey!
ok
(first@host)10> mymodule:func("Hey!~n").
5
(first@host)11> rpc:call(second@host, erlang, apply,
                         [fun mymodule:func/1, ["Hey!~n"]]).
{badrpc,{'EXIT',{undef,[{mymodule,func,["Hey!~n"]},
                        {rpc,'-handle_call_call/6-fun-0-',5}]}}}
In this example, io exists on both nodes and it works to send a function from io as a fun. However, mymodule exists only on the first node and the fun generates an undef exception when called on the other node.
As for anonymous functions, it seems they can be sent and work as expected.
t1@localhost:
(t1@localhost)7> register(shell, self()).
true
(t1@localhost)10> A = me, receive Fun when is_function(Fun) -> Fun(A) end.
hello me you
ok
t2@localhost:
(t2@localhost)11> B = you.
you
(t2@localhost)12> Fn2 = fun (A) -> io:format("hello ~p ~p~n", [A, B]) end.
#Fun<erl_eval.6.54118792>
(t2@localhost)13> {shell, 't1@localhost'} ! Fn2.
I am adding coverage logic to an app built on riak-core, and the merge of results gathered can be tricky if anonymous functions cannot be used in messages.
Also check out riak_kv/src/riak_kv_coverage_filter.erl; riak_kv might be using it to filter results, I guess.
