Mapreduce with Riak - erlang

Does anyone have example MapReduce code for Riak that can be run on a single Riak node?

cd ~/riak
erl -name zed@127.0.0.1 -setcookie riak -pa apps/riak/ebin
In the shell:
# connect to the server
> {ok, Client} = riak:client_connect('riak@127.0.0.1').
{ok,{riak_client,'riak@127.0.0.1',<<6,201,208,64>>}}
# create and insert objects
> Client:put(riak_object:new(<<"groceries">>, <<"mine">>, ["eggs", "bacons"]), 1).
ok
> Client:put(riak_object:new(<<"groceries">>, <<"yours">>, ["eggs", "sausages"]), 1).
ok
# create Map and Reduce functions
> Count = fun(G, 'undefined', 'none') ->
      [dict:from_list([{I, 1} || I <- riak_object:get_value(G)])]
  end.
#Fun<erl_eval.18.105910772>
> Merge = fun(Gcounts, 'none') ->
      [lists:foldl(fun(G, Acc) ->
                       dict:merge(fun(_, X, Y) -> X+Y end, G, Acc)
                   end, dict:new(), Gcounts)]
  end.
#Fun<erl_eval.12.113037538>
# do the map-reduce
> {ok, [R]} = Client:mapred([{<<"groceries">>, <<"mine">>},
                             {<<"groceries">>, <<"yours">>}],
                            [{'map', {'qfun', Count}, 'none', false},
                             {'reduce', {'qfun', Merge}, 'none', true}]).
{ok,[{dict,...
> dict:to_list(R).
[{"eggs",2},{"sausages",1},{"bacons",1}]
For the server I used absolutely default config:
$ hg clone http://hg.basho.com/riak/
$ cd riak
$ ./rebar compile generate
$ cd rel
$ ./riak/bin/riak start
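The counting and merging logic in the Count and Merge funs above can be tried without a Riak node. The following is a minimal sketch, assuming a hypothetical module name grocery_count; it only mirrors the dict handling and does not touch the Riak client API.

```erlang
-module(grocery_count).
-export([count/1, merge/1, demo/0]).

%% Map step: turn one grocery list into a dict of {Item, 1} pairs.
count(Items) ->
    dict:from_list([{I, 1} || I <- Items]).

%% Reduce step: merge several dicts, adding the counts of shared keys.
merge(Dicts) ->
    lists:foldl(fun(D, Acc) ->
                        dict:merge(fun(_Key, X, Y) -> X + Y end, D, Acc)
                end, dict:new(), Dicts).

%% Same data as the two stored objects; the result is sorted because
%% dict:to_list/1 does not guarantee any particular order.
demo() ->
    Mine = count(["eggs", "bacons"]),
    Yours = count(["eggs", "sausages"]),
    lists:sort(dict:to_list(merge([Mine, Yours]))).
```

Calling grocery_count:demo() returns [{"bacons",1},{"eggs",2},{"sausages",1}].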

Here's an example of how to do a MapReduce using JavaScript functions.


Change Erlang file handle limit?

I'm running into trouble with an Erlang OTP + Cowboy app that does not allow me to open enough files simultaneously.
How do I change the number of open file handles allowed in the BEAM?
Potentially, I'll need about 500 small text files open at the same time, but it appears that the file limit is 224. I've got the value of 224 from this little test program:
-module(test_fds).
-export([count/0]).
count() -> count(1, []).
count(N, Fds) ->
    case file:open(integer_to_list(N), [write]) of
        {ok, F} ->
            count(N+1, [F| Fds]);
        {error, Err} ->
            [ file:close(F) || F <- Fds ],
            delete(N-1),
            {Err, N}
    end.
delete(0) -> ok;
delete(N) ->
    case file:delete(integer_to_list(N)) of
        ok -> ok;
        {error, _} -> meh
    end,
    delete(N-1).
This gives
$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [kernel-poll:false]
Eshell V9.2 (abort with ^G)
1> c(test_fds).
{ok,test_fds}
2> test_fds:count().
{emfile,224}
3>
This seems to be an Erlang problem rather than a Mac OSX problem since from the command line, I get:
$ sysctl -h kern.maxfiles
kern.maxfiles: 49,152
$ sysctl -h kern.maxfilesperproc
kern.maxfilesperproc: 24,576
The number of open file descriptors is most likely being limited by your shell. You can increase it by running ulimit -n 1000 (or more) in your shell before invoking erl. On my system, the default value was 7168 and your script could open 7135 files before returning emfile. Here's the output of me running your script with different ulimit values:
$ ulimit -n 64; erl -noshell -eval 'io:format("~p~n", [test_fds:count()]), erlang:halt()'
{emfile,32}
$ ulimit -n 128; erl -noshell -eval 'io:format("~p~n", [test_fds:count()]), erlang:halt()'
{emfile,96}
$ ulimit -n 256; erl -noshell -eval 'io:format("~p~n", [test_fds:count()]), erlang:halt()'
{emfile,224}
$ ulimit -n 512; erl -noshell -eval 'io:format("~p~n", [test_fds:count()]), erlang:halt()'
{emfile,480}
erl most likely opens 32 file descriptors of its own before it starts evaluating our code, which explains the constant difference of 32 between the ulimit value and the reported count.
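To see which limit a running node actually inherited, one option is to ask a child shell from inside Erlang. This is a rough sketch, assuming a Unix-like system where os:cmd/1 runs the command through sh; the module name fd_limit is made up for illustration.

```erlang
-module(fd_limit).
-export([soft_limit/0]).

%% Ask a child shell for its soft open-file limit. The child inherits
%% the limit of the Erlang VM, which inherited it from the shell that
%% started erl.
soft_limit() ->
    Out = string:trim(os:cmd("ulimit -n")),
    case string:to_integer(Out) of
        {N, []} -> {ok, N};
        _ -> {unknown, Out}   % e.g. "unlimited" or unexpected output
    end.
```

If this reports a value near 256, raising the limit with ulimit -n before starting erl should raise the number of files test_fds:count() can open.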

Communication between 2 Mac computers with Erlang

I followed the book "Programming Erlang — Joe Armstrong" to try to build the communication between 2 Mac computers with Erlang (Chap 14):
% file: kvs.erl
-module(kvs).
-export([start/0, store/2, lookup/1]).
start() -> register(kvs, spawn(fun() -> loop() end)).
store(Key, Value) -> rpc({store, Key, Value}).
lookup(Key) -> rpc({lookup, Key}).
rpc(Q) ->
    kvs ! {self(), Q},
    receive
        {kvs, Reply} ->
            Reply
    end.
loop() ->
    receive
        {From, {store, Key, Value}} ->
            put(Key, {ok, Value}),
            From ! {kvs, true},
            loop();
        {From, {lookup, Key}} ->
            From ! {kvs, get(Key)},
            loop()
    end.
Set up Mac 1 (Mac Pro) and run an Erlang server:
$ sudo hostname this.is.macpro.com
$ hostname
this.is.macpro.com
$ ipconfig getifaddr en2
aaa.bbb.ccc.209
$ erl -name server -setcookie abcxyz
(server@this.is.macpro.com)> c("kvs.erl").
{ok,kvs}
(server@this.is.macpro.com)> kvs:start().
true
(server@this.is.macpro.com)> kvs:store(hello, world).
true
(server@this.is.macpro.com)> kvs:lookup(hello).
{ok,world}
I tried using both the IP address and the hostname to make an RPC from another Mac, but got {badrpc, nodedown}.
Set up Mac 2 (MacBook Pro) and try to call Mac 1:
$ sudo hostname this.is.macbookpro.com
$ hostname
this.is.macbookpro.com
$ ipconfig getifaddr en2
aaa.bbb.ccc.211 # different IP
$ erl -name client -setcookie abcxyz
% try using the hostname of Mac 1 but failed
(client@this.is.macbookpro.com)> rpc:call('server@this.is.macpro.com', kvs, lookup, [hello]).
{badrpc, nodedown}
% try using the IP address of Mac 1 but failed
(client@this.is.macbookpro.com)> rpc:call('server@aaa.bbb.ccc.209', kvs, lookup, [hello]).
{badrpc, nodedown}
How do I set up my Mac computers to make them available for RPC with Erlang?
When using -name, you should provide the full node name. The syntax you are using is for -sname. Try this:
erl -name server@this.is.macpro.com -setcookie "abcxyz"
erl -name client@this.is.macbookpro.com -setcookie "abcxyz"
You can also specify an IP address after the @ in both cases.
Then from one node, connect to the other node:
net_kernel:connect_node('client@this.is.macbookpro.com').
This should return true. If it returns false, the nodes are not connected. You can verify with nodes().
(joe@teves-MacBook-Pro.local)3> net_kernel:connect_node('steve@Steves-MacBook-Pro.local').
true
(joe@teves-MacBook-Pro.local)4> nodes().
['steve@Steves-MacBook-Pro.local']
If this does not fix it, then you can check epmd on both systems to ensure they are registered.
epmd -names
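The connect-then-call steps above can be wrapped in a small helper. This is a sketch only, assuming the kvs module from the question is loaded on the remote node; the module name node_check is hypothetical. It uses net_adm:ping/1, which returns pong when the node is reachable and pang otherwise.

```erlang
-module(node_check).
-export([lookup_remote/2]).

%% Check that Node is reachable before making the remote call, so a
%% connection problem is reported separately from an RPC failure.
lookup_remote(Node, Key) ->
    case net_adm:ping(Node) of
        pong -> rpc:call(Node, kvs, lookup, [Key]);
        pang -> {error, {nodedown, Node}}
    end.
```

For example, node_check:lookup_remote('server@this.is.macpro.com', hello) should return {ok,world} once both nodes were started with full -name values and the same cookie.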

How can I pass command-line arguments to an Erlang program?

I'm working on an Erlang program. How can I pass command-line parameters to it?
Program file:
-module(program).
-export([main/0]).
main() ->
io:fwrite("Hello, world!\n").
Compilation Command:
erlc program.erl
Execution command:
erl -noshell -s program main -s init stop
I need to pass arguments through the execution command and access them inside the program's main function.
$ cat program.erl
-module(program).
-export([main/1]).
main(Args) ->
io:format("Args: ~p\n", [Args]).
$ erlc program.erl
$ erl -noshell -s program main foo bar -s init stop
Args: [foo,bar]
$ erl -noshell -run program main foo bar -s init stop
Args: ["foo","bar"]
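This is documented in the erl man page. With -s the arguments are passed as a list of atoms; with -run they are passed as a list of strings.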
I would recommend using escript for this purpose because it has a simpler invocation.
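As a rough illustration of the escript route, the same program could look like the sketch below. The file name program is an assumption, and the file has to be made executable first (chmod +x program).

```erlang
#!/usr/bin/env escript
%% -*- erlang -*-

%% escript calls main/1 with the command-line arguments as a list of
%% strings, so no -s or -run flags are needed.
main(Args) ->
    io:format("Args: ~p~n", [Args]).
```

Running ./program foo bar prints Args: ["foo","bar"].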
These are not really commandline-parameters, but if you want to use environment-variables, the os-module might help. os:getenv() gives you a list of all environment variables. os:getenv(Var) gives you the value of the variable as a string, or returns false if Var is not an environment-variable.
These env-variables should be set before you start the application.
I always use an idiom like this to start (on a bash-shell):
export PORT=8080 && erl -noshell -s program main
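Inside the program, the variable could then be read with a fallback. This is a minimal sketch; the module name env_args and the choice to fall back on a default for non-numeric values are assumptions, not part of the original answer.

```erlang
-module(env_args).
-export([port/1]).

%% Read PORT from the environment. Return Default when the variable is
%% unset or does not hold a plain integer.
port(Default) ->
    case os:getenv("PORT") of
        false ->
            Default;
        Str ->
            case string:to_integer(Str) of
                {N, []} -> N;
                _ -> Default
            end
    end.
```

With the export line above, env_args:port(8000) returns 8080; without it, the call returns 8000.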
If you want "named" arguments with possible default values, you can use a command line like this (from a toy application I made):
erl -pa "./ebin" -s lavie -noshell -detach -width 100 -height 80 -zoom 6
lavie:start does nothing more than starting an erlang application:
-module (lavie).
-export ([start/0]).
start() -> application:start(lavie).
which in turn starts the application, where I defined default values for the parameters. Here is the app.src (rebar build):
{application, lavie,
[
{description, "Le jeu de la vie selon Conway"},
{vsn, "1.3.0"},
{registered, [lavie_sup,lavie_wx,lavie_fsm,lavie_server,rule_wx]},
{applications, [
kernel,
stdlib
]},
{mod, { lavie_app, [200,50,2]}}, %% with default parameters
{env, []}
]}.
Then, in the application code, you can use init:get_argument/1 to get the value associated with each option if it was given on the command line.
-module(lavie_app).
-behaviour(application).
%% Application callbacks
-export([start/2, stop/1]).
%% ===================================================================
%% Application callbacks
%% ===================================================================
start(_StartType, [W1,H1,Z1]) ->
    W = get(width,W1),
    H = get(height,H1),
    Z = get(zoom,Z1),
    lavie_sup:start_link([W,H,Z]).
stop(_State) ->
    % init:stop().
    ok.
get(Name,Def) ->
    case init:get_argument(Name) of
        {ok,[[L]]} -> list_to_integer(L);
        _ -> Def
    end.
Definitely more complex than @Hynek's proposal, but it gives you more flexibility, and I find the command line less opaque.

Compiling LFE files with make

Is there a standard way of compiling .lfe source files from a make rule in an OTP project?
According to the docs, I'm supposed to use lfe_comp:file/1, which doesn't help much if I want to compile multiple such files in an OTP application (where I'm supposed to keep the source files in src, but the binaries in ebin).
Ideally, I'd be able to do something like
erlc -Wf -o ebin src/*lfe
But there doesn't seem to be lfe support in erlc. The best solution I can think of off the top of my head is
find src/*lfe -exec erl -s lfe_comp file {} -s init stop \;
mv src/*beam ebin/
but that seems inelegant. Any better ideas?
On suggestion from rvirding, here's a first stab at lfec that does what I want above (and pretty much nothing else). I'd invoke it from a Makefile with ./lfec -o ebin src/*lfe.
#!/usr/bin/env escript
%% -*- erlang -*-
%%! -smp enable -sname lfec -mnesia debug verbose
main(Arguments) ->
    try
        {Opts, Args} = parse_opts(Arguments),
        case get_opt("-o", Opts) of
            false ->
                lists:map(fun lfe_comp:file/1, Args);
            Path ->
                lists:map(fun (Arg) -> lfe_comp:file(Arg, [{outdir, Path}]) end,
                          Args)
        end
    catch
        _:_ -> usage()
    end;
main(_) -> usage().
get_opt(Target, Opts) ->
    case lists:keyfind(Target, 1, Opts) of
        false -> false;
        {_} -> true;
        {_, Setting} -> Setting
    end.
parse_opts(Args) -> parse_opts(Args, []).
parse_opts(["-o", TargetDir | Rest], Opts) ->
    parse_opts(Rest, [{"-o", TargetDir} | Opts]);
parse_opts(Args, Opts) -> {Opts, Args}.
usage() ->
    io:format("usage:\n"),
    io:format("-o [TargetDir] -- output files to specified directory\n"),
    halt(1).
Not really. LFE is not supported by OTP so erlc does not know about .lfe files. And as far as I know there is no way to "open up" erlc and dynamically add information how to process files. An alternative would be to write an lfec script for this. I will think about it.
Just as a matter of interest, what are you using LFE for?

mnesia save out info

How can I save the output of mnesia:info()?
I use a remote shell inside a Unix screen session and can't scroll the window.
Here's a function that you can put in the user_default.erl module on the remote node:
out(Fun, File) ->
    G = erlang:group_leader(),
    {ok, FD} = file:open(File, [write]),
    erlang:group_leader(FD, self()),
    Fun(),
    erlang:group_leader(G, self()),
    file:close(FD).
Then, you can do the following (after recompiling and loading user_default):
1> out(fun () -> mnesia:info() end, "mnesia_info.txt").
Or, just cut and paste the following into the shell:
F = fun (Fun, File) ->
        G = erlang:group_leader(),
        {ok, FD} = file:open(File, [write]),
        erlang:group_leader(FD, self()),
        Fun(),
        erlang:group_leader(G, self()),
        file:close(FD)
    end,
F(fun () -> mnesia:info() end, "mnesia_info.txt").
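One weakness of the group-leader swap is that a crash inside Fun leaves the shell's output pointed at the file. A possible variation, sketched here under the assumption of a standalone module named capture (hypothetical), restores the original group leader in an after clause.

```erlang
-module(capture).
-export([to_file/2]).

%% Run Fun with this process's group leader temporarily replaced by a
%% file, so everything Fun prints through the io module lands in File.
%% The after clause restores the old group leader even if Fun crashes.
to_file(Fun, File) ->
    Old = erlang:group_leader(),
    {ok, Fd} = file:open(File, [write]),
    erlang:group_leader(Fd, self()),
    try
        Fun()
    after
        erlang:group_leader(Old, self()),
        file:close(Fd)
    end.
```

Usage would be capture:to_file(fun () -> mnesia:info() end, "mnesia_info.txt"), which returns whatever Fun returns.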
If you are at a terminal without scrolling (if you are on an xterm and see no scrollbar, simply switch it on), a very useful tool is screen: it provides virtual VT100 terminals, lets you switch between them, and lets you detach and come back later (nice for long-running programs on remote servers that need occasional interaction).
And you can log transcripts to a file and scroll in the output of the virtual terminal.
If you are on a Unix like System you will probably be able to just install a pre-built package, if all else fails you can always pick up the source and build it yourself.
Also look at this article for other solutions.
If you are not able to install screen on the system, a simple but not very comfortable hack that only uses Unix built-in stuff is:
Start erlang shell with tee(1) to redirect the output:
$ erl | tee output.log
Eshell V5.7.5 (abort with ^G)
1> mnesia:info().
===> System info in version {mnesia_not_loaded,nonode@nohost,
{1301,742014,571300}}, debug level = none <===
opt_disc. Directory "/usr/home/peer/Mnesia.nonode@nohost" is NOT used.
use fallback at restart = false
running db nodes = []
stopped db nodes = [nonode@nohost]
ok
2>
It's a bit hard to get out of the shell (you probably have to type ^D to end the input file), but then you have the tty output in the file:
$ cat output.log
Eshell V5.7.5 (abort with ^G)
1> ===> System info in version {mnesia_not_loaded,nonode@nohost,
{1301,742335,572797}}, debug level = none <===
...
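I believe you can't. See mnesia:system_info(all).
Convert to a string (mnesia:info() itself only prints and returns ok, so format the result of mnesia:system_info(all) instead):
S = io_lib:format("~p~n", [mnesia:system_info(all)]).
Then write it to disk.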
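The write-to-disk step might look like the following sketch. The module name term_dump is hypothetical; file:write_file/2 accepts the iolist that io_lib:format/2 returns, so no flattening is needed.

```erlang
-module(term_dump).
-export([dump/2]).

%% Pretty-print Term followed by a full stop and write it to File.
%% The trailing "." lets file:consult/1 read the term back, as long
%% as the term contains no pids, ports, or references.
dump(Term, File) ->
    file:write_file(File, io_lib:format("~p.~n", [Term])).
```

For this to capture anything useful, pass a call that returns data, for example term_dump:dump(mnesia:system_info(all), "mnesia_info.txt").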
