Restricting number of function iterations - erlang

I'm writing a code in Erlang which suppose to generate a random number for a random amount of time and add each number to a list. I managed a function which can generate random numbers and i kinda managed a method for adding it to a list,but my main problem is restricting the number of iterations of the function. I like the function to produce several numbers and add them to the list and then kill that process or something like that.
Here is my code so far:
generator(L1)->
random:seed(now()),
A = random:uniform(100),
L2 = lists:append(L1,A),
generator(L2),
producer(B,L) ->
receive
{last_element} ->
consumer ! {lists:droplast(B)}
end
consumer()->
timer:send_after(random:uniform(1000),producer,{last_element,self()}),
receive
{Answer, Producer_PID} ->
io:format("the last item is:~w~n",[Answer])
end,
consumer().
start() ->
register(consumer,spawn(lis,consumer,[])),
register(producer,spawn(lis,producer,[])),
register(generator,spawn(lis,generator,[random:uniform(10)])).
I know it's a little bit sloppy and incomplete but that's not the case.

First, you should use rand to generate random numbers instead of random, it is an improved module.
In addition, when using rand:uniform/1 you won't need to change the seed every time you run your program. From erlang documentation:
If a process calls uniform/0 or uniform/1 without setting a seed
first, seed/1 is called automatically with the default algorithm and
creates a non-constant seed.
Finally, in order to create a list of random numbers, take a look at How to create a list of 1000 random number in erlang.
If I conclude all this, you can just do:
[rand:uniform(100) || _ <- lists:seq(1, 1000)].

There have some issue in your code:
timer:send_after(random:uniform(1000),producer,{last_element,self()}),, you send {last_element,self()} to producer process, but in producer, you just receive {last_element}, these messages are not matched.
you can change
producer(B,L) ->
receive
{last_element} ->
consumer ! {lists:droplast(B)}
end.
to
producer(B,L) ->
receive
{last_element, FromPid} ->
FromPid! {lists:droplast(B)}
end.
the same reason for consumer ! {lists:droplast(B)} and {Answer, Producer_PID} ->.

Related

In Erlang, passing a message to all elements of a list of pids

I am trying to build a very simple barrier-synchronization server, where the server is initially fed a number of processes that will be communicating with it. When a process is done, it receives a message with that process' Pid, and it keeps a list of every process to do so. When the barrier reaches zero (all processes have sent messages), my server needs to send a message to each of these (I am using [Pid | ProcList] as my list of pids).
I have tried using a helper function to no avail, list comprehensions keep me in an infinite loop, and as such I am looking into how to use lists:foreach to take care of this.
I am fairly new to functional programming, but from what I understand, this foreach needs to take in the list as well as a lambda-calculus function to send a message to each node in the list. Due to the infix nature of "!", I have yet to find a way to do this without causing syntax errors.
How you've made infinite loop in list comprehension? I must say, that's quite challenging. Try this:
Message = % broadcast message goes here
ListOfPids = % list of recipients
[Pid ! Message || Pid <- ListOfPids].
If you want to use foreach, than it takes one argument function as first argument, so need to wrap send first, as it is two argument function.
Message = % broadcast message goes here
ListOfPids = % list of recipients
Fun = fun (Pid) -> Pid ! Message end,
lists:foreach(Fun, ListOfPids).

Extract values from list of tuples continuously in Erlang

I am learning Erlang from a Ruby background and having some difficulty grasping the thought process. The problem I am trying to solve is the following:
I need to make the same request to an api, each time I receive a unique ID in the response which I need to pass into the next request until there is not ID returned. From each response I need to extract certain data and use it for other things as well.
First get the iterator:
ShardIteratorResponse = kinetic:get_shard_iterator(GetShardIteratorPayload).
{ok,[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89D"...>>}]}
Parse out the shard_iterator..
{_, [{_, ShardIterator}]} = ShardIteratorResponse.
Make the request to kinesis for the streams records...
GetRecordsPayload = [{<<"ShardIterator">>, <<ShardIterator/binary>>}].
[{<<"ShardIterator">>,
<<"AAAAAAAAAAGU+v0fDvpmu/02z5Q5OJZhPo/tU7fjftFF/H9M7J9niRJB8MIZiB9E1ntZGL90dIj3TW6MUWMUX67NEj4GO89DETABlwVV"...>>}]
14> RecordsResponse = kinetic:get_records(GetRecordsPayload).
{ok,[{<<"NextShardIterator">>,
<<"AAAAAAAAAAFy3dnTJYkWr3gq0CGo3hkj1t47ccUS10f5nADQXWkBZaJvVgTMcY+nZ9p4AZCdUYVmr3dmygWjcMdugHLQEg6x"...>>},
{<<"Records">>,
[{[{<<"Data">>,<<"Zmlyc3QgcmVjb3JkISEh">>},
{<<"PartitionKey">>,<<"BlanePartitionKey">>},
{<<"SequenceNumber">>,
<<"49545722516689138064543799042897648239478878787235479554">>}]}]}]}
What I am struggling with is how do I write a loop that keeps hitting the kinesis endpoint for that stream until there are no more shard iterators, aka I want all records. Since I can't re-assign the variables as I would in Ruby.
WARNING: My code might be bugged but it's "close". I've never ran it and don't see how last iterator can look like.
I see you are trying to do your job entirely in shell. It's possible but hard. You can use named function and recursion (since release 17.0 it's easier), for example:
F = fun (ShardIteratorPayload) ->
{_, [{_, ShardIterator}]} = kinetic:get_shard_iterator(ShardIteratorPayload),
FunLoop =
fun Loop(<<>>, Accumulator) -> % no clue how last iterator can look like
lists:reverse(Accumulator);
Loop(ShardIterator, Accumulator) ->
{ok, [{_, NextShardIterator}, {<<"Records">>, Records}]} =
kinetic:get_records([{<<"ShardIterator">>, <<ShardIterator/binary>>}]),
Loop(NextShardIterator, [Records | Accumulator])
end,
FunLoop(ShardIterator, [])
end.
AllRecords = F(GetShardIteratorPayload).
But it's too complicated to type in shell...
It's much easier to code it in modules.
A common pattern in erlang is to spawn another process or processes to fetch your data. To keep it simple you can spawn another process by calling spawn or spawn_link but don't bother with links now and use just spawn/3.
Let's compile simple consumer module:
-module(kinetic_simple_consumer).
-export([start/1]).
start(GetShardIteratorPayload) ->
Pid = spawn(kinetic_simple_fetcher, start, [self(), GetShardIteratorPayload]),
consumer_loop(Pid).
consumer_loop(FetcherPid) ->
receive
{FetcherPid, finished} ->
ok;
{FetcherPid, {records, Records}} ->
consume(Records),
consumer_loop(FetcherPid);
UnexpectedMsg ->
io:format("DROPPING:~n~p~n", [UnexpectedMsg]),
consumer_loop(FetcherPid)
end.
consume(Records) ->
io:format("RECEIVED:~n~p~n",[Records]).
And fetcher:
-module(kinetic_simple_fetcher).
-export([start/2]).
start(ConsumerPid, GetShardIteratorPayload) ->
{ok, [ShardIterator]} = kinetic:get_shard_iterator(GetShardIteratorPayload),
fetcher_loop(ConsumerPid, ShardIterator).
fetcher_loop(ConsumerPid, {_, <<>>}) -> % no clue how last iterator can look like
ConsumerPid ! {self(), finished};
fetcher_loop(ConsumerPid, ShardIterator) ->
{ok, [NextShardIterator, {<<"Records">>, Records}]} =
kinetic:get_records(shard_iterator(ShardIterator)),
ConsumerPid ! {self(), {records, Records}},
fetcher_loop(ConsumerPid, NextShardIterator).
shard_iterator({_, ShardIterator}) ->
[{<<"ShardIterator">>, <<ShardIterator/binary>>}].
As you can see both processes can do their job concurrently.
Try from your shell:
kinetic_simple_consumer:start(GetShardIteratorPayload).
Now your see that your shell process turns to consumer and you will have your shell back after fetcher will send {ItsPid, finished}.
Next time instead of
kinetic_simple_consumer:start(GetShardIteratorPayload).
run:
spawn(kinetic_simple_consumer, start, [GetShardIteratorPayload]).
You should play with spawning processes - it's erlang main strength.
In Erlang, you can write loop using tail recursive functions. I don't know the kinetic API, so for simplicity, I just assume, that kinetic:next_iterator/1 return {ok, NextIterator} or {error, Reason} when there are no more shards.
loop({error, Reason}) ->
ok;
loop({ok, Iterator}) ->
do_something_with(Iterator),
Result = kinetic:next_iterator(Iterator),
loop(Result).
You are replacing loop with iteration. First clause deals with case, where there are no more shards left (always start recursion with the end condition). Second clause deals with case, where we got some iterator, we do something with it and call next.
The recursive call is last instruction in the function body, which is called tail recursion. Erlang optimizes such calls - they don't use call stack, so they can run infinitely in constant memory (you will not get anything like "Stack level too deep")

How to save state in an Erlang process?

I am learning Erlang and trying to figure out how I can, and should, save state inside a process.
For example, I am trying to write a program that given a list of numbers in a file, tells me whether a number appears in that file. My approach is to uses two processes
cache which reads the content of the file into a set, then waits for numbers to check, and then replies whether they appear in the set.
is_member_loop(Data_file) ->
Numbers = read_numbers(Data_file),
receive
{From, Number} ->
From ! {self(), lists:member(Number, Numbers)},
is_member_loop(Data_file)
end.
client which sends numbers to cache and waits for the true or false response.
check_number(Number) ->
NumbersPid ! {self(), Number},
receive
{NumbersPid, Is_member} ->
Is_member
end.
This approach is obviously naive since the file is read for every request. However, I am quite new at Erlang and it is unclear to me what would be the preferred way of keeping state between different requests.
Should I be using the process dictionary? Is there a different mechanism I am not aware of for that sort of process state?
Update
The most obvious solution, as suggested by user601836, is to pass the set of numbers as a param to is_member_loop instead of the filename. It seems to be a common idiom in Erlang and there is a good example in the fantastic online book Learn you some Erlang.
I think, however, that the question still holds for more complex state that I'd want to preserve in my process.
Simple solution, you can pass to your function is_member_loop(Data_file) the list of numbers rather then the file name.
The best solution when you deal with a state consists in using a gen_server. To learn more you should take a look at records and gen_server behaviour (this may also be useful).
In practice:
1) start with a module (yourmodule.erl) based on gen_server behaviour
2) read your file in the init function of the gen_server and pass it as state field:
init([]) ->
Numbers = read_numbers(Data_file),
{ok, #state{numbers=Numbers}}.
3) write a function which will be used to trigger a call to the gen_server
check_number(Number) ->
gen_server:call(?MODULE, {check_number, Number}).
4) write the code in order to handle messages triggered from your function
handle_call({check_number, Number}, _From, #state{numbers=Numbers} = State) ->
Reply = lists:member(Number, Numbers)},
{reply, Reply, State};
handle_call(_Request, _From, State) ->
Reply = ok,
{reply, Reply, State}.
5) export from yourmodule.erl function check_number
-export([check_number/1]).
Two things to be explained about point 4:
a) we extract values inside the record State using pattern matching
b) As you may see I left the generic handle call, otherwise your gen_server will fail due to wrong pattern matching whenever a message different from {check_number, Number} is received
Note: if you are new to erlang, don't use process dictionary
Not sure how idiomatic this is, since I'm not exactly an Erlang pro yet, but I'd handle this by using ETS. Basically,
read_numbers_to_ets(DataFile) ->
Table = ets:new(numbers, [ordered_set]),
insert_numbers(Table, DataFile),
Table.
insert_numbers(Table, DataFile) ->
case read_next_number(DataFile) of
eof -> ok;
Num -> ets:insert(numbers, {Num})
end.
you could then define your is_member as
is_member(TableId, Number) ->
case ets:match(TableId, {Number}) of
[] -> false; %% no match from ets
[[]] -> true %% ets found the number you're looking for in that table
end.
Instead of taking a Data_file, your is_member_loop would take the id of the table to do a lookup on.

Spawn many processes erlang

I wanna measure the performance to my database by measuring the time taken to do something as the number of processes increase. The intention is to plot a graph of performance vs number of processes after, anyone has an idea how? i am a beginner in elrlang please helo
Assuming your database is mnesia, this should not be hard. one way would be to have a write function and a read function. However, note that there are several Activity access contexts with mnesia. To test write times, you should NOT use the context of transaction because it returns immediately to the calling process, even before a disc write has occured. However, for disc writes, its important that you look at the context called: sync_transaction. Here is an example:
write(Record)->
Fun = fun(R)-> mnesia:write(R) end,
mnesia:activity(sync_transaction,Fun,[Record],mnesia_frag).
The function above will return only when all active replicas of the mnesia table have committed the record onto the data disc file. Hence to test the speed as processes increase, you need to have a record generator,a a process spawner , the write function and finally a timing mechanism. For timing, we have a built in function called: timer:tc/1, timer:tc/2 and timer:tc/3 which returns the exact time it took to execute (completely) a given function. To cut the story short, this is how i would do this:
-module(stress_test).
-compile(export_all).
-define(LIMIT,10000).
-record(book,{
isbn,
title,
price,
version}).
%% ensure this table is {type,bag}
-record(write_time,{
isbn,
num_of_processes,
write_time
}).
%% Assuming table (book) already exists
%% Assuming mnesia running already
start()->
ensure_gproc(),
tv:start(),
spawn_many(?LIMIT).
spawn_many(0)-> ok;
spawn_many(N)->
spawn(?MODULE,process,[]),
spawn_many(N - 1).
process()->
gproc:reg({n, l,guid()},ignored),
timer:apply_interval(timer:seconds(2),?MODULE,write,[]),
receive
<<"stop">> -> exit(normal)
end.
total_processes()->
proplists:get_value(size,ets:info(gproc)) div 3.
ensure_gproc()->
case lists:keymember(gproc,1,application:which_applications()) of
true -> ok;
false -> application:start(gproc)
end.
guid()->
random:seed(now()),
MD5 = erlang:md5(term_to_binary([random:uniform(152629977),{node(), now(), make_ref()}])),
MD5List = lists:nthtail(3, binary_to_list(MD5)),
F = fun(N) -> f("~2.16.0B", [N]) end,
L = [F(N) || N <- MD5List],
lists:flatten(L).
generate_record()->
#book{isbn = guid(),title = guid(),price = guid()}.
write()->
Record = generate_record(),
Fun = fun(R)-> ok = mnesia:write(R),ok end,
%% Here is now the actual write we measure
{Time,ok} = timer:tc(mnesia,activity,[sync_transaction,Fun,[Record],mnesia_frag]),
%% The we save that time, the number of processes
%% at that instant
NoteTime = #write_time{
isbn = Record#book.isbn,
num_of_processes = total_processes(),
write_time = Time
},
mnesia:activity(transaction,Fun,[NoteTime],mnesia_frag).
Now there are dependencies here, especially: gproc download and build it into your erlang lib path from here Download Gproc.To run this, just call: stress_test:start(). The table write_time will help you draw a graph of number of processes against time taken to write. As the number of processes increase from 0 to the upper limit (?LIMIT), we note the time taken to write a given record at the given instant and we also note the number of processes at that time.UPDATE
f(S)-> f(S,[]).
f(S,Args) -> lists:flatten(io_lib:format(S, Args)).
That is the missing function. Apologies.... Remember to study the table write_time, using the application tv, a window is opened in which you can examine the mnesia tables. Use this table to see increasing write times/ or decreasing performance as number of processes increase from time to time. An element i have left out is to note the actual time of the write action using time() which may be important parameter. You may add it in the table definition of the write_time table.
Also look at http://wiki.basho.com/Benchmarking.html
you might look at tsung http://tsung.erlang-projects.org/

Setting a limit to a queue size

How does one set a queue to hold N values. When the N is reached, remove the last item and add a value to the front of the queue.
Should this be done with if statement?
I also want to calculate the values within the queue as a new item is added. e.g. add all of the values in the queue.
I assume from your query that you both want to maximize the length of the queue and get the sum of all the values.
To answer your easiest question first: Erlang queues, however you wish to represent them, are normal Erlang data structures so there are no problems in storing them in a dictionary.
The OTP queue module is actually very simple but the plethora of interfaces easily makes it confusing to use. #Nathon's enqueue function can be made much more efficient by not using the queue data structure directly but by defining your own data structure which includes the queue and its current length, {Length,Queue}. If the sum is important then you could even include it as well.
The queue representations are very simple so it is very easy to write your own specialised form of it.
The simplest way is to keep the queue in a list and take elements from the head and add new elements to the end. So :
new(Max) when is_integer(Max), Max > 0 -> {0,Max,[]}. %Length, Max and Queue list
take({L,M,[H|T]}) -> {H,{L-1,M,T}}.
add(E, {L,M,Q}) when L < M ->
{L+1,M,Q ++ [E]}; %Add element to end of list
add(E, {M,M,[H|T]}) -
{M,M,T ++ [E]}. %Add element to end of list
When the queue becomes full the oldest member, which is at the front of the queue, is dropped. An empty queue generates an error. This is a very simple structure but it is inefficient as the queue is copied every time a new element is added. Reversing the list does not help as then the list is copied every time an element is removed from it. But it is simple, and it does work.
A much more efficient structure is to split the queue into two lists, the front end of the queue and the rear end of the queue. The rear end is reversed and becomes the new front when the front is empty. So:
new(Max) when is_integer(Max), Max > 0 ->
{0,Max,[],[]}. %Length, Max, Rear and Front
take({L,M,R,[H|T]}) -> {H,{L-1,M,R,T}};
take{{L,M,R,[]}) when L > 0 ->
take({L,M,[],lists:reverse(R)}). %Move the rear to the front
add(E, {L,M,R,F}) when L < M ->
{L+1,M,[R|E],F}; %Add element to rear
add(E, {M,M,R,[H|T]}) ->
{M,M,[R|E],T}; %Add element to rear
add(E, {M,M,R,[]}) ->
add(E, {M,M,[],lists:reverse(R)}). %Move the rear to the front
Again when the queue becomes full the oldest member, which is at the front of the queue, is dropped and an empty queue generates an error. This is the data structure used in the queue module.
It would be very easy to add the current sum of the elements to the structure and manage it directly.
Often, when working on simple data structures like this, it is just as easy to roll your own module as it is to use a provided one.
Given the comments, this will do it:
enqueue(Value, Queue) ->
Pushed = queue:in(Value, Queue),
Sum = fun (Q) -> lists:sum(queue:to_list(Q)) end,
case queue:len(Pushed) of
Len when Len > 10 ->
Popped = queue:drop(Pushed),
{Popped, Sum(Popped)};
_ ->
{Pushed, Sum(Pushed)}
end.
If you don't actually want to sum the items, you can use lists:foldl instead, or just write a function to do the operation directly on a queue.

Resources