Spawn many processes in Erlang

I want to measure the performance of my database by timing an operation as the number of processes increases. The intention is to plot a graph of performance vs. number of processes afterwards. Does anyone have an idea how? I am a beginner in Erlang, please help.

Assuming your database is mnesia, this should not be hard. One way would be to have a write function and a read function. However, note that mnesia has several activity access contexts. To test write times, you should NOT use the transaction context, because it can return to the calling process before a disc write has occurred. For disc writes, it's important that you look at the context called sync_transaction. Here is an example:
write(Record)->
    Fun = fun(R)-> mnesia:write(R) end,
    mnesia:activity(sync_transaction,Fun,[Record],mnesia_frag).
The function above will return only when all active replicas of the mnesia table have committed the record to the disc data file. Hence, to test the speed as processes increase, you need a record generator, a process spawner, the write function and finally a timing mechanism. For timing, we have the built-in functions timer:tc/1, timer:tc/2 and timer:tc/3, which return the time in microseconds it took to execute a given function to completion, together with its result. To cut the story short, this is how I would do this:
-module(stress_test).
-compile(export_all).
-define(LIMIT,10000).

-record(book,{
    isbn,
    title,
    price,
    version}).

%% ensure this table is {type,bag}
-record(write_time,{
    isbn,
    num_of_processes,
    write_time
    }).

%% Assuming table (book) already exists
%% Assuming mnesia running already
start()->
    ensure_gproc(),
    tv:start(),
    spawn_many(?LIMIT).

spawn_many(0)-> ok;
spawn_many(N)->
    spawn(?MODULE,process,[]),
    spawn_many(N - 1).

process()->
    gproc:reg({n, l,guid()},ignored),
    timer:apply_interval(timer:seconds(2),?MODULE,write,[]),
    receive
        <<"stop">> -> exit(normal)
    end.

total_processes()->
    proplists:get_value(size,ets:info(gproc)) div 3.

ensure_gproc()->
    case lists:keymember(gproc,1,application:which_applications()) of
        true -> ok;
        false -> application:start(gproc)
    end.

guid()->
    random:seed(now()),
    MD5 = erlang:md5(term_to_binary([random:uniform(152629977),{node(), now(), make_ref()}])),
    MD5List = lists:nthtail(3, binary_to_list(MD5)),
    F = fun(N) -> f("~2.16.0B", [N]) end,
    L = [F(N) || N <- MD5List],
    lists:flatten(L).

generate_record()->
    #book{isbn = guid(),title = guid(),price = guid()}.

write()->
    Record = generate_record(),
    Fun = fun(R)-> ok = mnesia:write(R),ok end,
    %% Here is now the actual write we measure
    {Time,ok} = timer:tc(mnesia,activity,[sync_transaction,Fun,[Record],mnesia_frag]),
    %% Then we save that time and the number of processes
    %% at that instant
    NoteTime = #write_time{
        isbn = Record#book.isbn,
        num_of_processes = total_processes(),
        write_time = Time
        },
    mnesia:activity(transaction,Fun,[NoteTime],mnesia_frag).
Now there are dependencies here, especially gproc: download it and build it into your Erlang lib path. To run this, just call stress_test:start(). The table write_time will help you draw a graph of the number of processes against the time taken to write. As the number of processes increases from 0 to the upper limit (?LIMIT), we note the time taken to write a given record at that instant, and we also note the number of processes at that time.
UPDATE:
f(S)-> f(S,[]).
f(S,Args) -> lists:flatten(io_lib:format(S, Args)).
That is the missing function. Apologies. Remember to study the table write_time using the tv application, which opens a window in which you can examine the mnesia tables. Use this table to see write times increasing (that is, performance decreasing) as the number of processes increases over time. One element I have left out is recording the actual time of the write using time(), which may be an important parameter. You may add it to the definition of the write_time table.
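
The timing idea above (a process spawner plus timer:tc) can also be tried in isolation, without mnesia or gproc. The following is only a minimal sketch: the module name tc_sketch and the function measure/2 are made up for illustration, and the work being timed is whatever zero-argument fun you pass in.

```erlang
-module(tc_sketch).
-export([measure/2]).

%% Spawn N processes. Each one times a single call to Work with
%% timer:tc/1 and reports the elapsed microseconds to the caller.
%% Returns the list of N timings (in arrival order, reversed).
measure(N, Work) when is_integer(N), N >= 0, is_function(Work, 0) ->
    Parent = self(),
    Ref = make_ref(),
    [spawn(fun() ->
               {Micros, _Result} = timer:tc(Work),
               Parent ! {Ref, Micros}
           end) || _ <- lists:seq(1, N)],
    collect(Ref, N, []).

%% Wait for exactly N replies tagged with Ref.
collect(_Ref, 0, Acc) -> Acc;
collect(Ref, N, Acc) ->
    receive
        {Ref, Micros} -> collect(Ref, N - 1, [Micros | Acc])
    end.
```

Calling tc_sketch:measure(100, fun() -> stress_test:write() end) for increasing values of N would give one list of timings per process count, which is the raw data for the graph.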

Also look at http://wiki.basho.com/Benchmarking.html

You might also look at Tsung: http://tsung.erlang-projects.org/

Related

How can I split a list of strings by chunks when they have some light markdown attributes, in F#?

I have a tool that is using a Telegram chatbot to interact with its users.
Telegram limits the call rate, so I use a queue system that gets flushed at regular intervals.
So, the current code is very basic:
// flush the message queue
let flushMessageQueue() =
    if not messageQueue.IsEmpty then // messageQueue is a ConcurrentQueue
        // get all the messages
        let messages =
            messageQueue
            |> Seq.unfold(fun q ->
                match q.TryDequeue () with
                | true, m -> Some (m, q)
                | _ -> None)
        // put all the messages in a single string
        let messagesString = String.Join("\n", messages)
        // send the data
        client.SendTextMessageAsync(chatId, messagesString, ParseMode.Markdown)
        |> Async.AwaitTask
        |> Async.RunSynchronously
        |> ignore
This is called at regular intervals, while the write is:
// broadcast message
let broadcastMessage message =
    messageQueue.Enqueue(message)
    printfn "%s" (message.Replace ("```", String.Empty))
But as messages became more complex, two problems came at once:
Part of the output is formatted text with simple markdown:
Some blocks of lines are wrapped between ``` sections
There are some ``` sections as well inside some lines
The text is UTF-8 and uses a bunch of symbols
Some example of text may be:
```
this is a group of lines
with one, or many many lines
```
and sometimes there are things ```like this``` as well
And... I found out that Telegram limits message size to 4kb as well
So, I thought of two things:
I can maintain a state with the open / close ``` and pull from a queue, wrap each line in triple back ticks based on the state and push into another queue that will be used to make the 4kb block.
I can keep taking messages from the re-formatted queue and aggregate them until I reach 4kb, or the end of the queue and loop around.
Is there an elegant way to do this in F#?
I remember seeing a snippet where a collection function was used to aggregate data until a certain size but it looked very inefficient as it was making a collection of line1, line1+line2, line1+line2+line3... and then picking the one with the right size.

Restricting number of function iterations

I'm writing code in Erlang which is supposed to generate random numbers for a random amount of time and add each number to a list. I have a function which can generate random numbers and I more or less have a way of adding them to a list, but my main problem is restricting the number of iterations of the function. I'd like the function to produce several numbers, add them to the list, and then kill that process or something like that.
Here is my code so far:
generator(L1)->
random:seed(now()),
A = random:uniform(100),
L2 = lists:append(L1,A),
generator(L2),
producer(B,L) ->
receive
{last_element} ->
consumer ! {lists:droplast(B)}
end
consumer()->
timer:send_after(random:uniform(1000),producer,{last_element,self()}),
receive
{Answer, Producer_PID} ->
io:format("the last item is:~w~n",[Answer])
end,
consumer().
start() ->
register(consumer,spawn(lis,consumer,[])),
register(producer,spawn(lis,producer,[])),
register(generator,spawn(lis,generator,[random:uniform(10)])).
I know it's a little bit sloppy and incomplete, but that's not the point.
First, you should use rand to generate random numbers instead of random; it is an improved module.
In addition, when using rand:uniform/1 you don't need to set the seed every time you run your program. From the Erlang documentation:
If a process calls uniform/0 or uniform/1 without setting a seed
first, seed/1 is called automatically with the default algorithm and
creates a non-constant seed.
Finally, in order to create a list of random numbers, take a look at How to create a list of 1000 random number in erlang.
Putting all this together, you can just do:
[rand:uniform(100) || _ <- lists:seq(1, 1000)].
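
If you would rather keep an explicit recursive function, as in the question, the number of iterations can be restricted by passing a counter and stopping at zero. This is a minimal sketch; the module name gen_sketch and the function generate/1 are made up for illustration.

```erlang
-module(gen_sketch).
-export([generate/1]).

%% Build a list of Count random integers in the range 1..100.
generate(Count) when is_integer(Count), Count >= 0 ->
    generate(Count, []).

%% The counter reaching 0 is what ends the recursion.
generate(0, Acc) -> lists:reverse(Acc);
generate(N, Acc) -> generate(N - 1, [rand:uniform(100) | Acc]).
```

Prepending to the accumulator and reversing once at the end avoids copying the list on every step, which is what appending to the end of a list on each iteration would do.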
There are some issues in your code:
In timer:send_after(random:uniform(1000),producer,{last_element,self()}), you send {last_element,self()} to the producer process, but the producer only receives {last_element}, so the message never matches.
You can change
producer(B,L) ->
receive
{last_element} ->
consumer ! {lists:droplast(B)}
end.
to
producer(B,L) ->
receive
{last_element, FromPid} ->
FromPid! {lists:droplast(B)}
end.
The same applies to consumer ! {lists:droplast(B)} and {Answer, Producer_PID} ->: the consumer expects a two-element tuple, but the producer sends a one-element tuple.
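
To make the matching message shapes concrete, here is a minimal sketch in which both sides agree on the tuples they exchange. The module name last_sketch and the functions start/1 and last_of/1 are made up for illustration. Note that it uses lists:last/1, because lists:droplast/1 returns the list without its last element rather than the last element itself; this is a deliberate change from the code in the question.

```erlang
-module(last_sketch).
-export([start/1, last_of/1]).

%% Start a producer process that holds a non-empty list.
start(List) when is_list(List), List =/= [] ->
    spawn(fun() -> producer(List) end).

%% The producer answers {last_element, From} with {Last, self()},
%% which is exactly the shape last_of/1 waits for.
producer(List) ->
    receive
        {last_element, From} ->
            From ! {lists:last(List), self()},
            producer(List);
        stop ->
            ok
    end.

%% Ask the producer for its last element; give up after one second.
last_of(Producer) ->
    Producer ! {last_element, self()},
    receive
        {Answer, Producer} -> Answer
    after 1000 ->
        timeout
    end.
```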

How to read all the records of mnesia database in erlang?

I am new to Erlang and I need to do some operations on all the records I get from a mnesia database.
Result = mnesia:dirty_read(mydatabase, {key1, key2}),
case Result of
[] ->
?DEBUG("No such record found", []);
[#mydatabase{key3 = Key3}] ->
%% some operations
end
How can I add a loop to my code that executes some operations for all records?
I am not even sure whether the code above does that or not.
You could use mnesia:foldl/3 for that. It iterates over all records in a table, passing along an "accumulator" value.
It doesn't have an explicit "dirty" counterpart, so if you want to run it as a dirty operation you need to use mnesia:activity/2. (Or you could just use it inside a call to mnesia:transaction.)
In this example, I don't actually do anything with the "accumulator", leaving it as ignored_acc throughout.
mnesia:activity(sync_dirty,
    fun() ->
        mnesia:foldl(
            fun(#mydatabase{}, Acc) ->
                %% do something with the record here
                Acc
            end,
            ignored_acc,
            mydatabase)   %% the table name, as used in the question
    end).
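
As a variation that does use the accumulator, the same mnesia:foldl/3 call can collect every record into a list. This is only a sketch: the module name read_all_sketch and the function all_records/1 are made up for illustration, and it assumes mnesia is already running and the table exists.

```erlang
-module(read_all_sketch).
-export([all_records/1]).

%% Return every record stored in table Tab as a list.
%% The fold runs in a sync_dirty activity, so no transaction
%% locks are taken; the order of the result is unspecified.
all_records(Tab) ->
    mnesia:activity(sync_dirty,
        fun() ->
            mnesia:foldl(fun(Rec, Acc) -> [Rec | Acc] end, [], Tab)
        end).
```

For a large table, doing the work inside the fold fun is likely to be cheaper than building the whole list first.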
I think you can try mnesia:all_keys(Tab):
all_keys(Tab) -> KeyList | transaction abort
This function returns a list of all keys in the table named Tab. The
semantics of this function is context sensitive. See mnesia:activity/4
for more information. In transaction context it acquires a read lock
on the entire table.

Setting a limit to a queue size

How does one set a queue to hold at most N values? When N is reached, remove the last item and add a value to the front of the queue.
Should this be done with an if statement?
I also want to calculate over the values in the queue as a new item is added, e.g. add up all of the values in the queue.
I assume from your query that you want both to cap the length of the queue and to get the sum of all the values.
To answer your easiest question first: Erlang queues, however you wish to represent them, are normal Erlang data structures, so there are no problems in storing them in a dictionary.
The OTP queue module is actually very simple, but the plethora of interfaces easily makes it confusing to use. @Nathon's enqueue function can be made much more efficient by not using the queue data structure directly but by defining your own data structure which includes the queue and its current length, {Length,Queue}. If the sum is important then you could even include it as well.
The queue representations are very simple, so it is very easy to write your own specialised form.
The simplest way is to keep the queue in a list, take elements from the head and add new elements to the end. So:
new(Max) when is_integer(Max), Max > 0 -> {0,Max,[]}. %Length, Max and Queue list
take({L,M,[H|T]}) -> {H,{L-1,M,T}}.
add(E, {L,M,Q}) when L < M ->
    {L+1,M,Q ++ [E]}; %Add element to end of list
add(E, {M,M,[_H|T]}) ->
    {M,M,T ++ [E]}. %Drop the oldest element, add new element to end of list
When the queue becomes full the oldest member, which is at the front of the queue, is dropped. An empty queue generates an error. This is a very simple structure but it is inefficient as the queue is copied every time a new element is added. Reversing the list does not help as then the list is copied every time an element is removed from it. But it is simple, and it does work.
A much more efficient structure is to split the queue into two lists, the front end of the queue and the rear end of the queue. The rear end is reversed and becomes the new front when the front is empty. So:
new(Max) when is_integer(Max), Max > 0 ->
    {0,Max,[],[]}. %Length, Max, Rear and Front
take({L,M,R,[H|T]}) -> {H,{L-1,M,R,T}};
take({L,M,R,[]}) when L > 0 ->
    take({L,M,[],lists:reverse(R)}). %Move the rear to the front
add(E, {L,M,R,F}) when L < M ->
    {L+1,M,[E|R],F}; %Add element to rear
add(E, {M,M,R,[_H|T]}) ->
    {M,M,[E|R],T}; %Drop the oldest element, add new element to rear
add(E, {M,M,R,[]}) ->
    add(E, {M,M,[],lists:reverse(R)}). %Move the rear to the front
Again when the queue becomes full the oldest member, which is at the front of the queue, is dropped and an empty queue generates an error. This is the data structure used in the queue module.
It would be very easy to add the current sum of the elements to the structure and manage it directly.
Often, when working on simple data structures like this, it is just as easy to roll your own module as it is to use a provided one.
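
As a sketch of how the running sum could be folded into the two-list structure, assuming the elements are numbers: the module name bounded_queue and the extra sum/1 accessor are made up for illustration, and the state tuple simply gains a Sum field.

```erlang
-module(bounded_queue).
-export([new/1, add/2, take/1, sum/1]).

%% State: {Length, Max, Sum, Rear, Front}
new(Max) when is_integer(Max), Max > 0 ->
    {0, Max, 0, [], []}.

%% Not full yet: push onto the rear and add to the sum.
add(E, {L, M, S, R, F}) when L < M ->
    {L + 1, M, S + E, [E | R], F};
%% Full: drop the oldest element H from the front, adjusting the sum.
add(E, {M, M, S, R, [H | T]}) ->
    {M, M, S - H + E, [E | R], T};
%% Full but front is empty: move the rear to the front and retry.
add(E, {M, M, S, R, []}) ->
    add(E, {M, M, S, [], lists:reverse(R)}).

%% Remove the oldest element; an empty queue generates an error.
take({L, M, S, R, [H | T]}) ->
    {H, {L - 1, M, S - H, R, T}};
take({L, M, S, R, []}) when L > 0 ->
    take({L, M, S, [], lists:reverse(R)}).

sum({_L, _M, S, _R, _F}) -> S.
```

With this layout the sum is available in constant time, instead of being recomputed from the whole queue on every insertion.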
Given the comments, this will do it:
enqueue(Value, Queue) ->
    Pushed = queue:in(Value, Queue),
    Sum = fun (Q) -> lists:sum(queue:to_list(Q)) end,
    case queue:len(Pushed) of
        Len when Len > 10 ->
            Popped = queue:drop(Pushed),
            {Popped, Sum(Popped)};
        _ ->
            {Pushed, Sum(Pushed)}
    end.
If you don't actually want to sum the items, you can use lists:foldl instead, or just write a function to do the operation directly on a queue.

Erlang: erl shell hangs after building a large data structure

As suggested in answers to a previous question, I tried using Erlang proplists to implement a prefix trie.
The code seems to work decently well... But, for some reason, it doesn't play well with the interactive shell. When I try to run it, the shell hangs:
> Trie = trie:from_dict(). % Creates a trie from a dictionary
% ... the trie is printed ...
% Then nothing happens
I see the new trie printed to the screen (ie, the call to trie:from_dict() has returned), then the shell just hangs. No new > prompt comes up and ^g doesn't do anything (but ^c will eventually kill it off).
With a very small dictionary (the first 50 lines of /usr/share/dict/words), the hang only lasts a second or two (and the trie is built almost instantly)... But it seems to grow exponentially with the size of the dictionary (100 words takes 5 or 10 seconds, I haven't had the patience to try larger wordlists). Also, as the shell is hanging, I notice that the beam.smp process starts eating up a lot of memory (somewhere between 1 and 2 gigs).
So, is there anything obvious that could be causing this shell hang and incredible memory usage?
Some various comments:
I have a hunch that the garbage collector is at fault, but I don't know how to profile or create an experiment to test that.
I've tried profiling with eprof and nothing obvious showed up.
Here is my "add string to trie" function:
add([], Trie) ->
    [ stop | Trie ];
add([Ch|Rest], Trie) ->
    SubTrie = proplists:get_value(Ch, Trie, []),
    NewSubTrie = add(Rest, SubTrie),
    NewTrie = [ { Ch, NewSubTrie } | Trie ],
    % Arbitrarily decide to compress key/value list once it gets
    % more than 60 pairs.
    if length(NewTrie) > 60 ->
            proplists:compact(NewTrie);
        true ->
            NewTrie
    end.
The problem is (amongst others? See my comment) that you are always adding a new {Ch, NewSubTrie} tuple to your proplist Trie, whether or not Ch already existed.
Instead of
NewTrie = [ { Ch, NewSubTrie } | Trie ]
you need something like:
NewTrie = lists:keystore(Ch, 1, Trie, {Ch, NewSubTrie})
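
Put into the add function from the question, the change might look like the following sketch. The module name trie_sketch is made up for illustration, and the compaction step from the question is left out here because lists:keystore/4 replaces an existing entry for Ch instead of adding a duplicate.

```erlang
-module(trie_sketch).
-export([add/2]).

%% Add a string to the proplist-based trie.
add([], Trie) ->
    [stop | Trie];
add([Ch | Rest], Trie) ->
    SubTrie = proplists:get_value(Ch, Trie, []),
    NewSubTrie = add(Rest, SubTrie),
    %% Replace the existing {Ch, _} entry, or append one if absent.
    lists:keystore(Ch, 1, Trie, {Ch, NewSubTrie}).
```

Each level now holds at most one entry per character, so the structure no longer grows with every insertion that shares a prefix with an earlier word.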
You're not really building a trie here. Your end result is effectively a randomly ordered proplist of proplists that requires a full scan at each level when walking the list. Tries typically get an implied ordering from position in the array (or list).
Here's an implementation that uses tuples as the storage mechanism. Calling set only rebuilds the root and direct path tuples.
(Note: you would probably have to make the pair a triple (adding a size) to make delete work with any efficiency.)
I believe Erlang tuples are really just arrays (I thought I read that somewhere), so lookup should be super fast, and modification is probably straightforward. Maybe this would be faster with the array module, but I haven't played with it enough to know.
This version also stores an arbitrary value, so you can do things like:
1> c(trie).
{ok,trie}
2> trie:get("ab",trie:set("aa",bar,trie:new("ab",foo))).
foo
3> trie:get("abc",trie:set("aa",bar,trie:new("ab",foo))).
undefined
4>
Code (entire module). Note 2: assumes lower-case, non-empty string keys.
-module(trie).
-compile(export_all).

-define(NEW,{ %% 26 pairs, to avoid cost of calculating a new level at runtime
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth},{undefined,nodepth},{undefined,nodepth},
    {undefined,nodepth},{undefined,nodepth}
    }
).
-define(POS(Ch), Ch - $a + 1).

new(Key,V) -> set(Key,V,?NEW).

set([H],V,Trie) ->
    Pos = ?POS(H),
    {_,SubTrie} = element(Pos,Trie),
    setelement(Pos,Trie,{V,SubTrie});
set([H|T],V,Trie) ->
    Pos = ?POS(H),
    {SubKey,SubTrie} = element(Pos,Trie),
    case SubTrie of
        nodepth -> setelement(Pos,Trie,{SubKey,set(T,V,?NEW)});
        SubTrie -> setelement(Pos,Trie,{SubKey,set(T,V,SubTrie)})
    end.

get([H],Trie) ->
    {Val,_} = element(?POS(H),Trie),
    Val;
get([H|T],Trie) ->
    case element(?POS(H),Trie) of
        {_,nodepth} -> undefined;
        {_,SubTrie} -> get(T,SubTrie)
    end.

Resources