I have seen many GenServer implementations, I am trying to create one with such specifications, But I am not sure its GenServer's use case.
I have a state such as
%{url: "abc.com/jpeg", name: "Camera1", id: :camera_one, frequency: 10}
I have such 100 states, with different values, my use case contains on 5 steps.
Start Each state as a Gen{?}.
Send an HTTP request to that URL.
Get results.
Send another HTTP request with the data came from the first request.
Put the Process to sleep. if the frequency is 10 then for 10 seconds and so on and after 10 seconds It will start from 1 Step again.
Now when I will start 100 such workers, there are going to be 100 * 2 HTTP requests with frequency. I am not sure about either I am going to use GenServer or GenStage or Flow or even Broadway?
I am also concerned the HTTP requests won't collapse such as one worker, with a state, will send a request and if the frequency is 1 Second before the first request comes back, the other request would have been sent, would GenServer is capable enough to handle those cases? which I think are called back pressure?
I have been asking and looking at this use case of so long, I have been guided towards RabbitMQ as well for my use case.
Any guidance would be so helpful, or any minimal example would be so grateful.
? GenServer/ GenStage / GenStateMachine
Your problem comes down to reducing the number of concurrent network requests at a given time.
A simple approach would be to have a GenServer which keeps track of the count of outgoing requests. Then, for each client (Up to 200 in your case), it can check to see if there's an open request, and then act accordingly. Here's what the server could look like:
defmodule Throttler do
use GenServer
#server
#impl true
def init(max_concurrent: max_concurrent) do
{:ok, %{count: 0, max_concurrent: max_concurrent}}
end
#impl true
def handle_call(:run, _from, %{count: count, max_concurrent: max_concurrent} = state) when count < max_concurrent, do: {:reply, :ok, %{state | count: count + 1}}
#impl true
def handle_call(:run, _from, %{count: count, max_concurrent: max_concurrent} = state) when count >= max_concurrent, do: {:reply, {:error, :too_many_requests}, state}
#impl true
def handle_call(:finished, _from, %{count: count} = state) when count > 0, do: {:reply, :ok, %{state | count: count - 1}}
end
Okay, so now we have a server where we can call handle_call(pid, :run) and it will tell us whether or not we've exceeded the count. Once the task (getting the URL) is complete, we need to call handle_call(pid, :finished) to let the server know we've completed the task.
On the client side, we can wrap that in a convenient helper function. (Note this is still within the Throttler module so __MODULE__ works)
defmodule Throttler do
#client
def start_link(max_concurrent: max_concurrent) when max_concurrent > 0 do
GenServer.start_link(__MODULE__, [max_concurrent: max_concurrent])
end
def execute_async(pid, func) do
GenServer.call(pid, :run)
|> case do
:ok ->
task = Task.async(fn ->
try do
func.()
after
GenServer.call(pid, :finished)
end
end)
{:ok, task}
{:error, reason} -> {:error, reason, func}
end
end
end
Here we pass in a function that we want to asynchronously execute on the client side, and do the work of calling :run and :finished on the server side before executing. If it succeeds, we get a task back, otherwise we get a failure.
Putting it all together, and you get code that looks like this:
{:ok, pid} = Throttler.start_link(max_concurrent: 3)
results = Enum.map(1..5, fn num ->
Throttler.execute(pid, fn ->
IO.puts("Running command #{num}")
:timer.sleep(:5000)
IO.puts("Sleep complete for #{num}")
num * 10
end)
end)
valid_tasks = Enum.filter(results, &(match?({:ok, _func}, &1))) |> Enum.map(&elem(&1, 1))
Now you have a bunch of tasks that either succeeded, or failed and you can act appropriately.
What do you do upon failure? That's the interesting part of backpressure :) The simplest thing would be to have a timeout and retry, under the assumption that you will eventually clear the pressure downstream. Otherwise you can fail out the requests entirely and keep pushing the problem upstream.
Working in lua, I have a table of key/value pairs
local fmtsToRun = {
name = function()
return configSubTable
end
}
Which could be 1 or more entries in length. I need to loop through each entry and run a subprocess (through some libuv C bindings).
With a normal for loop, the subprocess results from libuv return after the loop has finished running, leading to things showing up out of order. The result would be
Loop Start
Loop Entry 1
Job 1 starts
Loop Entry 2
Job 2 starts
Loop Ends
Job1 Returns
Job2 Returns
What I need to have is
Loop Start
Loop Entry 1
Job 1 starts
Job1 Returns
Loop Entry 2
Job 2 starts
Job2 Returns
Loop Ends
I've also tried writing my own version of pairs() to and using something like coroutines to handle the callbacks
for fmt, output in jobIterator(fmtsToRun) do
print('finished running', output)
end
local function jobIterator(tbl)
return coroutine.wrap(function()
local fmtConf, fmtName
fmtName, fmtConf = next(tbl, fmtName)
if nil~=fmtConf then
local conf = fmtConf()
local output = nil
-- wrapper util from Libuv library
local job = Job:new({
cmd = conf.cmd,
args = conf.args,
on_stdout = onStdout, -- process output
on_stderr = onStderr, -- process any error
on_exit = function()
coroutine.yield(fmtName, output)
end
})
job.send(conf.data)
end
end)
end
which leads to this error messages.
attempt to yield across C-call boundary
What would be the "right" way to wait for the job to finish and continue the loop while maintaining the correct order?
A better option was to manually control the flow and recursively call a function to step through the table one entry at a time.
local runner = {}
for _, output in pairs(fmtsToRun) do
table.insert(runner, output)
end
jobIterator(runner)
local function jobIterator(tbl)
local F = {}
function F.step()
if #tbl == 0 then
return
end
local current = table.remove(tbl, 1)
F.run(current)
end
function F.run(conf)
local output = nil
-- wrapper util from Libuv library
local job =
Job:new(
{
cmd = conf.cmd,
args = conf.args,
on_stdout = onStdout, -- process output
on_stderr = onStderr, -- process any error
on_exit = function()
-- do what you need with output
print(output)
F.step()
end
}
)
job.send(conf.data)
end
F.step()
end
So this is my code :
defmodule Parent do
def arun(pid) do
:ets.new(:my_table,[:named_table, :set, :public, read_concurrency: true])
:ets.give_away(:my_table, pid, [])
end
def receiver do
pid = spawn(fn -> arun(self()) end)
receive do
{'ETS-TRANSFER',_,_,_} ->
IO.puts "ets got transferred"
_ ->
IO.puts "I dont know what happened"
end
end
end
But when I am trying to compile this runtime error is coming .
iex(31)> Parent.receiver
17:37:19.183 [error] Process #PID<0.204.0> raised an exception
** (ArgumentError) argument error
(stdlib) :ets.give_away(:my_table, #PID<0.204.0>, [])
parent.ex:4: Parent.arun/1
Also can someone tell me proper way to make a ets table and give its ownership to other process ?
I am trying to this :
Parent process will create a asynchronous task which will create a ets table and then this task/process will return ownership back to the parent process.
The problem is on this line:
pid = spawn(fn -> arun(self()) end)
You're trying to spawn a new process which calls a function with the parent pid as the argument, but since the call to self() is inside the spawn, you get the child pid instead. (And if a process tries to give away an ETS table to itself, it gets an "argument error".)
Try this:
parent = self()
pid = spawn(fn -> arun(parent) end)
I am implementing the Gossip Algorithm in which multiple actors spread a gossip at the same time in parallel. The system stops when each of the Actor has listened to the Gossip for 10 times.
Now, I have a scenario in which I am checking the listen count of the recipient actor before sending the gossip to it. If the listen count is already 10, then gossip will not be sent to the recipient actor. I am doing this using synchronous call to get the listen count.
def get_message(server, msg) do
GenServer.call(server, {:get_message, msg})
end
def handle_call({:get_message, msg}, _from, state) do
listen_count = hd(state)
{:reply, listen_count, state}
end
The program runs well in the starting but after some time the Genserver.call stops with a timeout error like following. After some debugging, I realized that the Genserver.call becomes dormant and couldn't initiate corresponding handle_call method. Is this behavior expected while using synchronous calls? Since all actors are independent, shouldn't the Genserver.call methods be running independently without waiting for each others response.
02:28:05.634 [error] GenServer #PID<0.81.0> terminating
** (stop) exited in: GenServer.call(#PID<0.79.0>, {:get_message, []}, 5000)
** (EXIT) time out
(elixir) lib/gen_server.ex:774: GenServer.call/3
Edit: The following code can reproduce the error when running in iex shell.
defmodule RumourActor do
use GenServer
def start_link(opts) do
{:ok, pid} = GenServer.start_link(__MODULE__,opts)
{pid}
end
def set_message(server, msg, recipient) do
GenServer.cast(server, {:set_message, msg, server, recipient})
end
def get_message(server, msg) do
GenServer.call(server, :get_message)
end
def init(opts) do
state=opts
{:ok,state}
end
def handle_cast({:set_message, msg, server, recipient},state) do
:timer.sleep(5000)
c = RumourActor.get_message(recipient, [])
IO.inspect c
{:noreply,state}
end
def handle_call(:get_message, _from, state) do
count = tl(state)
{:reply, count, state}
end
end
Open iex shell and load above module. Start two processes using:
a = RumourActor.start_link(["", 3])
b = RumourActor.start_link(["", 5])
Produce error by calling a deadlock condition as mentioned by Dogbert in comments. Run following without much time difference.
cb = RumourActor.set_message(elem(a,0), [], elem(b,0))
ca = RumourActor.set_message(elem(b,0), [], elem(a,0))
Wait for 5 seconds. Error will appear.
A gossip protocol is a way of dealing with asynchronous, unknown, unconfigured (random) networks that may be suffering intermittent outages and partitions and where no leader or default structure is present. (Note that this situation is somewhat unusual in the real world and out-of-band control is always imposed on systems in some way.)
With that in mind, let's change this to be an asynchronous system (using cast) so that we are following the spirit of the concept of chatty gossip style communication.
We need digest of messages that counts how many times a given message has been received, a digest of messages that have been received and are already over the magic number (so we don't re-send one if it is way late), and a list of processes enrolled in our system so we know to whom we are broadcasting:
(The following example is in Erlang because I just trip over Elixir syntax ever since I stopped using it...)
-module(rumor).
-record(s,
{peers = [] :: [pid()],
digest = #{} :: #{message_id(), non_neg_integer()},
dead = sets:new() :: sets:set(message_id())}).
-type message_id() :: zuuid:uuid().
Here I am using a UUID, but it could be whatever. An Erlang reference would be fine for a test case, but since gossip isn't useful within an Erlang cluster, and references are unsafe outside the originating system I'm just jumping to the assumption this is for a networked system.
We will need an interface function that allows us to tell a process to inject a new message into the system. We will also need an interface function that sends a message between two processes once it is already in the system. Then we will need an inner function that broadcasts messages to all the known (subscribed) peers. Ah, that means we need a greeting interface so that peer processes can notify each other of their presence.
We will also want a way to have a process tell itself to keep broadcasting over time. How long to set the interval on retransmission is not actually a simple decision -- it has everything to do with network topology, latency, variability, etc (you would actually probably occasionally ping peers and develop some heuristic based on the latency, drop peers that seem unresponsive, and so on -- but we're not going to get into that madness here). Here I'm just going to set it for 1 second because that is an easy to interpret interval for humans observing the system.
Note that everything below is asynchronous.
Interfaces...
insert(Pid, Message) ->
gen_server:cast(Pid, {insert, Message}).
relay(Pid, ID, Message) ->
gen_server:cast(Pid, {relay, ID, Message}).
greet(Pid) ->
gen_server:cast(Pid, {greet, self()}).
make_introduction(Pid, PeerPid) ->
gen_server:cast(Pid, {make_introduction, PeerPid}).
That last function is going to be our way as testers of the system to cause one of the processes to call greet/1 on some target Pid so they start to build a peer network. In the real world something slightly different usually goes on.
Inside our gen_server callback for receiving a cast we will get:
handle_cast({insert, Message}, State) ->
NewState = do_insert(Message, State);
{noreply, NewState};
handle_cast({relay, ID, Message}, State) ->
NewState = do_relay(ID, Message, State),
{noreply, NewState};
handle_cast({greet, Peer}, State) ->
NewState = do_greet(Peer, State),
{noreply, NewState};
handle_cast({make_introduction, Peer}, State) ->
NewState = do_make_introduction(Peer, State),
{noreply, NewState}.
Pretty simple stuff.
Above I mentioned that we would need a way for this thing to tell itself to resend after a delay. To do that we are going to send ourselves a naked message to "redo_relay" after a delay using erlang:send_after/3 so we are going to need a handle_info/2 to deal with it:
handle_info({redo_relay, ID, Message}, State) ->
NewState = do_relay(ID, Message, State),
{noreply, NewState}.
Implementation of the message bits is the fun part, but none of this is terribly tricky. Forgive the do_relay/3 below -- it could be more concise, but I'm writing this in a browser off the top of my head, so...
do_insert(Message, State = #s{peers = Peers, digest = Digest}) ->
MessageID = zuuid:v1(),
NewDigest = maps:put(MessageID, 1, Digest),
ok = broadcast(Message, Peers),
ok = schedule_resend(MessageID, Message),
State#s{digest = NewDigest}.
do_relay(ID,
Message,
State = #s{peers = Peers, digest = Digest, dead = Dead}) ->
case maps:find(ID, Digest) of
{ok, Count} when Count >= 10 ->
NewDigest = maps:remove(ID, Digest),
NewDead = sets:add_element(ID, Dead),
ok = broadcast(Message, Peers),
State#s{digest = NewDigest, dead = NewDead};
{ok, Count} ->
NewDigest = maps:put(ID, Count + 1),
ok = broadcast(ID, Message, Peers),
ok = schedule_resend(ID, Message),
State#s{digest = NewDigest};
error ->
case set:is_element(ID, Dead) of
true ->
State;
false ->
NewDigest = maps:put(ID, 1),
ok = broadcast(Message, Peers),
ok = schedule_resend(ID, Message),
State#s{digest = NewDigest}
end
end.
broadcast(ID, Message, Peers) ->
Forward = fun(P) -> relay(P, ID, Message),
lists:foreach(Forward, Peers).
schedule_resend(ID, Message) ->
_ = erlang:send_after(1000, self(), {redo_relay, ID, Message}),
ok.
And now we need the social bits...
do_greet(Peer, State = #s{peers = Peers}) ->
case lists:member(Peer, Peers) of
false -> State#s{peers = [Peer | Peers]};
true -> State
end.
do_make_introduction(Peer, State = #s{peers = Peers}) ->
ok = greet(Peer),
do_greet(Peer, State).
So what did all of the horribly untypespecced stuff up there do?
It avoided any possibility of a deadlock. The reason deadlocks are so, well, deadly in peer systems is that anytime you have two identical processes (or actors, or whatever) communicating synchronously, you have created a textbook case of a potential deadlock.
Any time A has a synchronous message headed toward B and B has a synchronous message headed toward A at the same time you now have a deadlock. There is no way to create to identical processes that call each other synchronously without creating a potential deadlock. In massively concurrent systems anything that might happen almost certainly will eventually, so you're going to run into this sooner or later.
Gossip is intended to be asynchronous for a reason: it is a sloppy, unreliable, inefficient way to deal with a sloppy, unreliable, inefficient network topology. Trying to make calls instead of casts not only defeats the purpose of gossip-style message relay, it also pushes you into impossible deadlock territory incident to changing the nature of the protocol from asynch to synch.
Genser.call has a default timeout of 5000 milliseconds. So what probably happening is, the message queue of the actor is filled with millions of messages and by the time it reaches to call, the calling actor has timed out.
You can handle timeout using a try...catch:
try do
c = RumourActor.get_message(recipient, [])
catch
:exit, reason ->
# handle timeout
Now, the called actor will finally get to the call message and respond, which will come as an unexpected message to the first process. This you'll need to handle using handle_info. So one way is to ignore the error in catch block and send it rumor from handle_info.
Also, this will significantly degrade the performance if there are many process waiting to be timed-out for 5 seconds before moving ahead. One could deliberately reduce the timeout and handle the reply in handle_info. This will reduce to using cast and handling reply from other process.
Your blocking call need to be broken into two non blocking calls. So if A is making a blocking call to B, instead of waiting for reply, A can ask B to send its state on a given address (A's address) and move on.
Then A will handle that message separately and reply if necessary.
A.fun1():
body of A before blocking call
result = blockingcall()
do things based on result
needs to be divided into:
A.send():
body of A before blocking call
nonblockingcall(A.receive) #A.receive is where B should send results
do other things
A.receive(result):
do things based on result
I am building a GenServer in Elixir, let's say it's a simple Queue like this
defmodule Queue do
use GenServer
def start_link(name) do
GenServer.start_link(__MODULE__, :ok, name: name)
end
def push(server, msg) do
GenServer.call(server, {:subscribe, channel, last_id})
end
def pop(server) do
GenServer.call(server, {:pop, channel, last_id})
end
# handlers here ...
end
For this queue, I wan to provide different storage backends, like
PostgreSQL
Redis
In-memory
File
So and so on. Here's the question, how can I do dependency injection with this GenServer? Ideally I want to create a Queue with different backend like this for database backend
{:ok, db_queue} = Queue.start_link(:DBQueue, db_process_pid)
and for redis may like this
{:ok, redis_queue} = Queue.start_link(:RedisQueue, redis_process_pid)
In this way, I can create the same queue server with different backend. What is the best practice for Elixir to do dependency injection for a GenServer?
If you only want to store a pid, you can use the GenServer state to store it. You can then access it from the handle_* callback functions. For example, here's how the Queue would be like:
defmodule Queue do
use GenServer
def start_link(name, pid) do
GenServer.start_link(__MODULE__, pid, name: name)
end
def init(pid), do: {:ok, pid}
...
end
You can now start it like you want to:
{:ok, db_queue} = Queue.start_link(:DBQueue, db_process_pid)
And then in your handle_* callbacks, access the state (which is just a pid in this case) from the last argument:
def handle_call({:subscribe, channel, last_id}, from, pid) do
# use the pid and the message to construct a reply
reply = ...
{:reply, reply, pid}
end
If you want to store more things, like say a module as well, just change pid everywhere to a map like %{pid: ..., module: ...}. Any Erlang term can be used as the state.