I have a Phoenix application with complex business logic behind an HTTP endpoint. The logic involves interaction with a database and a few external services, and once request processing has started it must not be interrupted until all operations are done.
But it seems that Cowboy or Ranch kills the request handler process (the Phoenix controller) if the client suddenly closes the connection, which leads to a partially executed business process. To debug this, I put the following code in the controller action:
Process.flag(:trap_exit, true)

receive do
  msg -> Logger.info("Message: #{inspect(msg)}")
after
  10_000 -> Logger.info("Timeout")
end
To simulate the client closing the connection, I set a timeout: curl --request POST 'http://localhost:4003' --max-time 3.
After 3 seconds, I see in the IEx console that the process is about to exit: Message: {:EXIT, #PID<0.4196.0>, :shutdown}.
So I need the controller to complete its job and reply to the client if it is still there, or to do nothing if the connection is lost. Which of these would be the best way to achieve that:
trap exits in the controller action and ignore exit messages;
spawn an unlinked Task in the controller action and wait for its result;
somehow configure Cowboy/Ranch so that it does not kill the handler process, if that is possible (I tried exit_on_close with no luck)?
Handler processes are killed after the request ends; that is their purpose. If you want to process some data in the background, start an additional process. The simplest way to do so is the second method you proposed, with a slight modification: use Task.Supervisor.
So in your application supervisor you start Task.Supervisor with name of your choice:
children = [
  {Task.Supervisor, name: MyApp.TaskSupervisor}
]

Supervisor.start_link(children, strategy: :one_for_one)
And then in your request handler:
parent = self()
ref = make_ref()

Task.Supervisor.start_child(MyApp.TaskSupervisor, fn ->
  send(parent, {ref, do_long_running_stuff()})
end)

receive do
  {^ref, result} -> notify_user(result)
end
That way you do not need to worry about handling the situation where the user is no longer there to receive the message.
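If the handler should also give up waiting at some point, a variant of the same idea uses Task.Supervisor.async_nolink/2 together with Task.yield/2. This is a minimal, self-contained sketch; the MyApp.TaskSupervisor name matches the example above, while the anonymous job standing in for do_long_running_stuff/0 is made up:

```elixir
# Start the named Task.Supervisor (in a real app this lives in
# your application's supervision tree, as shown above).
{:ok, _sup} = Task.Supervisor.start_link(name: MyApp.TaskSupervisor)

# The job is unlinked from the caller, so it keeps running even
# if Cowboy/Ranch kills the request handler process.
task =
  Task.Supervisor.async_nolink(MyApp.TaskSupervisor, fn ->
    Process.sleep(100)  # stand-in for the real business logic
    :done
  end)

# Wait up to 30 seconds for the result.
result =
  case Task.yield(task, 30_000) do
    {:ok, value} -> value         # finished in time; reply to the client
    nil -> :still_running         # timed out; the task keeps running
    {:exit, _reason} -> :crashed  # the task itself died
  end

IO.inspect(result)  # => :done
```

Note that Task.yield/2 leaves the task running on timeout; if you instead want to abandon the work and kill it, follow up with Task.shutdown/1.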
I want to stream data with Yaws to my comet application. I have read around and experimented to understand it, but the example from Yaws seems a little complicated to me (I am new to Erlang); I just cannot get my head around it...
Here is the example from Yaws (I modified it a little bit):
out(A) ->
    %% Create a random number
    {_A1, A2, A3} = now(),
    random:seed(erlang:phash(node(), 1),
                erlang:phash(A2, A3),
                A3),
    Sz = random:uniform(1),
    Pid = spawn(fun() ->
                    %% Read random junk
                    S = "Hello World",
                    P = open_port({spawn, S}, [binary, stream, eof]),
                    rec_loop(A#arg.clisock, P)
                end),
    [{header, {content_length, Sz}},
     {streamcontent_from_pid, "text/html; charset=utf-8", Pid}].

rec_loop(Sock, P) ->
    receive
        {discard, YawsPid} ->
            yaws_api:stream_process_end(Sock, YawsPid);
        {ok, YawsPid} ->
            rec_loop(Sock, YawsPid, P)
    end,
    port_close(P),
    exit(normal).

rec_loop(Sock, YawsPid, P) ->
    receive
        {P, {data, BinData}} ->
            yaws_api:stream_process_deliver(Sock, BinData),
            rec_loop(Sock, YawsPid, P);
        {P, eof} ->
            yaws_api:stream_process_end(Sock, YawsPid)
    end.
What I need is to transform the above script so that it can be combined with the following:
mysql:start_link(p1, "127.0.0.1", "root", "azzkikr", "mydb"),
{data, Results} = mysql:fetch(p1, "SELECT * FROM messages WHERE id > " ++ LASTID),
{mysql_result, FieldNames, FieldValues, NoneA, NoneB} = Results,
parse_data(FieldValues, [], [], [], [], [])
where parse_data(FieldValues, [], [], [], [], []) returns a JSON string for the entry.
Combined, the script should constantly check for new entries in the database and, if there are any, fetch them, as comet should.
Thank you, May you all go to paradise!
As this answer explains, sometimes you need to have a process running that's independent of any incoming HTTP requests. For your case, you can use a form of publish/subscribe:
Publisher: when your Erlang node starts up, start some sort of database client process, or a pool of such processes, executing your query and running independently of Yaws.
Subscriber: when Yaws receives an HTTP request and dispatches it to your code, your code subscribes to the publisher. When the publisher sends data to the subscriber, the subscriber streams them back to the HTTP client.
Detailing a full solution here is impractical, but the general steps are:
When your database client processes start, they register themselves into a pg2 group or something similar. Use something like poolboy instead of rolling your own process pools, as they're notoriously tricky to get right. Each database client can be an instance of a gen_server running a query, receiving database results, and also handling subscription request calls.
When your Yaws code receives a request, it looks up a database client publisher process and subscribes to it. Subscriptions require calling a function in the database client module, which in turn uses gen_server:call/2,3 to communicate with the actual gen_server publisher process. The subscriber uses Yaws streaming capabilities (or SSE or WebSocket) to complete the connection with the HTTP client and sends it any required response headers.
The publisher stores the process ID of the subscriber, and also establishes a monitor on the subscriber so it can clean up the subscription should the subscriber die or exit unexpectedly.
The publisher uses the monitor's reference as a unique ID in the messages it sends to that subscriber, so the subscription function returns that reference to the subscriber. The subscriber uses the reference to match incoming messages from the publisher.
When the publisher gets new query results from the database, it sends the data to each of its subscribers. This can be done with normal Erlang messages.
The subscriber uses Yaws streaming functions (or SSE or WebSocket features) to send query results to the HTTP client.
When an HTTP client disconnects, the subscriber calls another publisher function to unsubscribe.
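The publisher half of those steps can be sketched as a small gen_server. This is only an illustration under the assumptions above; the module name, message shapes, and API are invented, and the Yaws/poolboy integration is left out:

```erlang
-module(results_publisher).
-behaviour(gen_server).
-export([start_link/0, subscribe/1, unsubscribe/2, publish/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Subscriber calls this; the returned monitor reference doubles
%% as the subscription ID used to tag outgoing messages.
subscribe(Publisher) ->
    gen_server:call(Publisher, subscribe).

unsubscribe(Publisher, Ref) ->
    gen_server:call(Publisher, {unsubscribe, Ref}).

%% Called when new query results arrive from the database.
publish(Publisher, Data) ->
    gen_server:cast(Publisher, {publish, Data}).

init([]) ->
    {ok, #{}}.                      % state: Ref => subscriber Pid

handle_call(subscribe, {Pid, _Tag}, Subs) ->
    Ref = erlang:monitor(process, Pid),
    {reply, {ok, Ref}, Subs#{Ref => Pid}};
handle_call({unsubscribe, Ref}, _From, Subs) ->
    erlang:demonitor(Ref, [flush]),
    {reply, ok, maps:remove(Ref, Subs)}.

handle_cast({publish, Data}, Subs) ->
    %% Fan the data out, tagged with each subscriber's own Ref.
    maps:foreach(fun(Ref, Pid) -> Pid ! {results, Ref, Data} end, Subs),
    {noreply, Subs}.

%% Clean up when a subscriber dies without unsubscribing.
handle_info({'DOWN', Ref, process, _Pid, _Reason}, Subs) ->
    {noreply, maps:remove(Ref, Subs)}.
```

The subscriber (the Yaws request code) would then hold on to the returned reference and stream every `{results, Ref, Data}` message back to the HTTP client.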
To continue my journey into Erlang lands, I'm developing a simple IM system using OTP.
There are two OTP applications: a server (one instance) and a client (multiple instances). A set-up is shown below:
╭── node1#host ──╮
│ Server │
│ └gen_server │
╰────────────────╯
╭── node2#host ──╮
│ Client │
│ └gen_server │
╰────────────────╯
╭── node3#host ──╮
│ Client │
│ └gen_server │
╰────────────────╯
...
Client functions
Using the Erlang shell, we can issue the following commands to the client application:
Connect to the server and receive a random name from it (I'm fond of names like turbo-octopus, miniature-octocat etc. :)
Get a list of other connected clients.
Send a message to the client with specified name.
Send a message to all clients (broadcast).
The client should also be able to print messages to stdout upon receiving them.
Implementation details
All messages go through server.
Both server and client applications contain gen_servers (chat_server.erl and chat_client.erl respectively) responsible for handling messages. Server's chat_server process registered as global and visible on all nodes:
%% chat_server.erl
start_link() ->
    gen_server:start_link({global, ?SERVER}, ?MODULE, [], []).
When a client connects, it sends the pid of its gen_server process. This way we can store client pids in the server's state to distinguish clients and to send/broadcast messages.
%% chat_client.erl
connect() ->
    Res = gen_server:call({global, ?REMOTE_SERVER}, {connect, client_pid() ...}),
    ...

%% pid of the client's gen_server
client_pid() -> whereis(?CLIENT_SERVER).
Server connect handle:
%% chat_server.erl
handle_call({connect, Pid}, _From, State) ->
    %% doing stuff like generating unique name,
    %% adding client to list, etc.
    {reply, {connected, Name}, UpdatedState}.
Messaging (pun intended)
Well, it's pretty straightforward. The server handles a cast from a client, looks up the recipient's pid by the given name, and casts the message to it (or broadcasts it to everyone). And that is it.
A Question will be asked
While developing this system, I wondered whether the chosen approach is appropriate. I mean:
Passing around the client's gen_server pid seems more or less acceptable, at least because it allows uniquely identifying clients and using all the gen_server firepower on both ends. Is this the way you do it?
I have read here and there that an explicit interface (calling exported functions) is preferable to direct messaging (the thing I do in my client with gen_server:call). Is there any way to fix this (rpc, for example), or is it okay?
Given the same set-up (a node with a server application and N nodes with clients), would you use the same approach with gen_servers, or is there a better approach I'm unaware of?
Personally I think your architecture is slightly off.
If you want your client to accept incoming messages (e.g. when another client sends a message to you or does a broadcast), then there currently doesn't seem to be a process to which the server can send those messages. gen_server is typically not the vehicle for that; it's mainly for server processes.
I think the idea should be to start a new process for every client. That process becomes the specific client's main loop. If you (the user) want to do something, you send a message to that specific process; this can be hidden behind function calls. The client's main loop then interacts with the server.
Because the client's main loop is a separate process, it is always ready to receive messages, so the server can send messages to your client whenever someone is sending to you.
BTW: I hope that your defines ?SERVER and ?REMOTE_SERVER are identical because, if I understand correctly, they both refer to the globally registered chat server. Better stick with one unique name.
Another issue is that you do not typically expose the gen_server:call() invocations. The clients only call functions in the chat_server module, without knowing what the name of the server is or wherever it lives (that's the beauty of Erlang!).
In chat_server.erl you put code like the following; this is basically the client API. You will notice that chat_client.erl then contains only calls to functions in the chat_server module. Very clean and transparent!
%% let a new client connect, all we need is its Pid
new_client(Pid) ->
    gen_server:call({global, ?SERVER}, {connect, Pid}).

send_msg(From, To) ->
    gen_server:call({global, ?SERVER}, {sendmsg, From, To}).

logout_client(Pid) ->
    gen_server:call({global, ?SERVER}, {exit_client, Pid}).
The client code below (deliberately) does not register the client's Pid under a name, because you cannot register more than one Pid under the same name; that would only work if you restrict your system to exactly one client per node. It can trivially be made to register the Pid, if that is what you desire or need.
Typically the client's code looks like:
%% start a new client: we spawn a new process for this
%% particular client and return its Pid, to be used
%% when you want your client to do something
connect() ->
    spawn(?MODULE, start_client, []).
%% client startup code
start_client() ->
    %% Initialize client state, if you wish
    State = 42,
    %% Now connect to chat server
    chat_server:new_client(self()),
    %% And fall into our own main loop
    client_loop(State).
%% This is the client's main loop
client_loop(State) ->
    %% Wait for stuff to happen ...
    receive
        %% chat server sends a message to us
        {message, Msg, From} ->
            io:format("~p says ~p~n", [From, Msg]),
            client_loop(State);
        %% message sending is delegated to the server - see your own protocol
        {send, Msg, To} ->
            chat_server:send_msg(Msg, To),
            client_loop(State);
        %% terminate?
        done ->
            %% de-register with server
            chat_server:logout_client(self())
    end.
Now all that's needed is some utility functions to interact with your client process, like the ones below. Note that if you go to "each Erlang node is a single client" by registering the client's Pid locally, you can get rid of passing the Pid explicitly, but the mechanics remain the same.
send_message(Pid, Msg, To) ->
    Pid ! {send, Msg, To}.

logout(Pid) ->
    Pid ! done.

%% If you force your client's Pid to be registered under e.g. 'registered_name',
%% it would look like:
send_message(Msg, To) ->
    registered_name ! {send, Msg, To}.
I agree with haavee that your architecture is not what I'd expect, but that's because I'm imagining something more low-level-TCP.
Regarding your questions:
Passing around the client's gen_server pid seems more or less acceptable, at least because it allows uniquely identifying clients and using all the gen_server firepower on both ends. Is this the way you do it?
Sure, I see nothing wrong with that part of the code. Your server keeps a mapping between PID and client name and it's like calling register/2 but only the server gets the mapping and you control how it works.
I have read here and there that an explicit interface (calling exported functions) is preferable to direct messaging (the thing I do in my client with gen_server:call). Is there any way to fix this, or is it okay?
If you compile your client and server applications together (one code-base, two entry-points), then you can do this. Instead of doing this on the client side:
connect() ->
    Res = gen_server:call({global, ?REMOTE_SERVER}, {connect, client_pid() ...}),
you'd have
-module(client).

connect() ->
    server:client_connect(client_pid()).
and
-module(server).

client_connect(ClientPID) ->
    Res = gen_server:call({global, ?REMOTE_SERVER}, {connect, ClientPID ...}).
But if you want to use net_kernel to connect nodes and you want to compile the source code independently, then your way is how you do it.
Given the same set-up (a node with a server application and N nodes with clients), would you use the same approach with gen_servers, or is there a better approach I'm unaware of?
What you're doing with net_kernel is building a distributed system. If you expect a few clients, that's fine. If you expect a ton of clients, remember that distributed Erlang defaults to a fully connected mesh, so all your clients are actually connected to each other as well as to the server.
When I look at your description, I imagine a chat-server, and for this I would use gen_tcp for networking instead of net_kernel.
Advantages of net_kernel:
It's very high-level. You don't need to think much about connection drops, and messages are very pure-Erlang.
It's easier to debug. You can use the rpc module from a shell to run anything on any connected node, which is cool.
Advantages of gen_tcp:
Server and clients are less connected. You could swap out a client or the server for a different version with the same network API (including swapping for something non-Erlang) and nobody else would know or care.
Clients aren't interconnected (you can also do this with hidden nodes)
You can use popular port numbers to get past dumb firewalls
I'd put your "client" and "server" modules both on the server machine. You listen for TCP connections and spawn off a "client" for each connection. The "client" module's job is to translate between the remote client talking over the network and the "server" module, talking over Erlang messages.
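The layout in the last paragraph can be sketched with gen_tcp roughly as follows; this is an illustration only (the per-connection "client" process here just echoes lines back instead of translating to a real server module):

```erlang
-module(tcp_acceptor_sketch).
-export([start/1]).

start(Port) ->
    %% Listen once, then hand accepted sockets to per-connection processes.
    {ok, Listen} = gen_tcp:listen(Port, [binary, {packet, line},
                                         {active, false}, {reuseaddr, true}]),
    accept_loop(Listen).

accept_loop(Listen) ->
    {ok, Sock} = gen_tcp:accept(Listen),
    %% One process per connection, playing the "client" role
    %% between the TCP peer and the server module.
    Pid = spawn(fun() -> client_init(Sock) end),
    ok = gen_tcp:controlling_process(Sock, Pid),
    Pid ! go,
    accept_loop(Listen).

client_init(Sock) ->
    receive go -> ok end,                    % wait until we own the socket
    ok = inet:setopts(Sock, [{active, true}]),
    client_loop(Sock).

client_loop(Sock) ->
    receive
        {tcp, Sock, Data} ->
            %% Here you would translate Data into calls to the server
            %% module; we just echo it back as a placeholder.
            gen_tcp:send(Sock, Data),
            client_loop(Sock);
        {tcp_closed, Sock} ->
            ok
    end.
```

The `go` handshake avoids a race where data arrives before the acceptor has transferred socket ownership to the new process.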
I have seen many examples of chat room systems over websocket implemented with erlang and cowboy.
Most of the examples I have seen use gproc. In practice each websocket handler registers itself with gproc and then broadcasts/receives messages from it.
Since a user could close the webpage by accident, I am thinking about attaching to the websocket handler a gen_fsm which actually broadcasts/receives all the messages from gproc. That way the gen_fsm could switch from a "connected" state to a "disconnected" state whenever the user exits, and still buffer all the messages. After a while, if the user is not back online, the gen_fsm would terminate.
Is this a good solution? How can I make the new websocket handler to recover the gen_fsm process? Should I register the gen_fsm using the user name or is there any better solution?
What I do is the following:
When a user connects to the site, I spawn a gen_server representing the user. The gen_server then registers itself in gproc as {n,l,{user,UserName}}. (It can also register properties like {p,l,{chat,ChannelID}} to listen to chat channels; see gproc pub/sub.)
Now the user's websocket connection starts the cowboy handler (I use Bullet). The handler asks gproc for the pid() of the user's gen_server and registers itself as a receiver of messages. So now, when the user gen_server receives messages, it forwards them to the websocket handler.
When the websocket connection ends, the handler unregisters from the user gen_server, so the user gen_server will keep messages until the next connection, or until the next timeout. At the timeout, you can simply terminate the server (messages will be lost, but that is okay).
See the following (not tested):
-module(user_chat).

-record(state, {mailbox, receiver=undefined}).

-export([start_link/1, set_receiver/1, unset_receiver/1]).

%% API
start_link(UserID) ->
    gen_server:start_link(?MODULE, [UserID], []).

set_receiver(UserID) ->
    set_receiver(UserID, self()).

unset_receiver(UserID) ->
    %% Just set the receiver to undefined
    set_receiver(UserID, undefined).
set_receiver(UserID, ReceiverPid) ->
    %% Look up under the same key the server registered: {user, UserID}
    UserPid = gproc:where({n, l, {user, UserID}}),
    gen_server:call(UserPid, {set_receiver, ReceiverPid}).
%% Gen server internals
init([UserID]) ->
    gproc:reg({n, l, {user, UserID}}),
    {ok, #state{mailbox=[]}}.
handle_call({set_receiver, ReceiverPid}, _From, #state{mailbox=MB}=State) ->
    %% Try to flush the mailbox to the *new* receiver, not the old one
    NewState = State#state{receiver=ReceiverPid},
    NewMB = check_send(MB, NewState),
    {reply, ok, NewState#state{mailbox=NewMB}}.
handle_info({chat_msg, Message}, #state{mailbox=MB}=State) ->
    NewMB = check_send([Message|MB], State),
    {noreply, State#state{mailbox=NewMB}}.
%% Mailbox empty
check_send([], _) -> [];
%% Receiver undefined, keep messages
check_send(Mailbox, #state{receiver=undefined}) -> Mailbox;
%% Receiver is a pid
check_send(Mailbox, #state{receiver=Receiver}) when is_pid(Receiver) ->
    %% Send all messages
    Receiver ! {chat_messages, Mailbox},
    %% Then return empty mailbox
    [].
With the solution you propose you may have many processes lying around, and you will have to write a "process cleaner" for all users that never come back. It also will not survive a shutdown of the chat server VM: all messages stored in living FSMs will vanish if the node goes down.
I think a better way would be to store all messages in a database such as mnesia, with sender, receiver, expiration date, and so on; check for stored messages at connection time, and have a message cleaner process destroy expired messages from time to time.
I need to implement a synchronous call from a process which receives many incoming messages from other processes. The problem is distinguishing when the message replying to my call has arrived. Do I need to spawn an additional process that extracts messages from the queue into a buffer until the reply message is encountered, sends that reply to the main process, and only afterwards hands over everything else it accepted?
The trick is to use a reference as a token for replication:
replicate() ->
    {ok, Token} = db:ask_replicate(...),
    receive
        {replication_completed, Token} ->
            ok
    end
where Token is created with a call to make_ref(). Since no other message will match Token, you are safe. Other messages will be placed in the mailbox for later scrutiny.
However, the above solution does not take process crashes into account. You need a monitor on the DB server as well. The simplest way to get the pattern right is to let the mediator be a gen_server. Alternatively, you can read a chapter in LearnYouSomeErlang: http://learnyousomeerlang.com/what-is-otp#the-basic-server look at the synchronous call in the kitty_server.
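Combining the unique reference with a monitor gives the crash-aware version of the pattern. A sketch (module name and message shapes are invented) of essentially what gen_server:call/2,3 does internally:

```erlang
-module(sync_call_sketch).
-export([sync_call/2]).

%% Synchronously call Server, using the monitor reference both to
%% detect a crash and as the unique token matching the reply.
sync_call(Server, Request) ->
    Ref = erlang:monitor(process, Server),
    Server ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Result} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Result};
        {'DOWN', Ref, process, Server, Reason} ->
            %% The server died before replying.
            {error, Reason}
    after 5000 ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

Unrelated messages stay in the mailbox untouched, and a dead server turns into an immediate `{error, Reason}` instead of a hang.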
In Nitrogen, the Erlang web framework, I have the following problem. I have a process that takes care of sending and receiving messages to another process that acts as a hub. This process acts as the comet process to receive the messages and update the page.
The problem is that when the user presses a button, I get a call to event. How do I get hold of that comet process's Pid inside the event?
The code that initiates the communication and sets up the receiving part looks like this. First, I have an event which starts the client process by calling wf:comet:
event(start_chat) ->
    Client = wf:comet(fun() -> chat_client() end);
The code for the client process is the following, which gets and joins a room at the beginning and then goes into a loop sending and receiving messages to/from the room:
chat_client() ->
    Room = room_provider:get_room(),
    room:join(Room),
    chat_client(Room).

chat_client(Room) ->
    receive
        {send_message, Message} ->
            room:send_message(Room, Message);
        {message, From, Message} ->
            wf:insert_bottom(messages, [#p{}, #span{ text=Message }]),
            wf:comet_flush()
    end,
    chat_client(Room).
Now, here's the problem. I have another event, send_message:
event(send_message) ->
    Message = wf:q(message),
    ClientPid ! {send_message, Message}.
except that ClientPid is not defined there, and I can't see how to get ahold of it. Any ideas?
The related thread on the Nitrogen mailing list: http://groups.google.com/group/nitrogenweb/browse_thread/thread/c6d9927467e2a51a
Nitrogen provides a key-value storage per page instance called state. From the documentation:
Retrieve a page state value stored under the specified key. Page State is different from Session State in that Page State is scoped to a series of requests by one user to one Nitrogen Page:
wf:state(Key) -> Value
Store a page state variable for the current user. Page State is different from Session State in that Page State is scoped to a series of requests by one user to one Nitrogen Page:
wf:state(Key, Value) -> ok
Clear a user's page state:
wf:clear_state() -> ok
Have an ets table which maps session ids to client Pids. Or, if Nitrogen provides any sort of session management, store the Pid as session data.
Everything that needs to be remembered needs a process. It looks like your room provider isn't one.
room:join(Room) needs to be room:join(Room, self()): the room needs to know what your comet process's pid is.
To send a message to a client, you first send the message to the room, and the room then sends it to all clients in the room. But for that to work, every client joining the room must submit its comet pid, and the room must keep a list of all pids in the room.
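A minimal room process along those lines might look like this; a sketch with invented message shapes, just to show the join/broadcast mechanics:

```erlang
-module(room_sketch).
-export([start/0, join/2, send_message/2]).

start() ->
    spawn(fun() -> loop([]) end).

join(Room, Pid) ->
    Room ! {join, Pid},
    ok.

send_message(Room, Msg) ->
    Room ! {send_message, Msg},
    ok.

loop(Members) ->
    receive
        {join, Pid} ->
            %% Remember every comet process that joined.
            loop([Pid | Members]);
        {send_message, Msg} ->
            %% Broadcast to everyone in the room.
            [P ! {message, Msg} || P <- Members],
            loop(Members)
    end.
```

A real version would also monitor members and drop dead pids from the list.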