Threaded Erlang C-Node (cnode) interoperability howto?

I am at a point in my Erlang development where I need to create a C-Node (see link for C-Node docs). The basic implementation is simple enough; however, there is a huge hole in the documentation.
The code implements a single-threaded client and server. Ignoring the client for the moment... the C code that implements the server is single-threaded and can only connect to one Erlang client at a time.
1. Launch EPMD ('epmd -daemon').
2. Launch the server application ('cserver 1234').
3. Launch the Erlang client ('erl -sname e1 -setcookie secretcookie') in a different window from step 2.
4. Execute a server command ('complex3:foo(3).') from the Erlang shell in step 3.
Now that the server is running and an Erlang shell has connected to it, try it again from another window:
1. Open a new window.
2. Launch another Erlang client ('erl -sname e2 -setcookie secretcookie').
3. Execute a new server command ('complex3:foo(3).').
Notice that the system seems hung, when it should have executed the command. The reason it hangs is that the other Erlang node is still connected and there are no other threads listening for connections.
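For reference, the Erlang side from the tutorial looks roughly like the sketch below (the cnode's name here is illustrative). Note that the receive has no after clause, so the second shell blocks forever waiting for a reply that the busy, single-threaded cserver will never send:

    -module(complex3).
    -export([foo/1]).

    foo(X) ->
        call_cnode({foo, X}).

    %% Send the request to the cnode and wait -- indefinitely --
    %% for its reply.
    call_cnode(Msg) ->
        {any, 'cserver@localhost'} ! {call, self(), Msg},
        receive
            {cnode, Result} ->
                Result
        end.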
NOTE: there also seems to be a bug in the connection handling. I added a timeout to the receive block and caught some of the errant behavior, but not all of it. I was also able to make cserver crash, without warnings or errors, by forcing the first Erlang node to terminate after performing the steps above.
So the question... What is the best way to implement a threaded C-Node? What is a reasonable number of connections?

The cnode implementation example in the cnode tutorial is not meant to handle more than one connected node, so the first symptom you're experiencing is normal.
The erl_accept call is what accepts incoming connections.
    if ((fd = erl_accept(listen, &conn)) == ERL_ERROR)
        erl_err_quit("erl_accept");
    fprintf(stderr, "Connected to %s\n\r", conn.nodename);

    while (loop) {
        got = erl_receive_msg(fd, buf, BUFSIZE, &emsg);
Note that, written this way, the cnode accepts only one connection and then passes that descriptor to the read/write loop. That's why the cnode ends with an error when the Erlang node closes: erl_receive_msg fails because fd points to a closed socket.
If you want to accept more than one inbound connection, you'll have to loop over accepting connections and implement a way to handle more than one file descriptor. You don't need a multithreaded programme to do so; it would probably be easier (and maybe more efficient) to use the poll or select syscall, if your OS supports them.
As for the optimum number of connections, I don't think there is a rule for that; you'd need to benchmark your application if you want the cnode to support high concurrency. But in that case it would probably be better to re-engineer the system so that Erlang copes with the concurrency, relieving the cnode of it.
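One way to do that re-engineering, as a sketch (the module name, the request shape, and the 5-second timeout are all invented here): let a single gen_server own the one connection to the cnode and serialize access to it. Every Erlang process, on any node in the cluster, calls this server, and the single-threaded C side only ever sees one peer:

    -module(cnode_proxy).
    -behaviour(gen_server).
    -export([start_link/1, foo/1]).
    -export([init/1, handle_call/3, handle_cast/2]).

    start_link(CNode) ->
        gen_server:start_link({local, ?MODULE}, ?MODULE, CNode, []).

    %% All callers funnel through this server, so the cnode only
    %% ever talks to one Erlang process.
    foo(X) ->
        gen_server:call(?MODULE, {foo, X}).

    init(CNode) ->
        {ok, CNode}.

    handle_call(Msg, _From, CNode) ->
        {any, CNode} ! {call, self(), Msg},
        receive
            {cnode, Result} -> {reply, Result, CNode}
        after 5000 ->
            {reply, {error, timeout}, CNode}
        end.

    handle_cast(_Msg, CNode) ->
        {noreply, CNode}.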


Can I call GenServer client functions from a remote node?

I have a GenServer on a remote node with both implementation and client functions in the module. Can I use the GenServer client functions remotely somehow?
Using GenServer.call({RemoteProcessName, :"app@remoteNode"}, :get) works as I expect it to, but is cumbersome.
If I want to clean this up, am I right in thinking that I'd have to write the client functions on the calling (client) node?
You can use the :rpc.call/{4,5} functions.
:rpc.call(:"app@remoteNode", MyModule, :some_func, [arg1, arg2])
For a large number of calls, it's better to use gen_server:call/2,3.
If you want to use rpc:call/4,5, you should know that there is just one process, named rex, on each node for handling all rpc requests. So while it is running one Mod:Func(Arg1, Arg2, ArgN) call, it cannot respond to any other request!
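To make the contrast concrete, here is a hedged sketch (module, function, and server names are all illustrative) of the two call styles:

    -module(remote_calls).
    -export([via_rpc/0, via_gen_server/0]).

    %% Funnels through the single 'rex' server process on the target
    %% node, so concurrent rpc calls queue up behind one another there.
    via_rpc() ->
        rpc:call('app@remoteNode', my_module, some_func, [1, 2]).

    %% Talks directly to the registered gen_server process on the other
    %% node, bypassing rex entirely.
    via_gen_server() ->
        gen_server:call({my_server, 'app@remoteNode'}, get).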
TL;DR
Yes
Discussion
There are PIDs, messages, monitors and links. Nothing more, nothing less. That is your universe. (Unless you get into some rather esoteric aspects of the runtime implementation -- but at the abstraction level represented by EVM languages the previously stated elements (should) constitute your universe.)
Within an Erlang environment (whether local or distributed in a mesh) any PID can send a message addressed to any other PID (no middle-man required), as well as establish monitors and so on.
gen_server:cast sends a gen_server-packaged message (so it will arrive in the form handle_cast/2 is called with). gen_server:call/2 establishes a monitor and a timeout while waiting for a labeled reply. Simply doing PID ! SomeMessage does essentially the same thing as gen_server:cast (sends a message) without any of the gen_server machinery behind it (messier to abstract as an interface).
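As a rough illustration (the message shape is made up), both sends below deliver the same payload to the same process; the first arrives at the server's handle_cast/2, the bare send at its handle_info/2:

    %% Wrapped by the gen_server machinery, dispatched to handle_cast/2:
    gen_server:cast(Pid, {set, key, value}),

    %% A bare message, dispatched to handle_info/2:
    Pid ! {set, key, value}.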
That's all there is to it.
With this in mind, of course you can use gen_server:call/2 across nodes, as long as they are connected into a cluster/mesh via disterl. Two disconnected nodes would have to communicate a different way (network sockets) and wouldn't have any knowledge of each other's internal mapping of PIDs, but as long as disterl is being used they all translate PIDs amongst themselves quite readily. Named processes are where things get a little tricky, but that is the purpose of the global module and utilities such as gproc (though dependence on such facilities beyond a certain point is usually a sign of an architectural problem).
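For the named-process case, a minimal sketch with the stock global module (the service name is illustrative):

    %% On the node hosting the server: register its pid cluster-wide.
    yes = global:register_name(my_service, ServerPid),

    %% From any connected node: call it by name, no pid juggling needed.
    Reply = gen_server:call({global, my_service}, some_request).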
Of course, just because PIDs on one node can communicate with PIDs on another doesn't always mean they should. The physical topology of the network (bandwidth, latency, jitter) comes into play when you start sending high-frequency or large messages (lots of gen_server:calls), and you always have to think about partition tolerance -- but for off-loading heavy work (rare) or physically partitioning sub-systems within a very large system (more common), directly sending messages is a very simple way to take a program coded for a single node and distribute it across a cluster.
(With all that in mind, it is somewhat rare to see the rpc module used.)

Distributed Erlang security: how to?

I want to have 2 independent Erlang nodes that can communicate with each other:
so node a@myhost will be able to send messages to b@myhost.
Are there any ways to restrict node a@myhost, so that only functions from a secure_module can be called on b@myhost?
It should be something like:
a@myhost> rpc:call(b@myhost, secure_module, do, [A,B,C]) returns {ok, Result}
and all other calls
a@myhost> rpc:call(b@myhost, Module, Func, Args) return {error, Reason}
One of the options would be to use the ZeroMQ library to establish communication between the nodes, but would it be better if it could be done using standard Erlang functions/modules?
In this case distributed Erlang is not what you want. Connecting node A to node B makes a single cluster -- one huge, trusted computing environment. You don't want to trust part of this, so you don't want a single cluster.
Instead write a specific network service. Use the network itself as your abstraction layer. The most straightforward way to do this is to establish a stream connection (just boring old gen_tcp, or gen_sctp or use ssl, or whatever) from A to B.
The socket handling process on A receives messages from whatever parts of node A need to call B -- you write this exactly as you would if they were directly connected. Use a normal Erlang messaging style: Message = {name_of_request, Data} or similar. The connecting process on A simply does gen_tcp:send(Socket, term_to_binary(Message)).
The socket handling process on B shuttles received network messages between the socket and your servicing processes by simply receiving {tcp, Socket, Bin} -> Servicer ! binary_to_term(Bin).
Results of computation go back the other direction through the exact same process using the term_to_binary/binary_to_term translation again.
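A minimal sketch of such a socket process on B (names are illustrative; it assumes the socket was opened in binary mode with {packet, 4} and {active, once}, so each network message arrives whole):

    loop(Socket, Servicer) ->
        receive
            {tcp, Socket, Bin} ->
                %% 'safe' refuses to build new atoms or funs from
                %% untrusted input -- important on an exposed socket.
                Servicer ! binary_to_term(Bin, [safe]),
                inet:setopts(Socket, [{active, once}]),
                loop(Socket, Servicer);
            {reply, Term} ->
                %% results from the servicing processes go back out
                %% through the same translation
                ok = gen_tcp:send(Socket, term_to_binary(Term)),
                loop(Socket, Servicer);
            {tcp_closed, Socket} ->
                ok
        end.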
Your service processes should be receiving well-defined messages and disregarding whatever doesn't make sense (usually just logging the nonsense). So in this way you are not doing a direct RPC (which is unsafe in an untrusted environment); you are only responding to valid semantics defined in your (little tiny) messaging protocol. The way the socket handling processes are written is what can abstract this for you and make it feel as though you are dealing with a trusted environment within distributed Erlang, when actually you have two independent clusters that are limited in what they can request of each other by the definition of your protocol.

How to ensure the receiving application is up and running in client-server communication?

I am presently working on a client-server solution to transfer files to another machine via a socket network connection. Since I intend to do some evaluation on the receiving end as well, I am assuming that I will need some kind of client or server programme running there, too.
I am fairly new to the whole client-server thing and therefore have the following elementary question:
My present understanding is that client and server will be two independent programmes running on two different machines. How would one typically ensure that the communication partner (i.e., the server when sending from a client, and the client when sending from a server) is actually up and running on the remote machine that I want to transfer a file to?
So far, I have been looking into the following options:
1. In the sending programme, include SSH access to the remote machine and start an instance of the receiving programme on the remote machine.
2. Have the receiving programme run as a daemon process on the remote machine. This would mean that the receiving programme should always be running there. However, how would I know whether the process has crashed or has been shut down for some reason, and how would one recover from that without option 1 above?
So, my main question is: Are there any additional options that might be worth considering?
Thanks for your view on this!
Depending on how your client-server messages are set up, a ping message (I don't mean ICMP ping, but the basic idea), where the server can respond with "I am alive", would help. This way you at least know the server end is running.
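If your stack is Erlang like the rest of this page, such a liveness check could look something like this sketch (the protocol, port handling, and timeouts are all invented for the example):

    %% Returns true if the server answers our application-level ping.
    is_alive(Host, Port) ->
        case gen_tcp:connect(Host, Port, [binary, {active, false}], 2000) of
            {ok, Socket} ->
                ok = gen_tcp:send(Socket, <<"PING">>),
                Result = case gen_tcp:recv(Socket, 0, 2000) of
                             {ok, <<"PONG">>} -> true;
                             _Other -> false
                         end,
                gen_tcp:close(Socket),
                Result;
            {error, _Reason} ->
                false
        end.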
It is not uncommon in production environments to put monitoring systems in place for exactly this. Other options worth considering: xinetd-style services, which are started on incoming connections.
There are probably newer ways to achieve the automatic start/restart, or start-on-connection, with systemd/systemctl, but I am not familiar enough with them to give you the specifics.
A somewhat crude, but effective, means may be a cron job that periodically runs a script to keep the service up.

WebSockets in Relation with TCP/IP Sockets on Misultin Erlang HTTP Library

I must say that I am impressed by Misultin's support for WebSockets (some examples here). My JavaScript is firing requests and getting responses down the wire with "negligible" delay or lag. Great!!
The data handler loop for WebSockets resembles that of normal TCP/IP sockets, at least in the basic Erlang way:
% callback on received websockets data
handle_websocket(Ws) ->
    receive
        {browser, Data} ->
            Ws:send(["received '", Data, "'"]),
            handle_websocket(Ws);
        _Ignore ->
            handle_websocket(Ws)
    after 5000 ->
        Ws:send("pushing!"),
        handle_websocket(Ws)
    end.
This piece of code is executed in a process spawned by Misultin, from a function you hand to it when starting your server, like this:
start(Port) ->
    HTTPHandler = fun(Req) -> handle_http(Req, Port) end,
    WebSocketHandler = fun(Ws) -> handle_websocket(Ws) end,
    Options = [{port, Port}, {loop, HTTPHandler}, {ws_loop, WebSocketHandler}],
    misultin:start_link(Options).
For more code, check out the example page. I have several questions.
Question 1: Can I change the controlling process of a WebSocket, as we normally do with TCP/IP sockets in Erlang? (We normally use gen_tcp:controlling_process(Socket, NewProcessId).)
Question 2: Is Misultin the only Erlang/OTP HTTP library which supports WebSockets? Where are the rest?
EDIT:
Now, the reason why I need to be able to transfer WebSocket control away from Misultin:
Think of a gen_server that will control a pool of WebSockets, say it's a game server. In the current Misultin example, there is a controlling process for every WebSocket connection; in other words, for every WebSocket there will be a spawned process. Now, I know Erlang is a hero with processes, but I do not want this: I want these initial processes to die as soon as they hand control of the WebSocket over to my gen_server.
I would want this gen_server to switch data amongst these WebSockets. In the current implementation, I need to keep track of the Pid of the Misultin handle_websocket process like this:
%% Here is misultin's control process.
%% I get its Pid, save it somewhere,
%% and link it to my_gen_server so that
%% if it exits I know it's gone.
handle_websocket(Ws) ->
    process_flag(trap_exit, true),
    Pid = self(),
    %% link/1 takes a pid, so resolve the registered name first
    link(whereis(my_gen_server)),
    save_connection(Pid),
    wait_msgs(Ws).

wait_msgs(Ws) ->
    receive
        {browser, Data} ->
            FromPid = self(),
            send_to_gen_server(Data, FromPid),
            %% recurse on wait_msgs/1, not handle_websocket/1,
            %% so we don't re-link and re-save every time
            wait_msgs(Ws);
        {broadcast, Message} ->
            %% I can broadcast to all connected WebSockets
            Ws:send(Message),
            wait_msgs(Ws);
        _Ignore ->
            wait_msgs(Ws)
    end.
Above, the idea works very well: I save all controlling processes into a Mnesia RAM table and look them up against a given criterion whenever the application wants to send a particular user a message. However, with what I want to achieve, I realise that in the real world the processes may be so many that my server may crash. I want at least one gen_server to control thousands of WebSockets, rather than having a process for each WebSocket; that way I could somehow conserve memory.
Suggestion: Misultin's author could create a WebSocket groups implementation for us in his next release, whereby we can have a group of WebSockets controlled by the same process. This would be similar to Nitrogen's comet groups, in which comet connections are grouped together under the same control. If this isn't possible, we will need the control ourselves: provide an API where we can take over control of these WebSockets.
What do you engineers think about this? What are your suggestions and/or comments? Misultin's author could say something about this. Thanks to all.
(one) Cowboy developer here.
I wouldn't recommend using any type of central server to be responsible for controlling a set of websocket connections. The main reason is that this is a premature optimization: you are only speculating about the memory usage.
A test done earlier last year with half a million websocket connections on a single server resulted in misultin using 20GB of memory and cowboy using 16.2GB or 14.3GB, depending on whether the websocket processes were hibernating or not. You can assume that all Erlang implementations of websockets are very close to these numbers.
The difference between cowboy not using hibernate and misultin should be pretty close to the memory overhead of using an extra process per connection. (Feel free to correct me on this, ostinelli.)
I am willing to bet that it is much cheaper to take this into account when buying the servers than it is to design, and resolve issues in, an application where you don't have a 1:1 mapping between tasks/resources and processes.
https://twitter.com/#!/nivertech/status/114460039674212352
Misultin's author here.
I strongly discourage you from changing the controlling process, because that will break all of Misultin's internals. Just as Steve suggested, YAWS and Cowboy also support WebSockets, and there are implementations done over Mochiweb, but I'm not aware of any being actively maintained.
You are discussing memory concerns, but I think you are mixing concepts. I cannot understand why you need to control everything 'centrally' from a gen_server: your assumption that 'many processes will crash your VM' is actually wrong. Erlang is built upon the actor model, and this has many advantages:
- performance, due to multicore usage, which you lose if everything funnels through a single gen_server
- the ability to use the 'let it crash' philosophy: currently it looks like your gen_server crashing would bring down all available games
- ...
Erlang is able to handle hundreds of thousands of processes on a single VM; you'll run out of available file descriptors for your open sockets way before that happens.
So, I'd suggest you consider keeping your game logic within the individual WebSocket processes and using message passing to make them interact. You may consider spawning 'game processes' which hold the information on a single game's participants and status, for instance, and possibly a gen_server that keeps track of the available games - and does only that (perhaps by owning an ETS table). That's the way I'd probably go, all with the appropriate supervision structure.
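A rough sketch of that shape (every name below is invented for illustration): each WebSocket process stays thin and merely relays between the browser and its game process:

    handle_websocket(Ws) ->
        %% hypothetical call into a registry of game processes
        GamePid = game_registry:join(self()),
        ws_loop(Ws, GamePid).

    ws_loop(Ws, GamePid) ->
        receive
            {browser, Data} ->
                %% player input goes to the game process...
                GamePid ! {player_input, self(), Data},
                ws_loop(Ws, GamePid);
            {game_event, Message} ->
                %% ...and game events come back down to the browser.
                Ws:send(Message),
                ws_loop(Ws, GamePid)
        end.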
Obviously, I'm not sure what you are trying to achieve, so I'm just assuming here. But if your concern is memory - well, as TRIAL AND ERROR EXP said here below: don't optimize prematurely, especially when you are considering using Erlang in a way that looks like it might actually keep it from doing what it is capable of.
My $0.02.
Not sure about question 1, but regarding question 2, Yaws and Cowboy also support WebSockets.

Erlang Port Data Transfer Length

I am trying to evaluate PHP code from Erlang using Erlang ports. The problem is that when the data to be evaluated is bigger, I get a parse error from PHP, but when the data is smaller I get the correct output. I think that when the data length is bigger, Erlang truncates the data before it is sent to PHP for evaluation. Is there any limit on the data length that can be sent or received on an Erlang port, or is this error due to some other reason?
I am using open_port(PortName, PortSettings) to open a new port, and in PortSettings I am passing [{packet, 4}, exit_status] as my port options.
The {packet, 4} tuple says the program launched to handle the other end of the port expects data in a 4-byte length-prefixed form. I don't see anything in the docs for the php(1) program that says it knows how to deal with such data. Probably the only reason it works for short inputs is that the length prefix looks kinda like ASCII if you squint, as long as the data you're sending is under 127 bytes. As soon as you go over that, PHP is probably running into a UTF-8 decoding error.
I'm pretty sure you want to say stream here instead. This gets you standard Unix-like pipe interaction: data sent down the port goes to stdin on the launched process, and anything it sends to stdout comes back to your Erlang process.
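To see the framing difference for yourself, here is a small experiment using cat(1) as a stand-in for php(1), since cat echoes back exactly the bytes it receives on stdin:

    %% In stream mode no length header is prepended, so the external
    %% program sees exactly the bytes we wrote.
    Port = open_port({spawn, "cat"}, [stream, binary, exit_status]),
    true = port_command(Port, <<"echo 1;">>),
    receive
        {Port, {data, Echoed}} ->
            %% Echoed =:= <<"echo 1;">> -- no <<0,0,0,7>> prefix.
            io:format("cat echoed: ~p~n", [Echoed])
    end.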
The only problem with doing it this way is that it re-launches php(1) on each transaction. That may seem expensive, but it's not too bad on any Unix-type system, due to the relative efficiency of the fork(2) system call. If you're on Windows, or you've benchmarked this and found that you really do need to build a FastCGI-like system, you may be out of luck: there seems to be no libphp to embed PHP into a program you write to deal with packetized input, and no way to run php(1) so that it stays active on the other end of a port. You might be better off switching to a native Erlang templating system.
Also, note that the exit_status atom passed to open_port() does nothing unless you use spawn.
