Server-to-Client: RMI and 2 JVMs (Distinct Computers) - communication

I've been using RMI for a while now, and I would like to know:
Is there a way for an RMI server to notify an RMI client that it is available (online), passing the IP/port that should be used in Naming.list() in the client's code?

Not within RMI. You could look into UDP multicast, or the Jini Discovery and Lookup Service, which is just a massive layer over the top (in all senses) of multicast.
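As a minimal sketch of the multicast idea (in Python for brevity; the group address, port, and message format below are hypothetical choices, not anything RMI prescribes): the server periodically announces its registry host/port to a multicast group, and the client joins the group and feeds whatever it hears into Naming.list()/Naming.lookup().

import socket
import time

GROUP, PORT = "239.1.2.3", 4446  # hypothetical multicast group and port

def announce(registry_host, registry_port):
    # server side: periodically advertise where the RMI registry lives
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    msg = ("RMI %s %d" % (registry_host, registry_port)).encode()
    while True:
        s.sendto(msg, (GROUP, PORT))
        time.sleep(5)

def discover():
    # client side: join the group and wait for one announcement
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", PORT))
    mreq = socket.inet_aton(GROUP) + socket.inet_aton("0.0.0.0")
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    data, _ = s.recvfrom(1024)
    _, host, port = data.decode().split()
    return host, int(port)  # use these to build the Naming.list() URL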

Related

What is the most efficient way to stream data between Docker containers

I have a large number of bytes per second coming from a sensor device (e.g., video) that are being read and processed by a process in a Docker container.
I have a second Docker container that would like to read the processed byte stream (still a large number of bytes per second).
What is an efficient way to read this stream? Ideally I'd like to have the first container write to some sort of shared memory buffer that the second container can read from, but I don't think separate Docker containers can share memory. Perhaps there is some solution with a shared file pointer, with the file saved to an in-memory file system?
My goal is to maximize performance and minimize useless copies of data from one buffer to another as much as possible.
Edit: I would love to have solutions for both Linux and Windows. Similarly, I'm interested in solutions for doing this in C++ as well as Python.
Create a fifo with mkfifo /tmp/myfifo. Share it with both containers: --volume /tmp/myfifo:/tmp/myfifo:rw
You can use it directly:
From container 1: echo foo >>/tmp/myfifo
In Container 2: read var </tmp/myfifo
Drawback: Container 1 is blocked until Container 2 reads the data and empties the buffer.
To avoid the blocking: in both containers, run exec 3<>/tmp/myfifo in bash.
From container 1: echo foo >&3
In Container 2: read var <&3 (or e.g. cat <&3)
This solution uses bash's exec file descriptor handling. I don't know exactly how, but it is certainly possible in other languages too.
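For what it's worth, the same trick translates directly to Python (one of the languages asked about): opening the FIFO read+write mirrors bash's exec 3<>/tmp/myfifo, so the open() call never blocks waiting for a peer. A minimal sketch:

import os

FIFO = "/tmp/myfifo"  # volume-mounted into both containers, as above
if not os.path.exists(FIFO):
    os.mkfifo(FIFO)

# O_RDWR on a FIFO is the Python equivalent of bash's `exec 3<>`: on
# Linux it never blocks at open() time, even if no peer is present yet.
fd = os.open(FIFO, os.O_RDWR)

os.write(fd, b"frame bytes ...")   # producer (container 1)
data = os.read(fd, 65536)          # consumer (container 2); blocks until data arrives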
Using a simple TCP socket would be my first choice. Only if measurements showed that we absolutely needed to squeeze the last bit of performance out of the system would I fall back to pipes or shared memory.
Going by the problem statement, the process seems to be bound by local CPU/memory resources rather than by external services. In that case, having both producer and consumer on the same machine (as Docker containers) may well hit the CPU limit before anything else - but I would measure first before acting.
Most of the effort in developing code is spent maintaining it, so I favor mainstream practices. The TCP stack has rock-solid foundations and is as optimized for performance as humanly possible. It is also far more (completely?) portable across platforms and frameworks. Docker containers on the same host do not hit the wire when communicating over TCP. If some day the processes do hit a resource limit, you can scale horizontally by splitting the producer and consumer across physical hosts - manually or, say, using Kubernetes - and TCP will work seamlessly in that case. If you are never going to need that level of throughput, you also won't need system-level sophistication in inter-process communication.
Go with TCP.
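A minimal sketch of that TCP approach in Python (the port number and the "consumer" hostname are hypothetical; on a shared Docker network you would dial the other container's service name):

import socket

PORT = 5000  # hypothetical agreed-upon port

# Consumer (container 2): accept the producer and drain the stream.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", PORT))
srv.listen(1)
conn, _ = srv.accept()
while True:
    chunk = conn.recv(1 << 16)
    if not chunk:           # producer closed the connection
        break
    # ... process chunk ...

# Producer (container 1) would simply do:
#   out = socket.create_connection(("consumer", PORT))
#   out.sendall(processed_bytes)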

Docker Swarm - Route a request to ALL containers

Is there any sort of way to broadcast an incoming request to all containers in a swarm?
EDIT: More info
I have a distributed application with many Docker containers. The client can send requests to the swarm and have it respond. However, in some cases the client needs to change a state on all server instances, so I would either need to be able to broadcast a message or have all the Docker containers talk to each other, similar to MPI, which I'm trying to avoid.
There is no built-in way to turn a unicast packet into a multicast packet, nor any common third-party way of doing so (that I've seen or heard of).
I'm not sure what "change a state on all server instances" means. Are we talking about the running state on all containers in a single service?
Or the actual underlying OS? All containers on all services? etc.
Without knowing more about your use case, I'd say it's likely better to design something where the request is received by one Swarm service, and then it's stored in a queue system where a backend worker would pick it up and "change the state on all server instances" for you.
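As one possible shape for that, here is a hedged sketch using Redis pub/sub via the third-party redis-py package (the "redis" service name, channel name, and handler are all hypothetical): every replica subscribes at startup, the request handler publishes once, and Redis fans the message out to all subscribers.

import redis  # third-party redis-py package (assumed available)

r = redis.Redis(host="redis")  # hypothetical Redis service on the network

# Each container runs this subscriber loop at startup.
p = r.pubsub()
p.subscribe("state-changes")
for message in p.listen():
    if message["type"] == "message":
        handle_state_change(message["data"])  # hypothetical handler

# The service that receives the client request publishes once:
#   r.publish("state-changes", b"new-state")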
It depends on your specific use case. One way to do it is to run docker service update --force, which will cause all of the service's containers to restart. If your containers fetch the changed information at startup, this has the required effect.

How to keep track of child processes in Erlang?

I have a static list of "hosts" with their info, and a dynamic list of "host agents". Each host has one and only one agent for as long as it is connected to the server over a TCP connection. As the host may or may not be connected, its agent process may or may not be started. When a TCP packet arrives with the host ID, I need to find out whether the "agent" of this host is started or not.
The connection is responsible for receiving and sending data on the TCP socket, parsing the data to find out which host it is addressed to, and delivering it to that host's agent to handle.
The host keeps host information. The host agent handles incoming data, saves host information to the host, and decides what to send in what format (e.g. an ack to the client with host ID and response code).
The data packet specifies a source host and a target host, meaning it is sent by the source host and should be received by the target host. The target host could be connected on another connection; that's why I need a global map across all connections, for the convenience of getting the target host agent's PID.
I have a supervision tree in which host_supervisor monitors all the hosts, connection_supervisor monitors each connection, and host_agent_supervisor monitors the agents. host_supervisor and connection_supervisor are supervised by the application supervisor, which means they are first-level children in the supervision tree, but host_agent_supervisor is under connection_supervisor.
Questions:
1. Is it a good idea to store a map of host_id/host_agent_pid pairs in a DB?
2. If 1 is true, how do I update the host_agent_pid when something goes wrong and the agent is restarted?
3. Is there a better way to implement this case? It seems my solution does not follow "the Erlang way".
The simple, or quick, answers to your questions are:
1. It's fine, though besides a map you could also use gb_trees, dict or an ETS table (maps being the least mature of these, of course). That notwithstanding, a key/ID-to-PID lookup table is fine in principle. ETS might offer a performance benefit over the others because an ETS table can be accessed from other processes, eliminating the need for a single process to do all the reading and writing. That might or might not be important and/or appropriate.
2. One simple way is to have every "host agent", when it starts, spawn another process that does nothing but link to the "host agent" and remove the host-ID-to-agent-PID mapping from whatever store you have when the "host agent" dies. Another way is to have the mapping store process itself link to your host agent PIDs, which might leave you less concerned about possible race conditions.
3. Possibly. When I read your question I was left with certain questions and a general feeling that the solution I would choose wouldn't lead me to the precise lookup issue you are asking about (i.e. looking up the PID of a "host agent" upon receipt of a TCP packet), but I can't be sure this isn't because you've worked to minimise your question for Stack Overflow. It's a little unclear to me exactly what the roles, responsibilities and interactions of your "host", "host_agent" and "connection" processes really are, and whether they should all exist and/or have separate supervision trees.
So, looking at possible alternatives... When you say "when a TCP packet arrives" I assume you mean when a foreign host connects to a listening socket or sends some data on an existing, already accepted socket, and that the host ID is either the hostname (and/or port) or some other arbitrary ID that the foreign host sends to you after connecting.
Either way... Generally in this sort of scenario, I'd expect a new process (the "host agent", by the sounds of it, in your case) to be spawned to handle the newly established TCP connection (via a dynamic (e.g. simple_one_for_one) supervisor), taking ownership of the socket that is the server-side endpoint of that connection, reading and writing the socket as appropriate, and terminating when the connection is closed.
With that model, your "host agent" is always started if there is a connection and never started if there is not, and any incoming TCP packet will end up automatically in the hands of the correct agent, because it will be delivered to the socket that the agent is handling; if it's a new connection, the agent will be started.
The need to lookup the PID of an agent upon receipt of a TCP packet now never arises.
If you need to look up the PID of an agent for other reasons, though - because, say, your server sometimes needs to proactively send data to a possibly connected "host" - then you either have to get a list of all the supervised "host agents" and pick out the right one (for this you would use supervisor:which_children/1, as per Hamidreza's answer), OR maintain a map of host IDs to PIDs, using map, gb_trees, dict, ets, etc. Which is correct depends on how many "hosts" you could have - if it's more than a handful, then you should probably maintain a map of some sort so that the lookup time doesn't become too big.
Final comment: you might look at gproc if you haven't already, in case it is of use for your case. It does this sort of thing.
Edit/addition (following question edit):
Your connection process sounds redundant to me; as suggested above, if you give the socket to the host agent, then most of the responsibility of the connection is gone. There's no reason the host agent can't parse the data it receives; as far as I can see there's no value in having another process parse it, just to then pass it to another process. The parsing itself is probably a deterministic function, so it is sensible to have a separate module for it, but I see no point in a separate process.
I don't see the point of your 'host' process; you say "the host keeps host information", which makes it sound like it's just a process that holds a hostname or host ID, something like that?
You also say the packet "specifies a source host and a target host, meaning it is sent by the source host and should be received by the target host", which is beginning to make this sound a bit like a chat server, or at least some sort of hub-and-spoke / star-network-style communication protocol. I can't see why you wouldn't be able to do everything you want by creating a supervision tree like this:
            top_sup
               |
   .------------------------------.
   |           |                  |
map_server  svc_listener     hosts_sup (simple_one_for_one)
                                  |
              .------------------------------.
              |     |     |     |     |     |
            host  host  host  host  host  host
Here 'map_server' just maintains a map of host IDs to host PIDs; 'svc_listener' holds the listening socket and just accepts connections, asking hosts_sup to spawn a new host when a new client connects; and the host processes (under hosts_sup) take responsibility for the accepted socket and register their host ID and PID with map_server when they start.
If map_server links to the host PIDs it can automatically clean up when a host dies, and it can provide a suitable API for any process to look up a host PID by host ID.
In order to get a list of the child processes of a supervisor, you can use the supervisor:which_children/1 API. It takes a reference to your supervisor, which can be its registered name or PID, and returns a list of its children:
supervisor:which_children(SupRef) -> [{Id, Child, Type, Modules}]

How to listen to a broadcast on -any- port within a Linux distributed system

I am working on a distributed application in which a set of logical nodes communicate with each other.
In the initial discovery phase, each logical node starts up and sends out a UDP broadcast packet to the network to inform the rest of the nodes of its existence.
With different physical hosts, this can easily be handled by agreeing on a port number and keeping track of UDP broadcasts received from other hosts.
My problem is that I need to be able to handle the case of multiple logical nodes on the same machine as well.
In this case, it seems I cannot bind to the same port twice. How do I handle the node discovery case if there are two logical nodes on the same box? Thanks a lot in advance!
Your choices are:
Create a raw socket and listen to all packets on a particular NIC. This way, by looking at the content of each packet, the process can identify whether the packet is destined for itself. The problem with this is the sheer number of packets you would have to process; this is why our operating systems' kernels bind sockets to processes, so that traffic gets distributed optimally.
Create a specialized service, i.e. a daemon that handles announcements of new processes that are available to execute the work. When launched, each process has to announce its port number to the service. This is usually how it is done (see the sketch after this list).
Use virtual IP addresses for each process you want to run, with each process binding to a different IP address. If you are running on a local network, this is the simplest way.
Define a range of ports and scan this range on all the IP addresses you have defined.
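A minimal sketch of option 2 (all names and the wire format here are hypothetical): a daemon on an agreed localhost TCP port keeps an id-to-port table; each logical node binds whatever UDP port it likes and registers it, and peers query the daemon instead of guessing ports.

import socket
import threading

REGISTRY_PORT = 9999  # agreed-upon daemon port (hypothetical)

def run_daemon():
    nodes, lock = {}, threading.Lock()
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", REGISTRY_PORT))
    srv.listen(5)
    while True:
        conn, _ = srv.accept()
        line = conn.makefile().readline().split()
        if line and line[0] == "REGISTER":   # "REGISTER <node-id> <port>"
            with lock:
                nodes[line[1]] = int(line[2])
            conn.sendall(b"OK\n")
        elif line and line[0] == "LIST":     # dump all known nodes
            with lock:
                out = "".join("%s %d\n" % kv for kv in nodes.items())
            conn.sendall(out.encode())
        conn.close()

def register(node_id, udp_port):
    # called by each logical node after binding its own UDP socket
    with socket.create_connection(("127.0.0.1", REGISTRY_PORT)) as s:
        s.sendall(("REGISTER %s %d\n" % (node_id, udp_port)).encode())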

Building a Network Appliance Prototype Using a standard PC with Linux and Two NIC's

I want to build a prototype of a network appliance.
This appliance is supposed to transparently manipulate Ethernet packets. It is supposed to have two network interface cards, with one card connected to the outside leg (i.e. eth0) and the other to the inside leg (i.e. eth1).
In a typical network layout as in the attached image, it will be placed between the router and the LAN's switch.
My plan is to write software that hooks in at the kernel driver level and does whatever I need to do to incoming and outgoing packets.
For instance, an "outgoing" packet (at eth1) would be manipulated and passed over to the other NIC (eth0), which would then transport it over to the next hop.
My questions are:
Is this doable?
Those NICs will have no IP address; should that be a problem?
Thanks in advance for your answers.
(And no, there is no such device on the market yet, so please, "why reinvent the wheel"-style answers are irrelevant.)
typical network diagram http://img163.imageshack.us/img163/1249/stackpost.png
I'd suggest libipq, which seems to do just what you want:
Netfilter provides a mechanism for passing packets out of the stack for queueing to userspace, then receiving these packets back into the kernel with a verdict specifying what to do with the packets (such as ACCEPT or DROP). These packets may also be modified in userspace prior to reinjection back into the kernel.
Apparently, it can be done.
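For what it's worth, libipq itself is a C API and has since been superseded by nfnetlink_queue. A rough Python sketch of the same queue-to-userspace idea, using the third-party netfilterqueue package (assumed installed, with an iptables rule such as iptables -I FORWARD -j NFQUEUE --queue-num 1 in place), could look like this:

from netfilterqueue import NetfilterQueue  # third-party package (assumed)

def inspect(pkt):
    payload = pkt.get_payload()   # raw IP packet bytes
    # ... examine or modify payload here, then optionally:
    # pkt.set_payload(modified_bytes)
    pkt.accept()                  # verdict: reinject into the kernel

nfq = NetfilterQueue()
nfq.bind(1, inspect)              # queue number must match the iptables rule
try:
    nfq.run()
finally:
    nfq.unbind()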
I am actually trying to build a prototype of it using scapy
As long as the NICs are set to promiscuous mode, they catch packets on the network without needing an IP address set on them. I know it can be done, as there are a lot of companies that produce this type of equipment (e.g. Juniper Networks, Cisco, F5, Fortinet, etc.).
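A rough sketch of the scapy prototype mentioned above (scapy assumed installed; interface names as in the question, with the NICs in promiscuous mode). Every frame crosses userspace here, so this is for prototyping only, not line-rate use:

from scapy.all import sniff, sendp  # third-party scapy package (assumed)

def manipulate(frame):
    # ... inspect/alter the Ethernet frame here (placeholder) ...
    return frame

# Bridge eth1 -> eth0: sniff frames on the inside leg and push them out
# the outside leg. A second, symmetric sniffer handles eth0 -> eth1.
sniff(iface="eth1", store=False,
      prn=lambda f: sendp(manipulate(f), iface="eth0", verbose=False))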
