When one system is connected to many other systems in a network, what are the real-time issues that occurs? - graph-algorithm

I just want to know the issues or threats which an individual can have when his/ her machine is in network. And is it possible for BFS approach to find which machine is responsible for the threat?

Related

add nodes to dockers swarm from different servers

I am new to docker swarm, I read documentation and googled to the topic, but results was vague,
Is it possible to add worker or manager node from distinct and separate Virtual private servers?
Idea is to connect many non-related hosts into a swarm which then creates distribution over many systems and resiliency in case of any HW failures. The only thing you need to watch out for is that the internet connection between the hosts is stable and that all of the needed ports based of the official documentation are open. And you are good to go :)
Oh and between managers you want a VERY stable internet connection without any random ping spikes, or you may encounter weird behaviour (because of consensus with raft and decision making).
other than that it is good
Refer to Administer and maintain a swarm of Docker Engines
In production the best practice to maximise swarm HA is to spread your swarm managers across multiple availability zones. Availability Zones are geo-graphically co-located but distinct sites. i.e. instead of having a single London data centre, have 3 - each connected to a different internet and power utility. That way, if any single ISP or Power utility has an outage, you still have 2 data centres connected to the internet.
Swarm was designed with this kind of Highly available topology in mind and can scale to having its managers - and workers - distributed across nodes in different data centres.
However, Swarm is sensitive to latency over longer distances - so global distribution is not a good idea. In a single city, Data center to Data centre latencies will be in the low 10s of ms. Which is fine.
Connecting data centres in different cities / continents moves the latency to the low, to mid 100s of ms which does cause problems and leads to instability.
Otherwise, go ahead. Build your swarm across AZ distributed nodes.

Guidance on when to chose virtual machines or physical machines over containers

There are many articles and videos comparing containers, virtual machines, physical machines. However almost all information is theoretical: containers are fast, VMs are secure, etc. But I could not find description of specific use cases or guidance on when to choose virtual machines, physical machines, but not containers. So, currently I cannot imagine situation when somebody gives recommendation to not use containers.
Question:
Could you please list specific applications or solutions when you would recommend using VMs, but not container?
Could you please list specific applications or solutions when you would recommend using OS over bare metal, but not containers or VMs?
Here is example of answer I would appreciate to get (note, that I am not sure if this information is correct):
Use case 1: Edge Router
Edge router is a router which connects organizational network to the Internet. Also, in this case it is assumed, that vendor of the router provides it not as device but as a software package (virtualized router).
Edge router most probably will be one of target of hacker's attacks. Thus security requirements come to the first place.
Containers are not recommended in this case. By default containers provide mediocre level of security. Strong security can be achieved with complex configuration (what configuration?) but this is more difficult than in case of VM or bare metal. In addition, high security level may require special hardened Linux kernel, however containers technology does not allow adjusting kernel configuration.
Virtual Machines would be a good choice if vendor of the router provides software as VM image or when organization has many edge routers (for example, many offices with internet access points), and has (or is ready to create) well-established process of preparation of VM images. In this case using VMs will simplify rollout, update and healing the virtualized edge router. VM also provides high security level; nevertheless is it still recommended to place such a VM in a separate server and to not share same server with other applications/VMs to avoid cross-VM attacks.
Physical machine would be a good choice if router vendor provides router's software as an application package (not as a VM) such as .rpm, and rollout, update and healing processes are not expected to take much efforts; this might be the case when when company has few routers (so updates can be performed manually or automated with tools like Ansible), and couple of hour of planned and unplanned downtime is acceptable.
Use case 2: ...
Thank you in advance.
The question is a bit vague so I'll try my best:
you'd usually allocate work to containers when you have a few separate applications with limited physical resources and you'd like to run them each with their own different environment (different runtime version, architecture and dependencies) which managing on a machine (physical or virtual) would be cumbersome.
you'd use a VM when you want specifically a feature that containers couldn't satisfy or it would just be a headache to set them up with it and a simple quick and easy VM could solve (and again you have limited resources you'd like to share between use cases)
and finally, a physical machine when performance is of the essence like I/O requests and latency around that.
you can also mix and match to match each tier needs:
we need to run many applications that VM would be too much of an overhead for them and containers would make their handling more automated and streamline so containers with k8s, but on the other hand, we want local storage offered to those containers to be very fast so we run the k8s cluster on physical machines.
if recoverability would be of the essence we would have used VM due to the options of snapshotting VM states over time.
It's all a big LEGO set you can mix and match depending on your use case and needs

How to discover the high-performance network interface on a linux HPC cluster?

I have a distributed program which communicates with ZeroMQ that runs on HPC clusters.
ZeroMQ uses TCP sockets, so by default on HPC clusters the communications will use the admin network, so I have introduced an environment variable read by my code to force communication on a particular network interface.
With Infiniband (IB), usually it is ib0. But there are cases where another IB interface is used for the parallel file system, or on Cray systems the interface is ipogif, on some non-HPC systems it can be eth1, eno1, p4p2, em2, enp96s0f0, or whatever...
The problem is that I need to ask the administrator of the cluster the name of the network interface to use, while codes using MPI don't need to because MPI "knows" which network to use.
What is the most portable way to discover the name of the high-performance network interface on a linux HPC cluster? (I don't mind writing a small MPI program for this if there is no simple way)
There is no simple way and I doubt a complete solution exists. For example, Open MPI comes with an extensive set of ranked network communication modules and tries to instantiate all of them, selecting in the end the one that has the highest rank. The idea is that ranks somehow reflect the speed of the underlying network and that if a given network type is not present, its module will fail to instantiate, so faced with a system that has both Ethernet and InfiniBand, it will pick InfiniBand as its module has higher precedence. This is why larger Open MPI jobs start relatively slowly and is definitely not fool proof - in some cases one has to intervene and manually select the right modules, especially if the node has several network interfaces of InfiniBand HCAs and not all of them provide node-to-node connectivity. This is usually configured system-wide by the system administrator or the vendor and is why MPI "just works" (pro tip: in not-so-small number of cases it actually doesn't).
You may copy the approach taken by Open MPI and develop a set of detection modules for your program. For TCP, spawn two or more copies on different nodes, list their active network interfaces and the corresponding IP addresses, match the network addresses and bind on all interfaces on one node, then try to connect to it from the other node(s). Upon successful connection, run something like the TCP version of NetPIPE to measure the network speed and latency and pick the fastest network. Once you've gotten this information from the initial small set of nodes, it is very likely that the same interface is used on all other nodes too, since most HPC systems are as homogeneous as possible when it comes to their nodes' network configuration.
If there is a working MPI implementation installed, you can use it to launch the test program. You may also enable debug logging in the MPI library and parse the output, but this will require that the target system has an MPI implementation supported by your log parser. Also, most MPI libraries use native InfiniBand or whatever high-speed network API there is and will not tell you which is the IP-over-whatever interface, because they won't use it at all (unless configured otherwise by the system administrator).
Q : What is the most portable way to discover the name of the high-performance network interface on a linux HPC cluster?
This seems to be in a gray-zone - trying to solve a multi-faceted problem among site-specific hardware (technical) interface naming and theirs non-technical, weakly administratively maintained, preferred ways of use.
As-is State :
ZeroMQ can (as per RFC 37/ZMTP v3.0+) specify <hardware(interface)>:<port>/<service> details :
zmq_bind (server_socket, "tcp://eth0:6000/system/name-service/test");
And:
zmq_connect (client_socket, "tcp://192.168.55.212:6000/system/name-service/test");
yet has no means, to my knowledge, to reverse-engineer the primary use of such an interface, in the holistic context of the HPC-site and it's hardware configuration.
Seems to me, your idea of pre-testing the administrative mappings via MPI-tool first and letting ZeroMQ deployment use these externally detected (if indeed auto-detectable, as you assumed above) configuration details for a proper (preferred) interface usage.
The Safe Way to Go :
Asking the HPC-infrastructure Support Team ( who is responsible for knowing all of the above and trained to help Scientific Teams to use the HPC in the most productive manner ) would be my preferred way to go.
Disclaimer :
Sorry in case this did not help your will to read & auto-detect all the needed configuration details ( a universal BlackBox-HPC-ecosystem detection and auto-configuration strategy would hardly be a trivial one-liner, I guess, wouldn't it? )

How scalable is distributed Erlang?

Part A:
Erlang has a lot of success stories about running concurrent agents e.g. the millions of simultaneous Facebook chats. That's millions of agents, but of course it's not millions of CPUs across a network. I'm having trouble finding metrics on how well Erlang scales when scaling is "horizontal" across a LAN/WAN.
Let's assume that I have many (tens of thousands) physical nodes (running Erlang on Linux) that need to communicate and synchronize small infrequent amounts of data across the LAN/WAN. At what point will I have communications bottlenecks, not between agents, but between physical nodes? (Or will this just work, assuming a stable network?)
Part B:
I understand (as an Erlang newbie, meaning I could be totally wrong) that Erlang nodes attempt to all connect to and be aware of each other, resulting in an N^2 connection point-to-point network. Assuming that part A won't just work with N = 10K's, can Erlang be configured easily (using out-of-the-box config or trivial boilerplate, not writing a full implementation of grouping/routing algorithms myself) to cluster nodes into manageable groups and route system -wide messages through the cluster/group hierarchy?
We should specify that we talk about horizontal scalability of physical machines -- that's the only problem. CPUs on one machine will be handled by one VM, no matter what the number of those is.
node = machine.
To begin, I can say that 30-60 nodes you get out of the box (vanilla OTP installation) with any custom application written on the top of that (in Erlang). Proof: ejabberd.
~100-150 is possible with optimized custom application. I means, it has to be good code, written with knowledge about GC, characteristic of data types, message passing etc.
over +150 is all right but when we talk about numbers like 300, 500 it will require optimizations & customizations of TCP layer. Also, our app has to be aware of cost of e.g. sync calls across the cluster.
The other thing is DB layer. Mnesia (built-in) due its features will not be effective over 20 nodes (my experience - I may be wrong). Solution: just use something else: dynamo DBs, separate cluster of MySQLs, HBase etc.
The most common technique to leverage cost of creating high quality application and scalability are federations of ~20-50 nodes clusters. So internally its an efficient mesh of ~50 erlang nodes and its connected via any suitable protocol with N another 50 nodes clusters. To sum up, such a system is federation of N erlang clusters.
Distributed erlang is designed to run in one data center. If you need more, geographically distant nodes, then use federations.
There are lots of config options e.g. which do not connect all nodes to each other. It may be helpful, however in ~50 cluster erlang overhead is not significant. Also you can create a graph of erlang nodes using 'hidden' connection, which doesn't join this full mesh, but also it cannot benefit from connection to all nodes.
The biggest problem I see, in this kind of systems, is designing it as master-less system. If you do not need that, everything should be ok.

Deliver multicast to several different geo-locations

I need to use one logical PGM based multicast address in application while enable such application "seamlessly" running across several different geo-locations (i.e. think US/Europe/Australia).
Application is quite throughput (several million biz. messages a day) and latency demanding whith a lot of small but very frequently send messages. Classical Atom pub will not work here due some external limits of latencies.
I have come up with several options to connect those datacenters but can’t find the best one.
Options which I have considered are:
1) Forward multicast messages via VPN’s (can VPN handle such big load).
2) Translate all multicast messages to “wrapper messages” and forward them via AMQP.
3) Write specialized in-house gate which tunnels multicast messages via TCP to other two locations.
4) Any other solution
I would prefer option 1 as it does not need additional code writes from devs. but I’m afraid it will not be reliable connection.
Are there any rules to apply for such connectivity?
What the best network configuration with regard to the geographical configuration is for above constrains.
Just wanted to say hello :)
As for the topic, we have not much experience with multicasting over WAN, however, my feeling is that PGM + WAN + high volume of data would lead to retransmission storms. VPN won't make this problem disappear as all the Australian receivers would, when confronted with missing packets, send NACKS to Europe etc.
PGM specification does allow for tree structure of nodes for message delivery, so in theory you could place a single node on the receiving side that would in its turn re-multicast the data locally. However, I am not sure whether this kind of functionality is available with MS implementation of PGM. Optionally, you can place a Cisco router with PGM support on the receiving side that would handle this for you.
In any case, my preference would be to convert the data to TCP stream, pass it over the WAN and then convert it back to PGM on the other side. Some code has to be written, but no nasty surprises are to be expected.
Martin S.
at CohesiveFT we ran into a very similar problem when we designed our "VPN-Cubed" product for connecting multiple clouds up to servers behind our own firewall, in one VPN. We wanted to be able to run apps that talked to each other using multicast, but for example Amazon EC2 does not support multicast for reasons that should be fairly obvious if you consider the potential for network storms across a whole data center. We also wanted to route traffic across a wide area federation of nodes using the internet.
Without going into too much detail, the solution involved combining tunneling with standard routing protocols like BGP, and open technologies for VPNs. We used RabbitMQ AMQP to deliver messages in a pubsub style without needing physical multicast. This means you can fake multicast over wide area subnets, even across domains and firewalls, provided you are in the VPN-Cubed safe harbour. It works because it is a 'network overlay' as described in technical note here: http://blog.elasticserver.com/2008/12/vpn-cubed-technical-overview.html
I don't intend to actually offer you a specific solution, but I do hope this answer gives you confidence to try some of these approaches.
Cheers, alexis

Resources