I'm trying to connect to a given number of peers in a P2P network, and I'd like them to span the globe as well as possible (that is, be as far away from each other as possible). Since I connect to them one after another, I will gradually discover better peers and disconnect from peers that drop out of my top list. The problem is that I'm stuck defining a metric that tells me how "good" a peer is. I will be using a GeoIP database to map IPs to geographic coordinates, but I just can't find a good metric to apply.
In my opinion you should think not in terms of geography but in terms of network topology. For every new peer, run a traceroute and a ping so you know how good the connectivity is. Store the results and do a cost calculation, most simply by adding up the latency. Bandwidth and latency are more important than physical distance. If you are concerned with the local laws where a specific peer is located, you could start thinking in terms of geography.
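A minimal sketch of the "add up the latency" cost, assuming you have already collected the latency contributed by each hop (in ms) for every peer. The peer names and numbers are made up for illustration, and a real cost function would probably weigh in bandwidth as well.

```python
def path_cost(hop_latencies_ms):
    """Cost of a path as the plain sum of its per-hop latencies (ms)."""
    return sum(hop_latencies_ms)


def rank_peers(measurements):
    """Return peer names sorted from cheapest to most expensive path."""
    return sorted(measurements, key=lambda peer: path_cost(measurements[peer]))


# Hypothetical measurements: peer -> latency added by each hop (ms).
measurements = {
    "peer-a": [1.2, 8.5, 20.1],
    "peer-b": [0.9, 3.3],
    "peer-c": [1.1, 45.0, 60.2, 12.4],
}
ranking = rank_peers(measurements)
print(ranking)
```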
I wonder, are you able to get your users to set a location in their UI? That might be a better measure of geographical spread than ping times. That said, for a variety of complicated reasons - as I'm sure you know - two nodes 500 miles apart may have better speed and latency measurements than two nodes 250 miles apart. So it depends whether you care about location or performance :)
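If location is what you care about, one possible metric (an assumption on my part, not something the posts above prescribe) is to score a candidate by its great-circle distance to the nearest peer you are already connected to, and to prefer the candidate with the highest score. The coordinates below are rough illustrative values of the kind a GeoIP lookup might return.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def spread_score(candidate, connected):
    """Distance from the candidate to its nearest already-connected peer."""
    return min(haversine_km(candidate, p) for p in connected)


connected = [(51.5, -0.1), (40.7, -74.0)]   # roughly London and New York
candidates = {"paris": (48.9, 2.4), "sydney": (-33.9, 151.2)}
best = max(candidates, key=lambda name: spread_score(candidates[name], connected))
print(best)
```

Scoring by the nearest neighbour rather than the average distance keeps two peers in the same city from both looking attractive just because they are far from everything else.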
I am new to Docker swarm. I read the documentation and googled the topic, but the results were vague.
Is it possible to add worker or manager node from distinct and separate Virtual private servers?
The idea is to connect many unrelated hosts into a swarm, which then distributes work over many systems and gives you resiliency in case of hardware failures. The only things you need to watch out for are that the internet connection between the hosts is stable and that all of the ports listed in the official documentation are open. And you are good to go :)
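As a rough way to check the port requirement from another host, here is a sketch that probes the TCP ports named in the Docker swarm documentation. It only covers TCP; the UDP ports cannot be verified with a simple connect, and the helper names are made up for this example.

```python
import socket

# TCP ports listed in the Docker swarm documentation: 2377 (cluster
# management) and 7946 (node-to-node communication). 7946/udp and 4789/udp
# (overlay traffic) are also required but cannot be probed with a TCP connect.
SWARM_TCP_PORTS = (2377, 7946)


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_swarm_ports(host):
    """Map each swarm TCP port to whether it is reachable on the given host."""
    return {port: tcp_port_open(host, port) for port in SWARM_TCP_PORTS}
```

You would call something like `check_swarm_ports("203.0.113.10")` from each VPS against the others, with the address replaced by your own hosts. Port 2377 only answers on manager nodes.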
Oh, and between managers you want a VERY stable internet connection without any random ping spikes, or you may encounter weird behaviour (because of Raft consensus and decision making).
Other than that, it is good.
Refer to Administer and maintain a swarm of Docker Engines
In production, the best practice to maximise swarm HA is to spread your swarm managers across multiple availability zones. Availability zones are geographically co-located but distinct sites, i.e. instead of having a single London data centre, have 3, each connected to a different ISP and power utility. That way, if any single ISP or power utility has an outage, you still have 2 data centres connected to the internet.
Swarm was designed with this kind of Highly available topology in mind and can scale to having its managers - and workers - distributed across nodes in different data centres.
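The reason three sites work well follows from majority arithmetic: Raft needs more than half of the managers to agree. A small sketch of that calculation, assuming one manager per availability zone (the function names are mine):

```python
def raft_quorum(managers):
    """Smallest majority of a manager set of the given size."""
    return managers // 2 + 1


def tolerated_failures(managers):
    """How many managers can be lost while a majority still remains."""
    return managers - raft_quorum(managers)


for n in (1, 3, 5, 7):
    print(n, raft_quorum(n), tolerated_failures(n))
```

With 3 managers the quorum is 2, so losing one zone leaves the swarm manageable. A fourth manager raises the quorum to 3 without tolerating any additional failure, which is why odd manager counts are the usual choice.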
However, Swarm is sensitive to latency over longer distances, so global distribution is not a good idea. Within a single city, data-centre-to-data-centre latencies will be in the low 10s of ms, which is fine.
Connecting data centres in different cities or continents moves the latency to the low to mid 100s of ms, which does cause problems and leads to instability.
Otherwise, go ahead. Build your swarm across AZ distributed nodes.
I am trying out VOLTTRON for a project solution and want to know its capability in the long term. The project is to control/monitor ~100k devices, and possibly millions if things go well.
What is the largest scale at which VOLTTRON has been used in a real scenario? How many devices can one node accommodate, assuming the host machine has high specs?
What constraints will VOLTTRON run into later, after it has been in use for a while? (Constraints such as database, server resources, or network.)
I am not hoping for an exact value; I just want to find the capability range.
Thanks,
There are several drivers for how well VOLTTRON scales for a single VOLTTRON instance.
In no particular order:
Network and device communication speed. (Are your devices on a serial connection? BACnet devices behind an MSTP router?)
Frequency of data collection. (10 seconds? 1 minute? 5 minutes? 15 minutes?)
How close together (time-wise) data from different devices needs to be.
Frequency and number of commands issued.
Machine specs
Often we see the bottleneck being the network for device communication. This will drive the rate at which you can communicate with devices. For collection a mid level PC is overkill in most situations.
In the field, our users have been able to scrape 1.5K+ BACnet devices in less than 15 minutes with a single node. Many of these devices were on an MSTP trunk, which would be the major limiting factor. If these were TCP BACnet devices, the rate of data acquisition would be much higher.
There are parameters to tune the rate of data collection for a specific node. It is common to tweak these values to find the optimal rate of collection after initial platform configuration.
The kind of scaling you are looking for will require using multiple VOLTTRON instances. It is common to have multiple collection boxes for an installation. Usually these instances will gather data for some number of devices (based on your scenario) and either send those values directly to a database or forward them to another central instance of the platform that will submit the data on the remote nodes' behalf. Numbers for some real deployments can be found here: https://volttron.org/sites/default/files/publications/VOLTTRON%20Scalability-update-final.pdf
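For a very rough sense of the range, the field figure quoted above can be turned into a back-of-envelope estimate. This assumes your devices behave like the ones in that deployment and that a 15-minute collection interval is acceptable; both are assumptions, and the real number depends on the drivers listed earlier.

```python
import math


def instances_needed(total_devices, devices_per_instance):
    """Number of collection instances needed, rounded up."""
    return math.ceil(total_devices / devices_per_instance)


# Figure quoted above: about 1,500 BACnet devices per node per 15-minute scrape.
DEVICES_PER_INSTANCE = 1500
SCRAPE_SECONDS = 15 * 60

print(DEVICES_PER_INSTANCE / SCRAPE_SECONDS)            # devices per second, per node
print(instances_needed(100_000, DEVICES_PER_INSTANCE))  # nodes for ~100k devices
```

Under those assumptions, ~100k devices would call for several dozen collection instances feeding a central database or a central platform instance.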
There are several database options from MySQL to Mongo to SQLite. You will want to pick a central database based on your data collection needs (so not SQLite).
More specifically, at the core of erl, what algorithm is used to understand presence and availability of other nodes? How does it handle network partitioning? Are all the nodes just constantly pinging each other?
For example, if there are two nodes, and the network cable is pulled, how does it decide what to do? Presumably one node should go idle as it's orphaned, while the other carries on, otherwise you get a split-brain behavior..
In reading up on paxos and raft, it seems like it must be doing leader election internally, but I can't seem to find any comprehensible explanation -- I left my PhD in my other pants.. Can anyone explain this voodoo in english?
I had the same question, and it was answered with this very good article, where you can find a detailed explanation. The main idea is: Erlang nodes in distributed mode are connected in a mesh network. They monitor each other through pings sent at an interval determined by the net_ticktime setting. Pings are used to detect network splits or halted nodes that are unable to communicate. Other failures, such as VM crashes or unplugged cables, are detected immediately (within a few ms) through the underlying network connection.
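To put numbers on the tick-based detection: the Erlang kernel documentation describes the parameter `net_ticktime` (default 60 seconds) and states that an unresponsive node is detected after somewhere between `net_ticktime` minus a quarter and `net_ticktime` plus a quarter. A small sketch of that arithmetic, written in Python purely for illustration:

```python
def detection_window(net_ticktime=60):
    """Return (min, max) seconds before an unresponsive node is flagged.

    Follows the rule in the Erlang kernel documentation: the window is
    net_ticktime minus or plus a quarter of net_ticktime.
    """
    quarter = net_ticktime / 4
    return net_ticktime - quarter, net_ticktime + quarter


print(detection_window())
```

With the default setting, a silently partitioned node is therefore noticed after roughly 45 to 75 seconds. What each side does afterwards is left to the application, since distributed Erlang itself does not elect a leader.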
I'm working on an engineering project where I want a go-kart to maintain a direct connection with a base station. The base and go-kart can be separated by about a half mile (with lots of obstacles in between) which is too far for WiFi.
I'm thinking about using 3G/4G to directly connect the two. Does anyone have any resources or ideas that might help?
Or, alternatively, a better way to connect them? I'm just trying to send some sensor data (pretty low bandwidth) in real-time.
The biggest problem you face is radio spectrum that you are allowed to use. All 3G/4G spectrum is licensed to some firm and they get really unhappy (e.g. have you hunted down and fined) when you transmit in their space.
I did find DASH7 which
is an open source wireless sensor networking standard … which operates in the 433 MHz unlicensed ISM band. DASH7 provides multi-year battery life, range of up to 2 km, indoor location with 1 meter accuracy, low latency for connecting with moving things, a very small open source protocol stack …
with a parts cost around US$ 10. This sounds like it satisfies your requirements and keeps the local constabulary from bothering you.
You could maybe use SMS, between a modem on the kart and a mobile phone or modem at the base.
A mobile data connection like a telephone call isn't possible directly between the two; you have to make a data connection from the kart to a server in your operator's core network, identified by the APN. Then you can access IP addresses as for a normal internet connection - so the base computer would have to be a web server.
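A minimal sketch of what the kart side could look like over a mobile data connection, assuming the base station is reachable at a public address. The answer above suggests a web server; plain UDP is swapped in here only to keep the example short, and the host name, port, and message fields are all hypothetical.

```python
import json
import socket
import time


def encode_reading(sensor, value, timestamp=None):
    """Serialize one sensor reading as a compact JSON datagram."""
    if timestamp is None:
        timestamp = time.time()
    return json.dumps({"sensor": sensor, "value": value, "t": timestamp}).encode("utf-8")


def send_reading(sock, address, sensor, value):
    """Send one reading to the base station; UDP gives no delivery guarantee."""
    sock.sendto(encode_reading(sensor, value), address)


def decode_reading(datagram):
    """Inverse of encode_reading, used on the base-station side."""
    return json.loads(datagram.decode("utf-8"))
```

On the kart you would create `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` once and call `send_reading(sock, ("base.example.org", 9000), "speed_kmh", 42.5)` for each sample. Since datagrams can be lost, the timestamp lets the base station spot gaps.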
Here is my problem:
There are n peers in a P2P network that request the same data block, subject to some constraints:
1. Each peer has its own upload bandwidth, and the average bandwidth is the size of the data block.
2. The peers have different deadlines for this data block. If a peer doesn't get the entire block before its deadline, it has to ask the server for help.
3. A peer can transfer data (partial or entire) only if it has the entire data block.
The objective is to minimize the server's total upload. I can't figure out whether there is an optimal algorithm or whether it is an NP problem. Deadline-first or largest-bandwidth-first may not handle some situations.
Is there some NP problem similar to this? It looks like a graph flow problem or an instruction scheduling problem, but I find it difficult because I have to deal with the deadlines and the growth of the suppliers' total bandwidth at the same time.
I hope I can get some directions or resources about the solution :)
Thanks.
Considering that each peer acts individually in your case, it is not one automaton solving your issue, but many. Since fetching a data block when it is not available within a given delay is typically a polynomial problem, and since the job is accomplished by individual peers, your issue is not an NP problem for each peer locally.
On the other side, if a server has to compute the minimal allocation of backup resources to transfer 'missing blocks', you would have to first find out about the probability that a peer misses a block (average + standard deviation for example). Assuming you know the statistical distribution of such events, you could compute the total bandwidth you would need to transfer those missing blocks with a chosen risk of failure/tolerance in the bandwidth. If you are using multiple servers to cover for the need, make sure your peers contact them randomly to distribute the load.
Solving this statistical problem is not an NP issue. You can collect failure info from each peer and add it on a central/server peer. Therefore, my conclusion is that your issue is not an NP problem.
PART II:
Oh, I understand your case better now: multiple 'server' peers can potentially help one peer get a full block. In this case, the number of server peers grows exponentially in your system for a given block, so this optimization problem has all the characteristics of a flooding problem to me, and those are NP.
Even if your graph of peers and the potential connections between them was static (which is never the case in a real P2P network), computing the optimal solution in a reasonable amount of time for more than 50 or 100 nodes is virtually impossible, unless you can make very specific assumptions on this graph (which is almost never the case in general and not always useful).
But do you absolutely need to have the absolute optimal solution or is something near the optimal good enough?
Heuristics will tell you that if your peers have more or less the same download bandwidth capacity, then it makes sense to serve peers with the highest UPLOAD bandwidth first to maximize the avalanche effect and to reduce the risk for a peer having to ask for help, in general.
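The avalanche effect can be illustrated with a deliberately simplified model: every peer is identical, each holder of the full block can upload it to exactly one other peer per round, and deadlines are ignored. None of these simplifications come from the question; they are only there to show how quickly the set of suppliers grows.

```python
def rounds_to_serve(n_peers):
    """Rounds needed for one seed to reach n_peers when suppliers double.

    Each round, every node that already holds the full block uploads it to
    exactly one node that does not.
    """
    suppliers = 1   # the server is the only initial holder
    served = 0
    rounds = 0
    while served < n_peers:
        newly_served = min(suppliers, n_peers - served)
        served += newly_served
        suppliers += newly_served
        rounds += 1
    return rounds


for n in (1, 3, 7, 8, 1000):
    print(n, rounds_to_serve(n))
```

In this model 1000 peers are served in 10 rounds, so the number of rounds grows logarithmically with the number of peers. With unequal upload bandwidths, serving the fastest uploaders first makes the supplier pool grow sooner, which is the intuition behind the heuristic.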
If your graph is relatively balanced (that is, most peers can connect to most peers), then I bet the minimum bandwidth of initial servers will be a logarithmic function of the number of nodes in your graph times the average speed at which peers expect to be served. This is only my gut feeling and should be validated with real measures or a strong modeling of your case.