Scale capability of VOLTTRON

I am trying VOLTTRON for a project solution and want to know its capability in the long term. The project is to control/monitor ~100k devices, and possibly millions if things go well.
What is the biggest scale of VOLTTRON usage in a real scenario? How many devices can one node accommodate if, say, the host machine has high specs?
What constraints will VOLTTRON run into later, after it is in use (constraints in terms of database, server resources, or network)?
The answer I hope to get is not an exact value; I just want to find the capability range.
Thanks,

There are several factors that drive how well a single VOLTTRON instance scales.
In no particular order:
Network and device communication speed. (Are your devices on a serial connection? BACnet devices behind an MSTP router?)
Frequency of data collection. (10 seconds? 1 minute? 5 minutes? 15 minutes?)
How close together (time-wise) data from different devices needs to be.
Frequency and number of commands issued.
Machine specs
Often we see the bottleneck being the network for device communication. This will drive the rate at which you can communicate with devices. For collection, a mid-level PC is overkill in most situations.
In the field our users have been able to scrape 1.5K+ BACnet devices in less than 15 minutes with a single node. Many of these devices were on an MSTP trunk, which would be the major limiting factor. If these were TCP BACnet devices, the rate of data acquisition would be much higher.
There are parameters to tune the rate of data collection for a specific node. It is common to tweak these values to find the optimal rate of collection after initial platform configuration.
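As a rough illustration (the key names below follow the platform driver documentation, but treat this as a sketch to verify against your VOLTTRON version), polling rate is tuned in two places: the master driver's agent config staggers the scrapes, and each device config sets its own collection interval:

```python
# Sketch of the two places polling rate is tuned, shown as Python dicts
# (the actual VOLTTRON config files are JSON). Check the key names against
# the platform driver docs for your VOLTTRON version.

# Master driver agent config: stagger scrapes so devices are not all
# polled at the same instant.
master_driver_config = {
    "driver_scrape_interval": 0.05,  # seconds between starting each device's scrape
}

# Per-device config: how often this particular device is polled.
device_config = {
    "driver_config": {"device_address": "10.0.0.5"},  # illustrative address
    "driver_type": "bacnet",
    "interval": 60,  # seconds between scrapes of this device
}
```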
The kind of scaling you are looking for will require using multiple VOLTTRON instances. It is common to have multiple collection boxes for an installation. Usually these instances will gather data for some number of devices (based on your scenario) and either send those values directly to a database or forward them to another central instance of the platform that will submit the data on the remote nodes' behalf. Numbers for some real deployments can be found here: https://volttron.org/sites/default/files/publications/VOLTTRON%20Scalability-update-final.pdf
There are several database options from MySQL to Mongo to SQLite. You will want to pick a central database based on your data collection needs (so not SQLite).
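For instance, pointing a SQL historian at a central MySQL database uses a connection block roughly like the following (shown as a Python dict; the exact keys depend on the historian agent and VOLTTRON version, so verify against the docs):

```python
# Sketch of a SQLHistorian connection config for a central MySQL database,
# shown as a Python dict (the real config file is JSON). Host, database,
# and credentials are placeholders.
sql_historian_config = {
    "connection": {
        "type": "mysql",
        "params": {
            "host": "central-db.example.com",
            "port": 3306,
            "database": "volttron_data",
            "user": "volttron",
            "passwd": "secret",
        },
    },
}
```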

Related

Using AWS IoT to communicate between multiple iPad-type devices

I am looking at a project that is trying to connect a number of iPad-type devices so that each can pass information to the other members of the group.
For example, given six devices, if Dev1 moves to a new position, that position information should be propagated to Dev2-6. Similarly, if Dev2 moves, the information is passed to Dev1, Dev3-6.
It would seem to me that this might be doable by
Having a message broker with an MQTT topic to which all devices subscribe.
Writing the information to the broker
Then reading, or better still, pushing the data to other devices (perhaps using shadow devices).
Is this a reasonable approach? (I don't quite see how to write to the message broker).
Could binary data be sent?
Any ideas on how it would scale?
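To make the idea concrete, here is a rough sketch of what I imagine, using the paho-mqtt client (the broker address and topic name are made up, AWS IoT would additionally require TLS certificates, and whether this is the right way to wire it up is part of my question):

```python
# Sketch: every device subscribes to a shared position topic, publishes its
# own moves, and ignores its own echoes. Broker address and topic name are
# placeholders; AWS IoT would also need TLS certificates (client.tls_set).
import json
import paho.mqtt.client as mqtt

DEVICE_ID = "dev1"
TOPIC = "fleet/positions"  # hypothetical topic name

def on_connect(client, userdata, flags, rc):
    client.subscribe(TOPIC)  # (re)subscribe on every (re)connect

def on_message(client, userdata, msg):
    update = json.loads(msg.payload)   # payload is raw bytes, so binary data works too
    if update["device"] != DEVICE_ID:  # skip messages we published ourselves
        print("peer moved:", update)

client = mqtt.Client(client_id=DEVICE_ID)  # paho-mqtt 1.x constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 1883)

# "Writing to the broker" is just a publish; the broker fans it out to subscribers.
client.publish(TOPIC, json.dumps({"device": DEVICE_ID, "lat": 51.5, "lon": -0.1}))

client.loop_forever()
```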

What's the real minimum RAM for running an instance of Neo4j

I want to run as many instances of Neo4j (using the Enterprise version) on a single VM as possible. What are the real minimum RAM requirements to fire up an instance?
Right now Task Manager is telling me that java.exe is taking about 70,000K (70 MB). Does that sound right?
I'm not worried about performance; I just want to pack as many instances as possible onto a single box so people can do some low-demand searching of their graphs.
One thing is what is recommended; the other is "it depends".
Neo4j is able to run on a Raspberry Pi, but you shouldn't expect great performance. I'm also using an AWS t2.micro for testing, and it's enough.
The memory footprint is bound to fluctuate as the graph is loaded into memory to perform traversals and paged back to disk (when memory is running out).
If I may offer a suggestion, you could run only one database instance and have an unconnected sub-graph for each of your users. This would very likely be far more efficient in terms of server resources.
For example, if you have, say, (:Item) nodes that make up a graph for each user,
you could instead label them with a unique per-user suffix, such as (:Item_User1).
Then, to run a query for a particular user, you just add that unique label and search only that user's portion of the graph.
The idea is to have a separate sub-graph for each user, unconnected to other users' sub-graphs, instead of a separate database instance for each individual user. As long as each user's sub-graph is unconnected from the others, there should be no security vulnerability where a user is given access to another user's data.
This way you could potentially have a huge number of users (quite possibly millions), each with their own sub-graph and potentially no loss in performance, instead of the handful of database instances you could spin up on a single VM, which would be competing for resources and choking each other out.
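A minimal sketch of that idea with the official Neo4j Python driver (the URI, credentials, and label scheme are illustrative; labels cannot be query parameters in Cypher, hence the string formatting):

```python
# Sketch: one Neo4j instance, one unconnected sub-graph per user, selected
# by a per-user label. URI, credentials, and the naming scheme are illustrative.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

def user_label(user_id):
    # Labels cannot be parameterized in Cypher, so build them in code.
    # In real use, validate user_id first to rule out Cypher injection.
    return "Item_User%d" % user_id

def add_item(user_id, name):
    with driver.session() as session:
        session.run("CREATE (:%s {name: $name})" % user_label(user_id),
                    name=name)

def list_items(user_id):
    with driver.session() as session:
        result = session.run("MATCH (i:%s) RETURN i.name AS name"
                             % user_label(user_id))
        return [record["name"] for record in result]
```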
For Neo4j 3.x, the documented minimum memory requirement is 2 GB.

What is the recommended hardware for the following Neo4j setup?

I need to build and analyze a complex network using Neo4j and would like to know the recommended hardware for the following setup:
There are three types of nodes.
There are three types of relationships.
At steady state, the network will contain about 1M nodes of each type and about the same number of edges.
Every day, about 500K relationships are updated and 100K nodes and edges are added; approximately the same number of nodes/edges are also removed.
Network updates will be done in daily batches, and we can tolerate update times of 1-2 hours.
Once the system is up, we will query the database for shortest paths between different nodes, not more than 500K times per day. We can live with batch queries.
Most probably, I'll use the REST API.
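For instance, I picture the shortest-path queries going through Neo4j's transactional HTTP endpoint roughly like this (the URL, credentials, labels, and properties below are placeholders):

```python
# Sketch of a shortest-path query via Neo4j's transactional HTTP endpoint
# (Neo4j 3.x style). URL, credentials, labels, and properties are placeholders.
import requests

URL = "http://localhost:7474/db/data/transaction/commit"
QUERY = """
MATCH (a:TypeA {id: $src}), (b:TypeB {id: $dst})
MATCH p = shortestPath((a)-[*..15]-(b))
RETURN [n IN nodes(p) | n.id] AS path
"""

resp = requests.post(
    URL,
    auth=("neo4j", "password"),
    json={"statements": [{"statement": QUERY,
                          "parameters": {"src": 1, "dst": 42}}]},
)
resp.raise_for_status()
print(resp.json()["results"][0]["data"])
```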
I think you should take a look at Neo4j Hardware requirements.
For the server you're talking about, I think the first thing needed will obviously be plenty of network bandwidth; if your requests must complete in a short time, it will be needed.
Apart from that, a "normal" server should be enough:
8 or more cores
At least 24 GB of RAM
At least 1 TB of SSD storage (this one is important and expensive)
Good bandwidth (like 1 Gbps)
By the way, it's not a programming question, so I think you should have asked Neo4j directly.
You can use the Neo4j hardware sizing calculator for a rough estimation of the hardware needs.

Fetch data subset from gmond

This is in the context of a small data-center setup where the number of servers to be monitored is only in the double digits and may grow only slowly to a few hundred (if at all). I am a Ganglia newbie and have just completed setting up a small Ganglia test bed (and have been reading and playing with it). The couple of things I realise:
gmetad supports interactive queries on port 8652, using which I can get metric data subsets - say, data for a particular metric family in a specific cluster
gmond seems to always return the whole dump of data for all metrics from all nodes in a cluster (on doing 'netcat host 8649')
In my setup, I don't want to use gmetad or RRD. I want to fetch data directly from the multiple gmond clusters and store it in a single data store. There are a couple of reasons not to use gmetad and RRD:
I don't want multiple data stores in the whole setup. I can have one dedicated machine fetch data from the few clusters and store it.
I don't plan to use gweb as the data front end. The data from Ganglia will be fed into a different monitoring tool altogether. With this setup, I want to eliminate the latency that another layer of gmetad would add: if gmetad polls, say, every minute and my management tool polls gmetad every minute, that adds up to two minutes of delay, which I feel is unnecessary for a relatively small/medium-sized setup.
There are a couple of problems with this approach for which I need help:
I cannot get filtered data from gmond. Is there some plugin that can help me fetch individual metric/metric-group information from gmond (since different metrics are collected at different intervals)? For now, the workaround I'm considering is filtering the full dump client-side, as sketched below.
gmond's output is very verbose text. Is there some other (hopefully binary) format that I can configure for export?
Is my idea of eliminating gmetad/RRD completely a very bad idea? Has anyone tried this approach before? What should I be careful of in doing so, from a data collection standpoint?
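The client-side filtering I have in mind looks roughly like this (the host name is a placeholder; it assumes gmond's default tcp_accept_channel on port 8649):

```python
# Sketch: pull the full XML dump from gmond and filter it client-side.
# Assumes gmond's default tcp_accept_channel on port 8649; the host name
# is a placeholder.
import socket
import xml.etree.ElementTree as ET

def fetch_gmond_xml(host, port=8649, timeout=10):
    """Read the complete XML dump that gmond emits on connect."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

def metric_values(xml_bytes, metric_name):
    """Yield (cluster, host, value, units) for one metric family."""
    root = ET.fromstring(xml_bytes)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            for metric in host.iter("METRIC"):
                if metric.get("NAME") == metric_name:
                    yield (cluster.get("NAME"), host.get("NAME"),
                           metric.get("VAL"), metric.get("UNITS"))

if __name__ == "__main__":
    xml_dump = fetch_gmond_xml("gmond-node.example.com")
    for row in metric_values(xml_dump, "load_one"):
        print(row)
```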
Thanks in advance.

Erlang fault-tolerant application: PA or CA of CAP?

I have already asked a question regarding a simple fault-tolerant soft real-time web application for a pizza delivery shop.
I have gotten really nice comments and answers there, but I disagree that it is a true web service. Rather than a web service, it is more of a real-time system to accept orders from customers, control the dispatching of these orders, and control the vehicles that deliver those orders in real time.
Moreover, unlike a 'true' web service, this system is not intended to have many users - it is just a few dispatchers (telephone operators) and a few delivery drivers that will use it (as of now I have no requirement to provide the actual customers with direct access to the service; only the dispatchers and delivery drivers will have direct access).
Hence this question is a bit more general.
I have found that in order to make the right choice of NoSQL data storage option for this application, the first thing I have to do is make a choice between CA, PA and CP according to the CAP theorem.
Now, the Building Web Applications with Erlang book says that "while it [Mnesia] is not a SQL database, it is a CA database like a SQL database. It will not handle network partition". The same book says that the CouchDB database is a PA database.
Having that in mind, I think the very first thing I need to do for my application is decide what the term 'fault tolerance' means with regard to CAP.
The simple requirement that I have is that the application be available 24/7 (R1). The other one is that there is no need to scale: the application will have a very modest number of users (it is probably not possible to have thousands of dispatchers) (R2).
Now, does R1 require the application to provide Consistency, Availability and Partition Tolerance, and with what priorities?
What type of data storage option will better handle the following issues:
Providing 24/7 availability for a dispatcher (a person who accepts phone calls from customers and who uses a CRM) to look up customer records and put orders into the system;
Looking up current ongoing served orders and their status (placed, baking, dispatched, delivering, delivered) in real time;
Keep track of all working vehicles' locations and their payloads in real time;
Recover any part of the system after a system crash or network crash to continue providing 1, 2 and 3;
To sum it up: what kind of data storage (CA, PA or CP) will suit the system described above better? What kind of data storage will better satisfy the R1 requirement?
For your 24/7 requirement you are looking for a database with (high) availability, because you want your requests to succeed every time (even if they are only error results).
A netsplit would bring your whole system down if you have no partition tolerance.
Consistency is nice to have, but you can only have two of the three.
Your best bet will be a PA solution. I highly recommend a solution inspired by Amazon Dynamo. The best-known Dynamo implementations are Riak and CouchDB. Riak even allows you to shift PA towards consistency by tuning the read and write replica counts (for example, with N = 3 replicas, choosing R = W = 2 gives R + W > N, so every read quorum overlaps the most recent write quorum).
First, don't confuse CAP "Availability" with "High Availability". They have nothing to do with each other. The A in CAP simply means "All DB nodes can answer queries". To get High Availability, you must be in multiple data centers, you must have robust documented procedures for maintenance, expansion, etc. None of that depends on your CAP choice.
Second, be realistic about your requirements. A stock-trading application might have a requirement for 100% uptime, because every second of downtime could lose millions of dollars. On the other hand, I'm guessing your pizza joint might lose tens of dollars for every minute it's down. So it doesn't make sense to spend millions trying to keep it up. Try to compute your actual costs.
Third, always evaluate your choice against the mainstream. You could just go CA (MySQL) and quickly fail over to the slaves when problems happen. Be realistic about the costs (and risks) of building on new technology. If you really expect your system to run for 5 years without downtime, ask for proof that someone else has run that database for 5 years without downtime.
If you go "AP" and have remote people (drivers, etc.), then you'll need to write an app that stores their data on their phone and sends it in the background (with retries). Of course, you could do this regardless of whether your database is CA or AP.
If you want high uptimes, you can either:
Increase MTBF (Mean Time Between Failures) - buy redundant power supplies, dual Ethernet cards, etc.
Decrease MTTR (Mean Time To Recovery) - just make sure that when a failure happens, you can recover quickly (fail over to a slave).
I've seen people spend tens of thousands of dollars on MTBF, only to be down for 8 hours while they restore their backup. It makes more sense to ensure MTTR is low before attacking MTBF.
