Difference between using Thingsboard Edge and Thingsboard CE on premise - thingsboard

After checking the documentation of Thingsboard and all its flavours, one question I have is: does it make sense to use Thingsboard Edge when the Thingsboard CE/PE server is on-premise?
As I understand it, the point of TB Edge is to do local aggregation and data processing, lower the latency of data handling and visualisation, react to local alerts, and reduce data traffic.
However, some of these concerns are not critical when the main TB server is in the same local network and traffic and storage are not charged by a 3rd party.
What would be an example of a case where it still makes sense, if any?
Thanks.

Related

How much impact does the network delay have on IoT Edge throughput?

We have a customer who has deployed a number of iotedge transparent gateways that keep routing data from tons of leaf devices to the cloud.
Recently they noticed that the output (edge to IoT Hub) cannot keep up with the input on some of the edge devices, which is causing a severe latency issue for their messages.
Here are the built-in edgeHub metrics for the device named 8B:
edgehub_queue_length
8B: 981061
edgehub_message_send_duration_seconds
8B: ~110ms
{quantile="0.1"} 0.0632608
{quantile="0.5"} 0.1136008
{quantile="0.9"} 0.127605
{quantile="0.99"} 0.2449048
edgehub_message_process_duration_seconds
8B: 0.5-2.0 ms
We would like to clarify two questions:
What is the recommended network latency for an iotedge gateway?
Are there any other methods we can use to improve the output throughput of edgeHub?
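One rough way to read the metrics above: if upstream sends happened strictly one batch at a time, the send duration would cap throughput at batch size divided by send latency. The sketch below works through that arithmetic using the reported median send duration and queue length. The serial-send model, the batch sizes, and the input rate are all assumptions made for illustration, not documented edgeHub behaviour.

```python
# Back-of-the-envelope model: upstream sends happen one batch at a time,
# so throughput is bounded by batch_size / send_latency.
# ASSUMPTION: serial sends and the batch sizes below are illustrative only.

QUEUE_LENGTH = 981_061          # edgehub_queue_length reported for 8B
SEND_LATENCY_S = 0.1136008      # median edgehub_message_send_duration_seconds


def max_throughput(batch_size: int, send_latency_s: float) -> float:
    """Upper bound on messages per second for serial batch sends."""
    return batch_size / send_latency_s


def drain_time_s(queue_length: int, out_rate: float, in_rate: float) -> float:
    """Seconds to empty the queue; infinite if output cannot outpace input."""
    net_rate = out_rate - in_rate
    if net_rate <= 0:
        return float("inf")
    return queue_length / net_rate


if __name__ == "__main__":
    for batch_size in (1, 10, 100):  # hypothetical batch sizes
        rate = max_throughput(batch_size, SEND_LATENCY_S)
        # Hypothetical input rate: half of the achievable output rate.
        hours = drain_time_s(QUEUE_LENGTH, rate, rate / 2) / 3600
        print(f"batch={batch_size:>3}  max {rate:8.1f} msg/s  "
              f"drain ~{hours:6.2f} h")
```

Under this simplified model, lowering the round-trip latency or enlarging the batch raises the ceiling proportionally, which is why network delay matters once the queue starts to grow.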

Vortex fog vortex gateway integration

I am new to the Data Distribution Service. I am using PrismTech products for DDS. I have Vortex Lite in my network. To interact with the Vortex Gateway in the public cloud, I am using the Vortex Fog service, but I was not able to establish interaction. Can anyone please provide input on this?
I have a DDS subsystem running on my network, and data from it needs to be shared with the Vortex Gateway running in the cloud; for this purpose I am trying to use Vortex Fog. In the Vortex Fog configuration I have specified the public IP of the cloud server. I have a Vortex Gateway subscriber job running in the cloud to receive the DDS data from the subsystem running in my network.
The Fog service is running in the LAN behind NAT. I set the following configuration for running the Fog service:
fog.cluster.id=LAN1
fog.user.network.interface=eth1
fog.routing.network.interface=eth1
fog.services.network.interface=eth1
fog.services.tcp.peers=<public ip of cloud server>:7400
fog.externalNetworkAddresses=none
On the cloud server I am running the Vortex Gateway, subscribing to different topics.
Could you please correct or guide me to solve this issue?
It is hard to give you a concrete answer as I don't have the details of your configuration. That said, let me try to give you some hints that may guide you toward the resolution of your problem.
Are there any applications subscribing for data on the Cloud? Notice that in Vortex data only flows if there is an interest arising. Otherwise no data is sent across the network -- that would just be a waste of precious resources. Beware that even if you have applications sharing data within a Fog but no applications subscribing to data "outside" the Fog, data won't be pushed out by Vortex-Fog. Once again, data flows only where there is an interest.
I assume that you are using Fog because you have an entire sub-system, i.e. several DDS applications, whose data needs to be efficiently shared with the cloud while maintaining multicast communication on the sub-system. If this is not the case, then you can simply configure Lite and the Gateway to use TCP/IP and have them talk directly. That would probably be the simplest deployment.
To ensure that you don't have any specific problems with your network set-up have you tried to run two Lite applications that use TCP/IP and communicate through our public Vortex Cloud instance available at demo-eu.prismtech.com or demo-us.prismtech.com?
If you post your configuration files I may be able to give you more insights.
HTH.
A+

why and when i need mqtt broker for IOT/M2M application

Just asking one silly question; I hope someone can answer it.
I'm a bit confused about the MQTT broker. Basically, the confusion is that there are so many things being used for data storage, transfer and processing (like Flume, HDInsight, Spark etc.). So when and why do I need to use an MQTT broker?
If I would like to use a Windows 10 IoT application with HiveMQ, where can I get the details? How do I use it? How do I benefit from this MQTT broker? Can I not send data from my IoT application directly using Azure or HDFS? So how does the MQTT broker fit in, or help me achieve something?
I'm new to all of this and tried to find some tutorials, but I'm not finding anything suitable. Please explain it in more detail or point me to some tutorials.
MQTT is a client-server protocol for pub-sub based transport that has a comparatively small overhead and is thus applicable to mobile and IoT applications (unlike Flume, etc.). The MQTT broker is basically a server that handles messaging to/from MQTT clients and among them. The functionality pretty much stops at the transport layer, even though various MQTT add-ons exist.
If you are looking to implement a solution that would reliably transfer data from your IoT devices to the back-end system for processing, I would suggest you take a look at the Kaa open-source IoT platform. It goes much further than MQTT by providing not only the transport layer, suitable for low-power IoT devices, but also a solid chunk of the application-level logic (including the object bindings for your application-level data structures, temporary data persistence, etc.).
Here is a link to a webinar that explains how to build a scalable IoT analytics system with Kaa and Spark in less than an hour.
This is an architectural choice. IoT applications are possible without MQTT but there are some advantages when using MQTT. If you are completely new to MQTT, take a look at this in-depth MQTT series: http://forkbomb-blog.de/2015/all-you-need-to-know-about-mqtt
Basically the main architectural advantage is publish / subscribe designed for low-latency, high throughput (mobile) communication with minimal protocol overhead (which is important if bandwidth is at a premium). You can completely decouple consumers and producers.
HDFS is the (distributed) Hadoop file system and is the foundation for Map / Reduce processing. It is not comparable to an MQTT broker. The MQTT broker could write to HDFS, though (in the case of HiveMQ, with a custom plugin).
Basically MQTT is a protocol while the products you are mentioning are, well, products which solve completely different problems:
Flume is basically used for log aggregation at scale. You won't use MQTT for that, at least there is not too much advantage because this is typically done in backend applications.
Spark and Hadoop shine at Big Data crunching. They are frameworks, not ready-to-use solutions, and they are not really comparable to MQTT. MQTT brokers like HiveMQ are often used in conjunction with them: Spark / Hadoop for data processing and HiveMQ for communication.
I hope this helps you get started. It would be best to read about typical use cases of all these technologies, as this is a bit too broad for a single SO answer.
MQTT is a data transport, so the usual thing I have to compare it with is HTTP. HTTP has two important characteristics: (a) it goes from one point to another, and (b) it is request/response, so only one end can start a data transfer. MQTT connects many end points to many end points, and either end can start a data transfer. So, if you have just one device and only one service or person that will ever access it, and only by polling, then HTTP is great. MQTT means many devices can post data to many services or people, AND the other way around. Your question assumes that your data is always going to end up in some sort of data store, but many interactions are about events and responding to them immediately, like ringing a doorbell, or lowering the landing gear. In these cases you will often want to both record the data and have an immediate action occur, like your phone making a doorbell noise.
Finally, you address data in MQTT semantically, by topic, rather than by IP address.
This means that your services subscribe to /mikeshouse/doorbell rather than polling 192.168.22.4, which is a huge gain once you have a number of devices.
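To make the topic-based addressing concrete, here is a minimal in-process sketch. `TinyBroker` is a hypothetical toy, not a real MQTT client or broker, and nothing goes over the network; only the `+` (one level) and `#` (all remaining levels) wildcard rules are modelled on MQTT topic filters, in simplified form.

```python
# Toy in-process broker illustrating topic-based (semantic) addressing.
# ASSUMPTION: TinyBroker is a hypothetical stand-in, not a real MQTT client
# or broker; only the "+" and "#" wildcard rules follow MQTT topic filters.
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style filter matches a concrete topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":                 # multi-level wildcard, must be last
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):       # filter is longer than the topic
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    """Routes each published message to every matching subscriber."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic_filter: str, handler: Handler) -> None:
        self._subs[topic_filter].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        for topic_filter, handlers in self._subs.items():
            if topic_matches(topic_filter, topic):
                for handler in handlers:
                    handler(topic, payload)


if __name__ == "__main__":
    broker = TinyBroker()
    # The phone reacts immediately; the logger records everything in the house.
    broker.subscribe("/mikeshouse/doorbell", lambda t, p: print("ring:", p))
    broker.subscribe("/mikeshouse/#", lambda t, p: print("log:", t, p))
    broker.publish("/mikeshouse/doorbell", "pressed")
```

Neither subscriber knows the doorbell's IP address, and the doorbell does not know who is listening, which is the decoupling described above.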

Scaling a TCP/IP based system and ensuring high availability

I have a TCP/IP-based component which communicates with a C++-based system. It reads raw bytes from that system, marshals those raw bytes into objects and stores them in the DB. This multi-threaded TCP/IP-based component is written in Java and could be deployed on a dual-core or quad-core processor (not sure if it's important for my question, but I am giving the detail nevertheless). Now I have a few questions:
How can I scale this TCP/IP-based component? It is deployed on a server and listens on a port. If in future more data comes from the C++ system than is envisaged at this point, we should be able to scale this Java component.
What about security? One thing I can probably do is run this communication over secure sockets, or receive encrypted data (is there any particular encryption that I could use here?). Is there any other way to take care of security?
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possibly have redundancy here?
Yes, we are working on the system architecture of a product and therefore, I was wondering if some experienced architect or designer could help me.
How can I scale this TCP/IP-based component? It is deployed on a server and listens on a port. If in future more data comes from the C++ system than is envisaged at this point, we should be able to scale this Java component.
You normally use a network load-balancer to scale this kind of service across multiple servers. That load-balancer can distribute load using a variety of algorithms, such as:
CPU load (usually measured with SNMP)
Client IP address (if you need persistence when mapping clients to your services)
Number of active sockets
etc
Look at HAProxy for a popular open-source load-balancer. F5 has the most popular commercial load-balancer solution.
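Two of the strategies listed above, fewest active sockets and client-IP persistence, can be sketched in a few lines. `Backend` and the addresses are hypothetical names for illustration; a real load-balancer such as HAProxy provides these strategies through its own configuration rather than through application code like this.

```python
# Two of the selection strategies listed above, in miniature.
# ASSUMPTION: Backend and the addresses are hypothetical; a real load-balancer
# implements these strategies itself, not in your application code.
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    address: str
    active_sockets: int = 0


def pick_least_connections(backends: List[Backend]) -> Backend:
    """Choose the backend with the fewest active sockets."""
    return min(backends, key=lambda b: b.active_sockets)


def pick_by_client_ip(backends: List[Backend], client_ip: str) -> Backend:
    """Map a client IP to a fixed backend so its connections persist."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    pool = [
        Backend("10.0.0.1:9000", active_sockets=12),
        Backend("10.0.0.2:9000", active_sockets=3),
        Backend("10.0.0.3:9000", active_sockets=7),
    ]
    print("least connections ->", pick_least_connections(pool).address)
    print("client 192.0.2.10 ->", pick_by_client_ip(pool, "192.0.2.10").address)
```

Note that plain modulo hashing remaps most clients whenever the pool size changes; that is a limitation of this sketch, and consistent hashing is the usual way around it.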
What about security? One thing I can probably do is run this communication over secure sockets, or receive encrypted data (is there any particular encryption that I could use here?). Is there any other way to take care of security?
As mentioned, SSL is an option, but understand that it is a big performance hit on your services if you encrypt on the same hardware that is performing your customer services. One option along these lines is using a commercial load-balancer that implements SSL in hardware; that load-balancer would then forward unencrypted sockets to your TCP services farm.
Under some circumstances you can use IPSec network-level encryption; often, this is another network hardware solution. Typically your clients will download an IPSec application that resides on their PC; they then make a connection to your IPSec server, which encrypts traffic between their client and your IPSec termination point.
SSH Tunneling with port-forwarding (low-tech solution)
tcpcrypt looks interesting as a future technology, but I'm not sure how mature it is right now.
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possibly have redundancy here?
A lot depends on what you mean by high availability, and what kind of recovery timing you need. At a high level, you have a few options:
DNS-based HA works if you don't need client to socket mapping persistence; if you use DNS, you need to be willing to accept typical DNS A-record timeouts (usually people don't go lower than ~5 minutes / 300 seconds). This also assumes you find a way to synchronize your databases across multiple sites.
Load-balancer solutions. These have the same issue with synchronizing back-end databases.
To do any kind of HA, you probably want to hire a consultant who has a proven track record of implementing these services (if you don't have this kind of resource in-house).

Deliver multicast to several different geo-locations

I need to use one logical PGM-based multicast address in an application, while enabling that application to run "seamlessly" across several different geo-locations (e.g. US/Europe/Australia).
The application is quite demanding in throughput (several million business messages a day) and latency, with a lot of small but very frequently sent messages. Classical Atom pub will not work here because of some external latency limits.
I have come up with several options to connect those datacenters but can’t find the best one.
Options which I have considered are:
1) Forward multicast messages via VPNs (can a VPN handle such a big load?).
2) Translate all multicast messages to “wrapper messages” and forward them via AMQP.
3) Write specialized in-house gate which tunnels multicast messages via TCP to other two locations.
4) Any other solution
I would prefer option 1, as it does not need additional code from the devs, but I’m afraid it will not be a reliable connection.
Are there any rules to apply for such connectivity?
What is the best network configuration, given the geographical layout, for the above constraints?
Just wanted to say hello :)
As for the topic, we don't have much experience with multicasting over WAN; however, my feeling is that PGM + WAN + high volume of data would lead to retransmission storms. A VPN won't make this problem disappear, as all the Australian receivers would, when confronted with missing packets, send NACKs to Europe etc.
The PGM specification does allow for a tree structure of nodes for message delivery, so in theory you could place a single node on the receiving side that would in turn re-multicast the data locally. However, I am not sure whether this kind of functionality is available with the MS implementation of PGM. Optionally, you can place a Cisco router with PGM support on the receiving side that would handle this for you.
In any case, my preference would be to convert the data to a TCP stream, pass it over the WAN and then convert it back to PGM on the other side. Some code has to be written, but no nasty surprises are to be expected.
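The core of such a converter is preserving message boundaries, because TCP delivers a byte stream while multicast delivers discrete datagrams. Below is a minimal sketch of one common approach, length-prefixed framing. The 4-byte big-endian header is an assumption chosen for illustration, not part of PGM, and the sketch leaves out the actual socket and multicast I/O.

```python
# Length-prefixed framing: carry discrete multicast datagrams over a TCP
# byte stream and recover the original message boundaries on the far side.
# ASSUMPTION: the 4-byte big-endian length header is an illustrative wire
# format chosen for this sketch, not part of PGM or any product.
import struct
from typing import List

HEADER = struct.Struct("!I")  # unsigned 32-bit length, network byte order


def frame(datagram: bytes) -> bytes:
    """Prefix one datagram with its length before writing it to TCP."""
    return HEADER.pack(len(datagram)) + datagram


class Deframer:
    """Reassembles datagrams from arbitrarily sized TCP reads."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every datagram now complete."""
        self._buffer.extend(chunk)
        messages: List[bytes] = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # wait for the rest of this datagram
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


if __name__ == "__main__":
    stream = frame(b"tick-1") + frame(b"tick-2")
    deframer = Deframer()
    # Simulate TCP handing the stream over in awkward 5-byte pieces.
    for i in range(0, len(stream), 5):
        for message in deframer.feed(stream[i:i + 5]):
            print("re-multicast locally:", message)
```

The sending gateway would call `frame` on each datagram it receives from the local multicast group, and the receiving gateway would publish each message returned by `feed` to its own local group.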
Martin S.
At CohesiveFT we ran into a very similar problem when we designed our "VPN-Cubed" product for connecting multiple clouds up to servers behind our own firewall, in one VPN. We wanted to be able to run apps that talked to each other using multicast, but Amazon EC2, for example, does not support multicast, for reasons that should be fairly obvious if you consider the potential for network storms across a whole data center. We also wanted to route traffic across a wide-area federation of nodes using the internet.
Without going into too much detail, the solution involved combining tunneling with standard routing protocols like BGP, and open technologies for VPNs. We used RabbitMQ AMQP to deliver messages in a pubsub style without needing physical multicast. This means you can fake multicast over wide area subnets, even across domains and firewalls, provided you are in the VPN-Cubed safe harbour. It works because it is a 'network overlay' as described in technical note here: http://blog.elasticserver.com/2008/12/vpn-cubed-technical-overview.html
I don't intend to actually offer you a specific solution, but I do hope this answer gives you confidence to try some of these approaches.
Cheers, alexis
