Scaling a TCP/IP based system and ensuring high availability - scalability

I have a TCP/IP based component which is communicating with a c++ based system. In fact it is reading raw bytes from that system and then marshaling those raw bytes in objects and storing it in the DB. This multi-threaded tcp/ip based component is in java and could be deployed on a dual core or quad core processor (not sure if its important for my question but nevertheless a detail I am giving). Now I have a few questions:
How can I scale this tcp/ip based component. This component is deployed on a server and is listening to a port. In future if there's more data that is envisaged at this point that comes from the C++ system we should be able to scale this java component.
What about security. One thing which I can probably do is employ this communication on secure sockets or probably get encrypted data (any particular encryption that I could use here??). Any other way to take care of security?
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possible have redundancy here?
Yes, we are working on the system architecture of a product and therefore, I was wondering if some experienced architect or designer could help me.

How can I scale this tcp/ip based component. This component is deployed on a server and is listening to a port. In future if there's more data that is envisaged at this point that comes from the C++ system we should be able to scale this java component.
You normally use a network load-balancer to scale these kind of services across multiple servers. That load-balancer can distribute load using a variety of algorithms, such as:
CPU load (usually measured with snmp)
Client ip address (if you need persistence when mapping clients to your services)
Number of active sockets
etc
Look at HAProxy for a popular open-source load-balancer. F5 has the most popular commercial load-balancer solution.
What about security. One thing which I can probably do is employ this communication on secure sockets or probably get encrypted data (any particular encryption that I could use here??). Any other way to take care of security?
As mentioned, SSL is an option, but understand that is a big performance hit on your services if you encrypt on the same hardware that is performing your customer services. One option along these lines is using a commercial load-balancer that implements SSL in hardware; that load-balancer would then forward unencrypted sockets to your TCP services farm.
Under some circumstances you can use IPSec network-level encryption; often, this is another network hardware solution. Typically your clients will download an IPSec application that resides on their PC... then they make a connection into your IPSec server, which encrypts between their client and your IPSec termination point
SSH Tunneling with port-forwarding (low-tech solution)
tcpcrypt looks interesting as a future technology, but I'm not sure how mature it is right now.
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possible have redundancy here?
A lot depends on what you mean by high availability, and what kind of recovery timing you need. At a high level, you have a few options:
DNS-based HA works if you don't need client to socket mapping persistence; if you use DNS, you need to be willing to accept typical DNS A-record timeouts (usually people don't go lower than ~5 minutes / 300 seconds). This also assumes you find a way to synchronize your databases across multiple sites.
Load-balancer solutions. Same issue with synchronizing back-end databases
To do any kind of HA, you probably want to hire a consultant that has a proven track record of implementing these services (if you don't have this kind of resource in-house).

Related

What are possible Scalability options for an application supporting ONLY Single TCP Socket Connection?

There is a legacy implementation(to an extent company proprietary in Pascal, C with some java macros) which processes TCP Socket based requests from TCP client application. It supports multiple client applications(around 5K) connecting over TCP Socket, however, it only supports single socket connection with backend(database). There are two instances of the server, so in total, it supports 10K client applications over two TCP Socket connection with database. All database related communication happens in synchronous manner over single socket connection. There are massive issues in this application, especially higher RTT(Round Trip Time) and occasional outages due to back-pressure. We have an ops team for such issues. They mostly resolve them by restarting the server. Hardly, we have people in our team who know coding details of this application and there is not much documentation. As this is a critical application we can not afford messing with it. We don't want to touch the code at least for now. This even becomes more critical due to shift in business priorities. There is a need to add another 30K client applications of another business with this setup.
Task before us is to integrate it with another application which is based on microservice architecture with middleware using RabbitMQ. This is a customer facing application sensitive to higher QoS. We can not afford outage & downtime in it. As part of this integration, there is a need to process request messages coming from the above legacy application over TCP Socket before passing them to database. In other words, we want to introduce a component which would process requests of legacy application before handing over to database. This additional process is part of our client request. Some of the processing requirement is very intensive and resource hungry in terms of CPU Cycle, Memory and socket i/o. As a result, there are chances, such processing may lead to server downtime & higher RTT. Our this layer is very flexible, we can easily add more server or replace faulty ones. But, this doesn't sound very efficient in this integration as we are limited with single socket connection of legacy application. So in total at max, we can only have 2(+ 6 for new 30k client application) servers. This is our cause of concern.
I want know, what different possible options are available to address high availability, scalability and latency issues of such integration? Especially with limitation of single TCP socket connection, how can we make this integration efficient, something which can handle back-pressure, better application uptime etc.
We were thinking of leveraging RabbitMQ, Layer 4 Load balancer(like haProxy, NginX), IPVS, NAT etc.. But all lead toward making some changes(or not very efficient technique) in the legacy code, which we don't want.

why and when i need mqtt broker for IOT/M2M application

Just asking one silly question, hope someone can answer this.
I'm bit confused regarding MQTT broker. Basically, the confusion is, there are so many things being used for data storing, transfer and processing (like Flume, HDInsight, Spark etc). So, when and why I need to use one MQTT broker?
If I would like to use Windows 10 IoT application with HiveMQ, from where can I get the details? how to use it? How I get benefit out of this MQTT broker? Can I not send data from my IoT application directly using Azure or HDFS? So, how MQTT broker fits into it or helping me to achieve something?
I'm new to all these and tried to find some tutorials, however, I'm not getting anything proper. Please explain it in more details or give some tutorials for this?
MQTT is a client-server protocol for pub-sub based transport that has a comparatively small overhead, and thus applicable to mobile and IoT applications (unlike Flume, etc.). The MQTT broker is basically a server that handles messaging to/from MQTT clients and among them. The functionality pretty much stops at the transport layer, even though various MQTT add-ons exist.
If you are looking to implement a solution that would reliably transfer data from your IoT devices to the back-end system for processing, I would suggest you take a look into Kaa open-source IoT platform. It goes much further than MQTT by providing not only the transport layer, suitable for low-power IoT devices, but also a solid chunk of the application level logic (including the object bindings for your application-level data structures, temporary data persistence, etc.).
Here is a link to a webinar that explains how to build a scalable IoT analytics system with Kaa and Spark in less than an hour.
This is an architectural choice. IoT applications are possible without MQTT but there are some advantages when using MQTT. If you are completely new to MQTT, take a look at this in-depth MQTT series: http://forkbomb-blog.de/2015/all-you-need-to-know-about-mqtt
Basically the main architectural advantage is publish / subscribe designed for low-latency, high throughput (mobile) communication with minimal protocol overhead (which is important if bandwidth is at a premium). You can completely decouple consumers and producers.
HDFS is the (distributed) Hadoop file system and is the foundation for Map / Reduce processing. It is not comparable to a MQTT broker. The MQTT broker could write to the HDFS, though (in case of HiveMQ with a custom plugin).
Basically MQTT is a protocol while the products you are mentioning are, well, products which solve completely different problems:
Flume is basically used for log aggregation at scale. You won't use MQTT for that, at least there is not too much advantage because this is typically done in backend applications.
Spark and Hadoop shine at Big Data crunching. They are a framework and not a ready to use solution. They are not really comparable to MQTT. Often MQTT brokers like HiveMQ are used in conjunction with these, Spark / Hadoop for data processing and HiveMQ for communication.
I hope this helps you getting started. Best would be to read about typical use cases of all these technologies, this is a bit too broad for a single SO answer.
MQTT is a data transport, so the usual thing I have to compare it with is HTTP. HTTP has two important characteristics, a) It goes from one point to another, b) It is request/response, so only one end can start a data transfer. MQTT connects many end points to many end points, and either end can start a data transfer. So, if you have just one device and only one service or person that will ever access it, and only by polling, then HTTP is great. MQTT means many devices can post data to many services or people, AND the other way around. Your question assumes that your data is always going to land up in some sort of data store, but many interactions are about events and responding to them immediately, like ringing a doorbell, or lowering the landing gear. In these cases you will often want to both record the data, and have an immediate action occur, like your phone making a doorbell noise.
Finally, you send data to MQTT semantically, rather than by IP address.
This means that your services subscribes to /mikeshouse/doorbell rather than polling 192.168.22.4, which is a huge gain once you have a number of devices.

So many persistent connections to the server. Is that the right way?

I would like to understand networking services with a large user base a bit better so that I know how to approach a project I am busy with.
The following statements that I make may be incorrect but they still lead to the question that I want to ask...
Please consider Skype and TeamViewer clients. It seems that both keep persistent network connections open to their respective servers. They use these persistent connections to initiate additional connections. Some of these connections are created by means of Hole Punching if the clients are behind NATs. They are then used for direct Peer-to-Peer communications.
Now according to http://expandedramblings.com/index.php/skype-statistics/ there are 300 million users using Skype and 4.9 million daily active users. I would assume that most of that 4.9 million users will most probably have their client apps running most of the day. That is a lot of connections to the Skype servers that are open at any given time.
So to my question; Is this feasible or at least acceptable? I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once? The only reason I can think is that it would be the only way to properly do Hole Punching. Techically, how is this achieved on the server side?
Is this feasible or at least acceptable?
Feasible it certainly is, you mention already two popular apps that do it, so it is very doable in practice.
As for acceptable, to start no internet authority (e.g. IETF) has ever said it is unacceptable to have long-lived connections even with low traffic.
Furthermore, the only components for which this matters are network elements that keep connection/flow state. These are for sure the endpoints and so-called middleboxes like NAT and firewalls. For the client this is only one connection, the server is usually fine tuned by the application developers (who made this choice) themselves, so for these it is acceptable. For middleboxes it's simple: they have no choice, they're designed to just work with all kind of flows, including long-lived persistent connections.
I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once?
Not at all. First of all, that could be 'much' slower as you'd need to set up a full connection before each control-plane call. This is especially noticeable if your RTT is big or if the servers do some complicated connection proxying/redirection for load-balancing/localization purposes.
Next to that this would historically make incoming calls difficult for a huge amount of users. Many ISP's block/blocked unknown incoming connections from the internet by means of a firewall. Similar, if you are behind a NAT device that does not support UPnP or PCP you can't open a port to listen on for your public IP address. So you need it even aside from hole-punching.
The only reason I can think is that it would be the only way to
properly do Hole Punching. Techically, how is this achieved on the
server side?
Technically you can't do proper hole-punching as soon as the NAT devices maintain a full <src-ip,src-port,dest-ip,dest-port,protocol> (classical 5-tuple) flow match. Then the best you can do with 'hole punching' is set up a proxy between peers.
What hole-punching relies on is that the NAT flow lookup is only looking at <src-ip,src-port,protocol> upstream and <dest-ip,dest-port,protocol> downstream to do the translation. In that case both clients just set up a connection to the server, their ip and port gets translated and the server passes this to the other client. The other client can now start sending packets to that translated <ip,port> combination which should work because NAT ignores the server's ip/port. But even if the particular NAT would work like this, some security device (e.g. stateful firewall) might detect session hi-jacking and drop this anyway.
Nowadays you rather use UPnP to open up a port to listen on your public IP which is much easier if supported.

ZeroMQ REQ/REP the other way round

I have a strange szenario:
Webserver / Appserver (Java) sends requests to many different satellite systems (on customers site). Only satellite systems can initiate connection due to firewall rules.
The model I think should be something like REQ/REP, but here the REQuester have to bind and the REPlyer would have to connect.
Is this possible and a stable architecture?
Are there better solutions? (We first had WebSockets in mind...)
Remark: we don't have to use Java on both ends. To be precise on customers site we have Delphi, but we could bridge it somehow.
The model I think should be something like REQ/REP, but here the
REQuester have to bind and the REPlyer would have to connect.
This will be problematic. When the server initiates the connection, it must be aware of all peers and their bind address. Not a big deal for a handful of peers, but for many peers changing constantly, it's a mess.
Only satellite systems can initiate connection due to firewall rules.
If that's the case, your mileage will vary with WebSockets; google around, lots of info on this.
Are there better solutions?
Well, with ZeroMq, one solution that comes to mind to support client request initiation is this:
Server binds with ROUTER
Clients connect with DEALER.
This approach offers bi-directional request/reply, does not block (asynchronous), and eliminates the client-side bind problem mentioned in your question. Here, the server binds, and either side can initiate the conversation.
I recommend reading this section in the guide, it covers extended async request/reply and message enveloping, important when using ROUTER/DEALER sockets.

Network protocol for surviving client IP address/network changes, among other problems

Persistent connection to a mobile device is difficult. Signal conditions can change rapidly, and connectivity types can also change. For instance, I may want to stream audio to my phone as I leave my apartment (WiFi), take a bus (WiMax/LTE), transfer to the subway (intermittent CDMA, sometimes roaming on another carrier), and walk to work (WiMax/LTE and back to WiFi). On this 15-minute trip alone I use at least 4 different IP addresses/networks, and experience all sorts of connectivity issues along the way. However, there is rarely a total loss of connectivity to the Internet, and the times that the signal condition makes connectivity problematic only happen for small periods of time.
I'm looking for a protocol that allows roaming from network to network and is very tolerant of harsh network conditions, while maintaining virtual end-to-end connectivity. This protocol would enable connections between a (usually) mobile device and some sort of proxy server which would relay regular TCP/UDP connections on behalf of the mobile device, over this tolerant protocol.
This protocol would sit around layer 3, and maybe even enable creation of virtual network interfaces that are tunneled through it. Perhaps there is a VPN or SOCKS proxy solution that already meets these needs.
Does such a protocol already exist?
If not, I'm probably going to come up with one, but would rather piggy-back off of existing efforts first.
There are many efforts within the internetworking community to address precisely these "network mobility" concerns.
In particular, Mobile IP (and its IPv6 big sister, Proxy Mobile IPv6) is a broad term for efforts to make IP addresses themselves portable across networks, however I doubt these technologies have reached sufficient maturation/deployment for production use today.
To undertake such mobility without support from the network requires a means of the host announcing to you its new address in an authenticated manner; this is what the Host Identity Protocol is designed for, but it is still at the "experimental" stage of the RFC process. From the abstract of RFC 5201:
HIP allows consenting hosts to securely establish and maintain shared
IP-layer state, allowing separation of the identifier and locator
roles of IP addresses, thereby enabling continuity of communications
across IP address changes.
There are several open-source implementations that are known to interoperate. Without claiming that this is a complete list, nor vouching for any of them (they're just a few picked off a Google search for "Host Identity Protocol implementations"), there is:
OpenHIP for multiple operating systems;
HIPL for Linux;
cutehip for Java;
HIP for inter.net for *BSD/Linux.

Resources