What are possible Scalability options for an application supporting ONLY Single TCP Socket Connection?

What are possible Scalability options for an application supporting ONLY Single TCP Socket Connection? - scalability

There is a legacy implementation(to an extent company proprietary in Pascal, C with some java macros) which processes TCP Socket based requests from TCP client application. It supports multiple client applications(around 5K) connecting over TCP Socket, however, it only supports single socket connection with backend(database). There are two instances of the server, so in total, it supports 10K client applications over two TCP Socket connection with database. All database related communication happens in synchronous manner over single socket connection. There are massive issues in this application, especially higher RTT(Round Trip Time) and occasional outages due to back-pressure. We have an ops team for such issues. They mostly resolve them by restarting the server. Hardly, we have people in our team who know coding details of this application and there is not much documentation. As this is a critical application we can not afford messing with it. We don't want to touch the code at least for now. This even becomes more critical due to shift in business priorities. There is a need to add another 30K client applications of another business with this setup.
Task before us is to integrate it with another application which is based on microservice architecture with middleware using RabbitMQ. This is a customer facing application sensitive to higher QoS. We can not afford outage & downtime in it. As part of this integration, there is a need to process request messages coming from the above legacy application over TCP Socket before passing them to database. In other words, we want to introduce a component which would process requests of legacy application before handing over to database. This additional process is part of our client request. Some of the processing requirement is very intensive and resource hungry in terms of CPU Cycle, Memory and socket i/o. As a result, there are chances, such processing may lead to server downtime & higher RTT. Our this layer is very flexible, we can easily add more server or replace faulty ones. But, this doesn't sound very efficient in this integration as we are limited with single socket connection of legacy application. So in total at max, we can only have 2(+ 6 for new 30k client application) servers. This is our cause of concern.
I want know, what different possible options are available to address high availability, scalability and latency issues of such integration? Especially with limitation of single TCP socket connection, how can we make this integration efficient, something which can handle back-pressure, better application uptime etc.
We were thinking of leveraging RabbitMQ, Layer 4 Load balancer(like haProxy, NginX), IPVS, NAT etc.. But all lead toward making some changes(or not very efficient technique) in the legacy code, which we don't want.

Related

So many persistent connections to the server. Is that the right way?

I would like to understand networking services with a large user base a bit better so that I know how to approach a project I am busy with.
The following statements that I make may be incorrect but they still lead to the question that I want to ask...
Please consider Skype and TeamViewer clients. It seems that both keep persistent network connections open to their respective servers. They use these persistent connections to initiate additional connections. Some of these connections are created by means of Hole Punching if the clients are behind NATs. They are then used for direct Peer-to-Peer communications.
Now according to http://expandedramblings.com/index.php/skype-statistics/ there are 300 million users using Skype and 4.9 million daily active users. I would assume that most of that 4.9 million users will most probably have their client apps running most of the day. That is a lot of connections to the Skype servers that are open at any given time.
So to my question; Is this feasible or at least acceptable? I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once? The only reason I can think is that it would be the only way to properly do Hole Punching. Techically, how is this achieved on the server side?

Is this feasible or at least acceptable?
Feasible it certainly is, you mention already two popular apps that do it, so it is very doable in practice.
As for acceptable, to start no internet authority (e.g. IETF) has ever said it is unacceptable to have long-lived connections even with low traffic.
Furthermore, the only components for which this matters are network elements that keep connection/flow state. These are for sure the endpoints and so-called middleboxes like NAT and firewalls. For the client this is only one connection, the server is usually fine tuned by the application developers (who made this choice) themselves, so for these it is acceptable. For middleboxes it's simple: they have no choice, they're designed to just work with all kind of flows, including long-lived persistent connections.
I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once?
Not at all. First of all, that could be 'much' slower as you'd need to set up a full connection before each control-plane call. This is especially noticeable if your RTT is big or if the servers do some complicated connection proxying/redirection for load-balancing/localization purposes.
Next to that this would historically make incoming calls difficult for a huge amount of users. Many ISP's block/blocked unknown incoming connections from the internet by means of a firewall. Similar, if you are behind a NAT device that does not support UPnP or PCP you can't open a port to listen on for your public IP address. So you need it even aside from hole-punching.
The only reason I can think is that it would be the only way to
properly do Hole Punching. Techically, how is this achieved on the
server side?
Technically you can't do proper hole-punching as soon as the NAT devices maintain a full <src-ip,src-port,dest-ip,dest-port,protocol> (classical 5-tuple) flow match. Then the best you can do with 'hole punching' is set up a proxy between peers.
What hole-punching relies on is that the NAT flow lookup is only looking at <src-ip,src-port,protocol> upstream and <dest-ip,dest-port,protocol> downstream to do the translation. In that case both clients just set up a connection to the server, their ip and port gets translated and the server passes this to the other client. The other client can now start sending packets to that translated <ip,port> combination which should work because NAT ignores the server's ip/port. But even if the particular NAT would work like this, some security device (e.g. stateful firewall) might detect session hi-jacking and drop this anyway.
Nowadays you rather use UPnP to open up a port to listen on your public IP which is much easier if supported.

is there restriction for opening imap connection from same ip address?

Hi I am implementing Email Client Application. My requirement is i need to monitor all the mailboxes available in specified IMAP server. I am created separate TCP Connection for each mailboxes. But i am getting disconnected from IMAP Server. I am trying Gmail/yahoo for my testing purpose. Is there any restriction to open multiple connection from same ip to particular IMAP Server? Particularly in Gmail and Yahoo.
or is there anyway to Monitor all the mailboxes in Single Connection without using IMAP-NOTIFY seems it does not supported in both Gmail/Yahoo...
Please Help me out...

This is something which I have answered on stackoverflow before, but which is now only available via the wayback machine. The question was about how to "kill too many parallel IMAP connections". Reprinted below; the core takeaway message is that for some reason, most server administrators prefer to have smaller number of short-lived connections instead of more connections which are active over longer period of time, yet they spend most of their time silently idling in the background. What they do not get is that the IMAP protocol is designed with long-lived connections in mind, and trying to prevent that will lead to wasting resources because the clients will constantly resync mailboxes as they are hopping among them.
The original answer follows:
Nope, it's a very wrong idea. IMAP is designed so that monitoring a single mailbox takes one connection; in most IMAP server implementations, this means a single process. However, unless the client the user is using is terribly broken, all these connections enter the IDLE mode. In IDLE, the clients are passively notified about any updates to the mailbox state. If you disable these connections, the clients would have to activelly poll for changes in many mailboxes. Now decide for yourself -- what is worse, having ten processes sitting idle, or one process doing heavy polling every two minutes? Which of these solutions would consume more energy, CPU time and IO operations? That's for the number of parallel connections.
The second question was about the long-lived connections. Again, this is a critical aspect of IMAP -- each connection carries a lot of associated state information which is rather expensive to obtain. Unless your server implements certain extensions and your clients use them (ESEARCH, CONDSTORE, QRESYNC are the crucial bits), opening a mailbox can require O(n) operations. I don't know how many messages your users have, but do you really want to transfer e.g. message flags for 250k messages when you decided to kill a connection because it has been active for "too long"?
Finally, any reasonable IMAP server vendor offers a way to configure a per-user session limit on the number of concurrent processes. Using that is much better than maintaining a script for ad-hoc killing of "unused" connections.
If you would like to learn more about the synchronization process, my thesis about using IMAP on clients with flaky network and limited resources describes what the clients have to do in order to show an updated view of mailboxes to their users.

Scaling a TCP/IP based system and ensuring high availability

I have a TCP/IP based component which is communicating with a c++ based system. In fact it is reading raw bytes from that system and then marshaling those raw bytes in objects and storing it in the DB. This multi-threaded tcp/ip based component is in java and could be deployed on a dual core or quad core processor (not sure if its important for my question but nevertheless a detail I am giving). Now I have a few questions:
How can I scale this tcp/ip based component. This component is deployed on a server and is listening to a port. In future if there's more data that is envisaged at this point that comes from the C++ system we should be able to scale this java component.
What about security. One thing which I can probably do is employ this communication on secure sockets or probably get encrypted data (any particular encryption that I could use here??). Any other way to take care of security?
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possible have redundancy here?
Yes, we are working on the system architecture of a product and therefore, I was wondering if some experienced architect or designer could help me.

How can I scale this tcp/ip based component. This component is deployed on a server and is listening to a port. In future if there's more data that is envisaged at this point that comes from the C++ system we should be able to scale this java component.
You normally use a network load-balancer to scale these kind of services across multiple servers. That load-balancer can distribute load using a variety of algorithms, such as:
CPU load (usually measured with snmp)
Client ip address (if you need persistence when mapping clients to your services)
Number of active sockets
etc
Look at HAProxy for a popular open-source load-balancer. F5 has the most popular commercial load-balancer solution.
What about security. One thing which I can probably do is employ this communication on secure sockets or probably get encrypted data (any particular encryption that I could use here??). Any other way to take care of security?
As mentioned, SSL is an option, but understand that is a big performance hit on your services if you encrypt on the same hardware that is performing your customer services. One option along these lines is using a commercial load-balancer that implements SSL in hardware; that load-balancer would then forward unencrypted sockets to your TCP services farm.
Under some circumstances you can use IPSec network-level encryption; often, this is another network hardware solution. Typically your clients will download an IPSec application that resides on their PC... then they make a connection into your IPSec server, which encrypts between their client and your IPSec termination point
SSH Tunneling with port-forwarding (low-tech solution)
tcpcrypt looks interesting as a future technology, but I'm not sure how mature it is right now.
There is also a requirement of high availability to be satisfied. How do I handle that? How could I possible have redundancy here?
A lot depends on what you mean by high availability, and what kind of recovery timing you need. At a high level, you have a few options:
DNS-based HA works if you don't need client to socket mapping persistence; if you use DNS, you need to be willing to accept typical DNS A-record timeouts (usually people don't go lower than ~5 minutes / 300 seconds). This also assumes you find a way to synchronize your databases across multiple sites.
Load-balancer solutions. Same issue with synchronizing back-end databases
To do any kind of HA, you probably want to hire a consultant that has a proven track record of implementing these services (if you don't have this kind of resource in-house).

What do a benefit from changing from blocking to non-blocking sockets?

We have an application server developed with Delphi 2010 and Indy 10. This server receives more than 50 requests per second and it works well. But in some cases, it seems to me that Indy is very obscure. Their components are good, but sometimes I found myself digging into the source code only to understand a simple thing. Indy lacks on good documentation and good support.
The last thing that i came across was a big problem for me: I must detect when a client disconnects non gracefully (When the the client crashes or shutdown, for instance. Not telling the server that it will disconnect) and indy was not able to do that. If I want that, I will have to develop a algorithm like heartbeat, pooling or TCP keep-alive. I do not want to spend more time doing a, at least I think, component job. After a few study, I found out that this is not Indy's fault, but this is an issue of all blocking sockets components.
Now I am really thinking of changing the core of the Server to another good suite. I must admit I am tending to use a non-blocking socket. Based on that, I have some questions:
What do a benefit from changing from blocking to non-blocking sockets?
Will I be able to detect client disconnects (non gracefully)?
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
I know this must be a subjective question, but I really want to hear that from you. My first question is the one I care most. I do not care if I have to pay 100, 500, 1000, 10000 dollars, but I want a complete solution. For now, I am thinking about Ip*works .
EDIT
I think some guys are not understand what I want. I don't want to create my own socket. I have been working with sockets for a long time and I am getting tired of it. Really.
And non-blocking sockets CAN detect client disconnects. That is a fact and it has good documentation all over the internet. A non-blocking socket checks the socket state for new incoming data all the time, and it makes possible to detect that the socket is not valid. This is not a heartbeat algorithm. A heartbeat algorithm is used on client side and it sends periodically packets (aka keep-alive) to the server to tells it is still alive.
EDIT
I am not make myself clear. Maybe because English is not my main language. I am not saying that it is possible to detect a dropped connection without trying to send or receiving data from a socket. What I am saying is that every non-blocking socket is able to do that because they constantly tries to read from the socket for new incoming data. Why is that so hard to understand? If you guys download and run ip*works demos, in special, the echoserver and echoclient ones (both use TCP) you can test by yourselves. I already tested it, and it works like I expected to do. Even if you use the old TCPSocketServer and TCPSocketClient in a non-blocking mode you will see what I meant.

"What do a benefit from changing from blocking to non-blocking sockets? Will I be able to detect client disconnects (non gracefully)?"
Just my two cents to get the ball rolling on this question - I'm not a socket EXPERT, but I do have a good deal of experience with them. If I'm mistaken, I'm sure someone will correct me... :-)
I assume that since you're running a server using blocking sockets with 50 connections per second, you have a threading mechanism in place to handle client requests. If so, you don't really stand to gain anything from non-blocking sockets. On the contrary - you will have to change your server logic to be event driven- based on events fired in your main thread from the non-blocking sockets, or use constant polling to know what your sockets are up to.
Non-blocking sockets can't detect clients disconnecting without notification any more than blocking sockets can - they don't have telepathic powers... The nature of the TCP/IP 'conversation' between client and server is the same - blocking and non-blocking is only with respect to your application's interaction with the socket connection conducting the 'conversation'.
If you need to purge dead connections, you need to implement a heartbeat or timeout mechanism on your socket (I've never seen a modern socket implementation that didn't support timeouts).

What do a benefit from changing from blocking to non-blocking sockets?
Increased speed, availability, and throughput (from my experience). I had an IndySockets client that was getting about 15 requests per second and when I went directly to asynchronous sockets the throughput increased to about 90 requests per second (on the same machine). In a separate benchmark test on a server at a data-center with a 30 Mbit connection I was able to get more than 300 requests per second.
Will I be able to detect client disconnects (non gracefully)?
That's one thing I haven't had to try yet, since all of my code has been on the client side.
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
You can build your own socket client in a couple of days and it can be very robust and fast... much faster than most of the stuff I've seen "off the shelf". Feel free to take a look at my asynchronous socket client: http://codesprout.blogspot.com/2011/04/asynchronous-http-client.html
Update:
(Per Mikey's comments)
I'm asking you for a generic, technical explanation of how NBS increase throughput as opposed to a properly designed BS server.
Let's take a high load server as an example: say your server is supposed to handle 1000 connections at any given time, with blocking sockets you would have to create 1000 threads and even if they're mostly idle, the CPU will still spend a lot of time context switching. As the number of clients increases you will have to increase the number of threads in order to keep up and the CPU will inevitably increase the context switching. For every connection you establish with a blocking socket, you will incur the overhead of spawning of a new thread and you eventually you will incur the overhead of cleaning up after the thread. Of course, the first thing that comes to mind is: why not use the ThreadPool, you can reuse the threads and reduce the overhead of creating/cleaning-up of threads.
Here is how this is handled on Windows (hence the .NET connection): sure you could, but the first thing you'll notice with the .NET ThreadPool is that it has two types of threads and it's not a coincidence: user threads and I/O completion port threads. Asynchronous sockets use the IO completion ports which "allows a single thread to perform simultaneous I/O operations on different handles, or even simultaneous read and write operations on the same handle."(1) The I/O completion port threads are specifically designed to handle I/O in a much more efficient way than you would ever be able to achieve if you used the user threads in ThreadPool, unless you wrote your own kernel-mode driver.
"The completion port uses some special voodoo to make sure only a specific number of threads can run at once — if one thread blocks in kernel-mode, it will automatically start up another one."(2)
There are other advantages also: "in addition to the nonblocking advantage of the overlapped socket I/O, the other advantage is better performance because you save a buffer copy between the TCP stack buffer and the user buffer for each I/O call." (3)

I am using Indy and Synapse TCP libraries with good results for some years now, and did not find any showstoppers in them. I use the libraries in threads - client and server side, stability and performance was not a problem. (Six thousand request and response messages per second and more with the server running on the same system are typical.)
Blocking sockets are very useful if the protocol is more advanced than a simple 'send a string / receive a string'. Non-blocking sockets cause a higher coupling of message protocol handlers with the socket read / write logic, so I quickly moved away from non-blocking code.
No library can overcome the limitations of the TCP/IP protocol regarding detection of connection loss. Only trying to read or send data can tell wether the connection is still present.

In Windows, there is a third option which is overlapped I/O. Non-blocking sockets are essential a model using Windows messages developed to avoid single-threaded GUI apps to become "blocked" while waiting for data. A modern application IMHO would be better designed using threads and overlapped I/O.
See for example http://support.microsoft.com/kb/181611

Aahhrrgghh - the myth of being able to always detect "dropped" connections. If you pull the power on a machine with a client connection then the server cannot tell, without sending data, that the connection is "dead". The is through the design of the TCP protocol. Don't take my word for it - read this article (Detection of Half-Open (Dropped) TCP/IP Socket Connections).

This article explains the main differences between blocking and non-blocking:
Introduction to Indy, by Chad Z. Hower
Pros of Blocking
Easy to program - Blocking is very easy to program. All user code can
exist in one place, and in a
sequential order.
Easy to port to Unix - Since Unix uses blocking sockets, portable code
can be written easily. Indy uses this
fact to achieve its single source
solution.
Work well in threads - Since blocking sockets are sequential they
are inherently encapsulated and
therefore very easily used in threads.
Cons of Blocking
User Interface "Freeze" with clients - Blocking socket calls do not
return until they have accomplished
their task. When such calls are made
in the main thread of an application,
the application cannot process the
user interface messages. This causes
the User Interface to "freeze" because
the update, repaint and other messages
cannot be processed until the blocking
socket calls return control to the
applications message processing loop.
He also wrote:
Blocking is NOT Evil
Blocking sockets have been repeatedly
attacked with out warrant. Contrary to
popular belief, blocking sockets are
not evil.
It is not is an issue of all blocking sockets components that they are unable to detect a client disconnect. There is no technical advantage on the side of non-blocking components in this area.

What is the most common approach for designing large scale server programs?

Ok I know this is pretty broad, but let me narrow it down a bit. I've done a little bit of client-server programming but nothing that would need to handle more than just a couple clients at a time. So I was wondering design-wise what the most mainstream approach to these servers is. And if people could reference either tutorials, books, or ebooks.
Haha ok. didn't really narrow it down. I guess what I'm looking for is a simple but literal example of how the server side program is setup.
The way I see it: client sends command: server receives command and puts into queue, server has either a single dedicated thread or a thread pool that constantly polls this queue, then sends the appropriate response back to the client. Is non-blocking I/O often used?
I suppose just tutorials, time and practice are really what I need.
*EDIT: Thanks for your responses! Here is a little more of what I'm trying to do I suppose.
This is mainly for the purpose of learning so I'd rather steer away from use of frameworks or libraries as much as I can. Take for example this somewhat made up idea:
There is a client program it does some function and constantly streams the output to a server(there can be many of these clients), the server then creates statistics and stores most of the data. And lets say there is an admin client that can log into the server and if any clients are streaming data to the server it in turn would stream that data to each of the admin clients connected.
This is how I envision the server program logic:
The server would have 3 Threads for managing incoming connections(one for each port listening on) then spawning a thread to manage each connection:
1)ClientConnection which would basically just receive output, which we'll just say is text
2)AdminConnection which would be for sending commands between server and admin client
3)AdminDataConnection which would basically be for streaming client output to the admin client
When data comes in from a client to the server the server parses what is relevant and puts that data in a queue lets say adminDataQueue. In turn there is a Thread that watches this queue and every 200ms(or whatever) would check the queue to see if there is data, if there is, then cycle through the AdminDataConnections and send it to each.
Now for the AdminConnection, this would be for any commands or direct requests of data. So you could request for statistics, the server-side would receive the command for statistics then send a command saying incoming statistics, then immediately after that send a statistics object or data.
As for the AdminDataConnection, it is just the output from the clients with maybe a few simple commands intertwined.
Aside from the bandwidth concerns of the logical problem of all the client data being funneled together to each of the admin clients. What sort of problems would arise from this design due to scaling issues(again neglecting bandwidth between clients and server; and admin clients and server.

There are a couple of basic approaches to doing this.
Worker threads or processes. Apache does this in most of its multiprocessing modes. In some versions of this, a thread or process is spawned for each request when the request arrives; in other versions, there's a pool of waiting threads which are assigned work as it arrives (avoiding the fork/thread create overhead when the request arrives).
Asynchronous (non-blocking) I/O and an event loop. This is basically using the UNIX select call (although both FreeBSD and Linux provide more optimized alternatives such as kqueue). lighttpd uses this approach and is able to achieve very high scalability, but any in-server computation blocks all other requests. Concurrent dynamic request handling is passed on to separate processes (via CGI) or waiting processes (via FastCGI or its equivalent).
I don't have any particular references handy to point you to, but if you look at the web sites for open source projects using the different approaches for information on their design wouldn't be a bad start.
In my experience, building a worker thread/process setup is easier when working from the ground up. If you have a good asynchronous framework that integrates fully with your other communications tasks (such as database queries), however, it can be very powerful and frees you from some (but not all) thread locking concerns. If you're working in Python, Twisted is one such framework. I've also been using Lwt for OCaml lately with good success.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart