Local multithreaded server: how to make it simple? - pthreads

There is a task - I need a client-server app with the properties below:
the client uses blocking sockets
the server is passive: it accepts incoming connections, receives some data, processes it and sends the results back
the server has several worker threads (besides the main one), from 1 to about 100
everything runs on one machine, i.e. both client and server use localhost
the maximum size of the data transferred is about 1 KB
At the beginning I implemented an epoll-based server:
the main thread does I/O using epoll, worker threads do the data processing
all server sockets are non-blocking
the server has two queues: an incoming queue where the main thread puts client requests, and an outgoing queue from which the main thread takes replies
It works, but then I thought about that design again: it doesn't take localhost into account. It's more universal and therefore more complicated than it needs to be. It would be nice to make it simpler.
So:
the main thread does epoll_wait and accept, and manages a connection queue into which it puts accepted connections
the listening socket is non-blocking; accepted sockets are blocking
when a socket becomes readable, the main thread wakes up a worker thread; the worker does recv in blocking mode, processes the data and does send in blocking mode (see the sketch below).
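Roughly, a C sketch of what I have in mind (process() is just a stub for the real data processing; error handling and queue-overflow checks are left out):

/* Rough sketch: one epoll/accept thread, blocking worker threads, and a
   connection queue in between. process() is a stub for the real work;
   error handling and queue-overflow checks are left out. */
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <unistd.h>

#define MAX_QUEUE 1024
#define N_WORKERS 4

static int conn_queue[MAX_QUEUE];          /* readable client sockets */
static int q_head, q_tail;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

static void queue_push(int fd)
{
    pthread_mutex_lock(&q_lock);
    conn_queue[q_tail] = fd;
    q_tail = (q_tail + 1) % MAX_QUEUE;
    pthread_cond_signal(&q_cond);          /* wake one sleeping worker */
    pthread_mutex_unlock(&q_lock);
}

static int queue_pop(void)
{
    pthread_mutex_lock(&q_lock);
    while (q_head == q_tail)
        pthread_cond_wait(&q_cond, &q_lock);
    int fd = conn_queue[q_head];
    q_head = (q_head + 1) % MAX_QUEUE;
    pthread_mutex_unlock(&q_lock);
    return fd;
}

static void process(char *buf, ssize_t *len) { (void)buf; (void)len; /* stub */ }

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        int fd = queue_pop();                      /* fd is already readable */
        char buf[1024];                            /* requests are ~1 KB max */
        ssize_t n = recv(fd, buf, sizeof buf, 0);  /* blocking recv */
        if (n <= 0) { close(fd); continue; }
        process(buf, &n);
        send(fd, buf, n, 0);                       /* blocking send */
        close(fd);   /* a keep-alive version would re-add the fd to epoll instead */
    }
    return NULL;
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(5555),
                                .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);
    fcntl(lfd, F_SETFL, O_NONBLOCK);       /* only the listening socket is non-blocking */

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
    epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

    pthread_t tid[N_WORKERS];
    for (int i = 0; i < N_WORKERS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(ep, events, 64, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == lfd) {        /* new connections to accept */
                int cfd;
                while ((cfd = accept(lfd, NULL, NULL)) >= 0) {
                    /* accepted sockets stay blocking; just watch them for data */
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = cfd };
                    epoll_ctl(ep, EPOLL_CTL_ADD, cfd, &cev);
                }
            } else {                               /* a client socket is readable */
                epoll_ctl(ep, EPOLL_CTL_DEL, events[i].data.fd, NULL);
                queue_push(events[i].data.fd);     /* hand it to a worker */
            }
        }
    }
}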
What do you think about it? Is that design good enough?
Thanks for your comments!

Related

queue and flow control with delphi using indy

I have a client-server application (TCP) that's built with Indy in Delphi.
I want to have a queue and flow control in my server-side application.
My server should not lose any client data when the server's traffic is at its limit.
For example, in my server-side application I want to set the server's maximum bandwidth to 10 Mbps, and when that 10 Mbps is fully used, the other clients should wait in a queue until bandwidth becomes free.
So I want to know how I can design this with Delphi?
Thanks,
best regards
The client should not send the message directly to the server. Put the message in a local store (e.g. an SQLite DB), and in a thread read the first message from the local store and try to send it to the server.
If the message was delivered to the server (no exception raised), delete the message from the local store and process the next "first" message in the local store.
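To sketch the idea (shown here in plain C with SQLite rather than Delphi; send_to_server() is a made-up placeholder for whatever actually delivers the message):

/* Client-side store-and-forward sketch (shown in C with SQLite rather than
   Delphi): messages wait in a local "outbox" table and are deleted only
   after successful delivery. send_to_server() is a placeholder. */
#include <sqlite3.h>
#include <unistd.h>

/* placeholder transport: pretend delivery always succeeds (return 0) */
static int send_to_server(const void *data, int len) { (void)data; (void)len; return 0; }

static void sender_loop(sqlite3 *db)
{
    for (;;) {
        sqlite3_stmt *st;
        sqlite3_prepare_v2(db, "SELECT id, payload FROM outbox ORDER BY id LIMIT 1",
                           -1, &st, NULL);
        if (sqlite3_step(st) == SQLITE_ROW) {
            sqlite3_int64 id = sqlite3_column_int64(st, 0);
            const void *data = sqlite3_column_blob(st, 1);
            int len          = sqlite3_column_bytes(st, 1);

            if (send_to_server(data, len) == 0) {
                /* delivered: remove it, then move on to the next "first" message */
                sqlite3_stmt *del;
                sqlite3_prepare_v2(db, "DELETE FROM outbox WHERE id = ?", -1, &del, NULL);
                sqlite3_bind_int64(del, 1, id);
                sqlite3_step(del);
                sqlite3_finalize(del);
            } else {
                sleep(1);       /* server unreachable: retry later, message stays queued */
            }
        } else {
            sleep(1);           /* outbox empty: poll again later */
        }
        sqlite3_finalize(st);
    }
}

int main(void)
{
    sqlite3 *db;
    sqlite3_open("outbox.db", &db);
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS outbox"
                     "(id INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB)",
                 NULL, NULL, NULL);
    sender_loop(db);            /* in a real client this runs in its own thread */
    return 0;
}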
Within the TIdTCPServer.OnExecute method which receives the client data, it is possible to 'delay' processing of the incoming request with a simple Sleep command. The client data will stay in the TCP socket until the Sleep command finishes.
If your server keeps track of the current 'global' bandwidth usage for all clients, it is possible to set the Sleep time dynamically. You could even set different priorities for different clients.
So you would need a simple but thread safe bandwidth usage monitor, an algorithm which calculates sensible Sleep time values, and a way to assign this Sleep time to the individual client connection contexts.
See also:
https://en.wikipedia.org/wiki/Token_bucket
https://github.com/bandwidth-throttle/token-bucket for an example implementation in PHP
http://www.nurkiewicz.com/2011/03/tenfold-increase-in-server-throughput.html
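This is not Indy/Delphi code, but a minimal token-bucket sketch in C to illustrate the idea behind those links; the 10 Mbps rate, the 64 KB burst size and the function names are just example values:

/* Minimal token-bucket sketch (not Indy/Delphi code): call bucket_consume()
   before transferring a chunk; if it returns a positive value, sleep that
   many milliseconds first. The rate, burst size and names are made up. */
#include <time.h>

typedef struct {
    double capacity;              /* burst size in bytes */
    double tokens;                /* currently available bytes */
    double rate;                  /* refill rate in bytes per second */
    struct timespec last;         /* time of the last refill */
} bucket_t;

static void bucket_init(bucket_t *b, double rate_bits_per_sec, double burst_bytes)
{
    b->capacity = burst_bytes;
    b->tokens   = burst_bytes;
    b->rate     = rate_bits_per_sec / 8.0;     /* bits/s -> bytes/s */
    clock_gettime(CLOCK_MONOTONIC, &b->last);
}

/* returns 0 if 'bytes' may be transferred now, otherwise the suggested
   delay in milliseconds before trying again (the Sleep value) */
static long bucket_consume(bucket_t *b, double bytes)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    double elapsed = (now.tv_sec - b->last.tv_sec)
                   + (now.tv_nsec - b->last.tv_nsec) / 1e9;
    b->last = now;

    b->tokens += elapsed * b->rate;            /* refill */
    if (b->tokens > b->capacity)
        b->tokens = b->capacity;

    if (b->tokens >= bytes) {
        b->tokens -= bytes;
        return 0;
    }
    return (long)(((bytes - b->tokens) / b->rate) * 1000.0);
}

int main(void)
{
    bucket_t global;
    bucket_init(&global, 10e6, 64 * 1024);     /* 10 Mbps, 64 KB burst */
    long wait_ms = bucket_consume(&global, 1400);
    (void)wait_ms;                             /* e.g. Sleep(wait_ms) in OnExecute */
    return 0;
}

A bucket shared by all client connections would additionally need a critical section around bucket_consume(), which is the "thread safe bandwidth usage monitor" part; per-client buckets would give you the per-client priorities mentioned above.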

Cowboy web server - improve performance

Cowboy is a web server written in Erlang. It spawns a new process for each request and then uses that process for subsequent requests if HTTP pipelining (sending multiple requests on the same socket one after the other without waiting for the responses, assuming the responses will be sent back in the same order as the requests) is used by the client.
This is fine, but if you want to use this web server for building a realtime web app, it has one problem: when the socket is closed, for instance because of client network problems, the process representing that socket on the server is terminated. That means you can't use that process for storing session data, because in a realtime web app you probably want to live beyond the end of the HTTP request (if long polling is used, for instance), keep some state associated with the connected client and consider him "online" even after the HTTP request has ended.
In SockJS, this is solved by spawning one more process for each client (each session ID).
So if you have 2000 clients using WebSockets, you will have around 4000 processes: one Cowboy process that represents the socket, and one more that keeps the session state alive in case the Cowboy process is terminated (for instance because of network problems).
THE QUESTION IS: I am relatively new to Erlang, so I don't know whether it makes much sense as a performance improvement, but I am thinking about rewriting the Cowboy web server a bit so that the process representing the realtime connection does not end until I want it to (the process stays alive even when the underlying WebSocket is terminated).
This would eliminate the need for one more session process per client, so instead of 4000 processes you would have just 2000. Could that be a big performance boost in Erlang?
Erlang is pretty good with processes, but too much of anything isn't good. Using processes as direct mappings to sessions is not a good idea. Why not do it logically? You can use some in-memory storage, say ETS, or even Mnesia.
If you are using WebSockets to communicate, each user is connected via one such process; you simply map a random unique session key to each individual process, and hence to each individual user.
-record(client,{web_sock_pid, session_key,username}).
If the process exits, and the client end has a way of reconnecting, then once it re-identifies itself as the same user the session key still holds; only the pid of the attached process has changed, and that does not matter.
If it is NOT WebSockets, and it is just HTTP REST/JSON/JSONP/XML services, then it is even easier. Use ETS tables in RAM: a new session is stored and the parameters defining that session are kept in RAM; then, for each request, the session key comes along with the other parameters. Message delivery is done via Comet or frequent checks by the client end.
Sounds like you are doing some premature optimization, if you ask me.
Erlang processes are very inexpensive. You shouldn't really have to worry about spawning too many processes.
Write it with two processes per WebSocket, then do some measurements to see where it is using the most memory and wasting the most CPU cycles.

How do I keep Advantage Database connections from timing out?

I have a Windows service that works with an Advantage database and occasionally makes some HTTP calls. On rare occasions these calls can take so long that my database connection times out. I'm not using a data module or anything; I'm just creating the connection manually.
My primary question is: what usually prevents the connection from timing out if I just haven't used it in a while? Do the TAdsComponents send a keep-alive message in the background somehow? Is that dependent on the VCL, so that I don't have it in my service? Somehow I feel like creating a thread to make my HTTP call, and checking in the main thread every few seconds for it to finish, would prevent the connection from dying. Is that ever true?
Yes, there is a keepalive mechanism, as you expect. The client (for all communication types: TCP, UDP, shared memory) sends a "ping" to the server every so often to let the server know that the connection is still alive. The frequency of that keepalive ping is based on the server configuration parameter CLIENT_TIMEOUT. With the default settings, I believe the keepalive ping is sent every 30 seconds.
The keepalive logic runs in a separate thread that is started by the code that handles the communication. In other words, it does not depend on any of the VCL components; if you have a connection to the server, then that thread should be running.
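For what it's worth, the general shape of such a keepalive thread looks roughly like this (a generic C sketch, not the actual Advantage client code; send_ping() and the 30-second interval are assumptions):

/* Generic shape of a client-side keepalive thread (NOT the Advantage client
   code; send_ping() and the 30-second interval are placeholders): the ping
   keeps an otherwise idle connection from being considered dead. */
#include <pthread.h>
#include <unistd.h>

#define KEEPALIVE_INTERVAL_SEC 30          /* assumed; Advantage derives it from CLIENT_TIMEOUT */

/* placeholder for whatever "ping" the protocol uses; returns 0 on success */
static int send_ping(int connection_handle) { (void)connection_handle; return 0; }

static volatile int connected = 1;

static void *keepalive_thread(void *arg)
{
    int handle = *(int *)arg;
    while (connected) {
        sleep(KEEPALIVE_INTERVAL_SEC);
        if (connected && send_ping(handle) != 0)
            break;                         /* server unreachable: let the app deal with it */
    }
    return NULL;
}

int main(void)
{
    int handle = 42;                       /* stand-in for a real connection handle */
    pthread_t t;
    pthread_create(&t, NULL, keepalive_thread, &handle);
    sleep(5);                              /* application work (long HTTP calls, etc.) */
    connected = 0;                         /* shutting down */
    pthread_join(t, NULL);                 /* waits out the current sleep in this sketch */
    return 0;
}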
One way to check if your connections are timing out is to look in the Advantage error log. There should be 7020 errors corresponding to timed out connections.
Some things that come to mind that might result in timed out connections include:
The client process being suspended for some reason so that the keepalive thread could not run. This seems unlikely.
The keepalive thread was killed for some reason. This also seems unlikely; you would have to go out of your way to make this happen.
A firewall may close the connection if there is no activity for a time. I would think, though, that a 30 second interval would be sufficient to prevent that.
A firewall may disallow the UDP keepalive packets. Firewalls, by nature, are "suspicious" of UDP packets. You might make sure you are using TCP/IP.

What would I gain by changing from blocking to non-blocking sockets?

We have an application server developed with Delphi 2010 and Indy 10. This server receives more than 50 requests per second and it works well. But in some cases, it seems to me that Indy is very obscure. Its components are good, but sometimes I find myself digging into the source code just to understand a simple thing. Indy lacks good documentation and good support.
The last thing I came across was a big problem for me: I must detect when a client disconnects non-gracefully (when the client crashes or shuts down, for instance, without telling the server that it will disconnect), and Indy was not able to do that. If I want that, I have to implement something like a heartbeat, polling or TCP keep-alive. I do not want to spend more time doing what is, at least I think, the component's job. After some study, I found out that this is not Indy's fault, but an issue with all blocking socket components.
Now I am really thinking of changing the core of the server to another good suite. I must admit I am leaning towards non-blocking sockets. Based on that, I have some questions:
What would I gain by changing from blocking to non-blocking sockets?
Will I be able to detect client disconnects (non-graceful ones)?
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
I know this must be a subjective question, but I really want to hear it from you. The first question is the one I care about most. I do not care if I have to pay 100, 500, 1000 or 10000 dollars, but I want a complete solution. For now, I am thinking about Ip*works.
EDIT
I think some of you do not understand what I want. I don't want to create my own socket library. I have been working with sockets for a long time and I am getting tired of it. Really.
And non-blocking sockets CAN detect client disconnects. That is a fact and it is well documented all over the internet. A non-blocking socket checks the socket state for new incoming data all the time, and that makes it possible to detect that the socket is no longer valid. This is not a heartbeat algorithm. A heartbeat algorithm is used on the client side; it periodically sends packets (a.k.a. keep-alives) to the server to tell it that the client is still alive.
EDIT
I am not making myself clear, maybe because English is not my first language. I am not saying that it is possible to detect a dropped connection without trying to send or receive data on a socket. What I am saying is that every non-blocking socket is able to do that, because it constantly tries to read from the socket for new incoming data. Why is that so hard to understand? If you download and run the Ip*works demos, in particular the echoserver and echoclient ones (both use TCP), you can test it yourselves. I already tested it, and it works as I expected. Even if you use the old TCPSocketServer and TCPSocketClient in non-blocking mode, you will see what I mean.
"What do a benefit from changing from blocking to non-blocking sockets? Will I be able to detect client disconnects (non gracefully)?"
Just my two cents to get the ball rolling on this question - I'm not a socket EXPERT, but I do have a good deal of experience with them. If I'm mistaken, I'm sure someone will correct me... :-)
I assume that since you're running a server using blocking sockets with 50 connections per second, you have a threading mechanism in place to handle client requests. If so, you don't really stand to gain anything from non-blocking sockets. On the contrary - you will have to change your server logic to be event-driven, based on events fired in your main thread by the non-blocking sockets, or use constant polling to know what your sockets are up to.
Non-blocking sockets can't detect clients disconnecting without notification any more than blocking sockets can - they don't have telepathic powers... The nature of the TCP/IP 'conversation' between client and server is the same - blocking and non-blocking is only with respect to your application's interaction with the socket connection conducting the 'conversation'.
If you need to purge dead connections, you need to implement a heartbeat or timeout mechanism on your socket (I've never seen a modern socket implementation that didn't support timeouts).
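To put it in plain C terms: whether the socket is blocking or non-blocking, you only learn about a dead peer when a read reports an orderly close, when a read/write fails, or when your own heartbeat deadline expires. A rough sketch (the 30-second timeout and check_connection() are just illustrative, not any particular library's API):

/* Regardless of blocking mode, a dead peer shows up in only three ways:
   recv() == 0 (orderly close), a hard error from the stack, or our own
   heartbeat deadline expiring. check_connection() is just illustrative. */
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define HEARTBEAT_TIMEOUT_SEC 30      /* example value */

/* returns 1 if the connection should be considered dead */
static int check_connection(int fd, time_t last_heard_from)
{
    char buf[1024];
    ssize_t n = recv(fd, buf, sizeof buf, MSG_DONTWAIT);

    if (n == 0)
        return 1;                     /* peer closed the connection gracefully */
    if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
        return 1;                     /* hard error reported by the TCP stack */
    if (n > 0)
        return 0;                     /* got data: clearly alive (process buf) */

    /* No data and no error: we simply don't know. A pulled cable or a crashed
       client looks exactly like silence, so fall back to a heartbeat deadline. */
    return (time(NULL) - last_heard_from) > HEARTBEAT_TIMEOUT_SEC;
}

int main(void)
{
    int sv[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, sv);    /* fake client/server pair */
    printf("idle peer:   dead=%d\n", check_connection(sv[0], time(NULL)));
    close(sv[1]);                               /* the "client" goes away */
    printf("closed peer: dead=%d\n", check_connection(sv[0], time(NULL)));
    return 0;
}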
What would I gain by changing from blocking to non-blocking sockets?
Increased speed, availability, and throughput (from my experience). I had an IndySockets client that was getting about 15 requests per second and when I went directly to asynchronous sockets the throughput increased to about 90 requests per second (on the same machine). In a separate benchmark test on a server at a data-center with a 30 Mbit connection I was able to get more than 300 requests per second.
Will I be able to detect client disconnects (non-graceful ones)?
That's one thing I haven't had to try yet, since all of my code has been on the client side.
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
You can build your own socket client in a couple of days and it can be very robust and fast... much faster than most of the stuff I've seen "off the shelf". Feel free to take a look at my asynchronous socket client: http://codesprout.blogspot.com/2011/04/asynchronous-http-client.html
Update:
(Per Mikey's comments)
I'm asking you for a generic, technical explanation of how NBS (non-blocking sockets) increase throughput as opposed to a properly designed BS (blocking-socket) server.
Let's take a high-load server as an example: say your server is supposed to handle 1000 connections at any given time. With blocking sockets you would have to create 1000 threads, and even if they're mostly idle, the CPU will still spend a lot of time context switching. As the number of clients increases you will have to increase the number of threads in order to keep up, and the CPU will inevitably do more context switching. For every connection you establish with a blocking socket, you incur the overhead of spawning a new thread, and eventually the overhead of cleaning up after the thread. Of course, the first thing that comes to mind is: why not use the ThreadPool? You can reuse the threads and reduce the overhead of creating and cleaning up threads.
Here is how this is handled on Windows (hence the .NET connection): sure you could, but the first thing you'll notice with the .NET ThreadPool is that it has two types of threads, and it's not a coincidence: user threads and I/O completion port threads. Asynchronous sockets use I/O completion ports, which "allows a single thread to perform simultaneous I/O operations on different handles, or even simultaneous read and write operations on the same handle."(1) The I/O completion port threads are specifically designed to handle I/O in a much more efficient way than you could ever achieve with the user threads in the ThreadPool, unless you wrote your own kernel-mode driver.
"The com­ple­tion port uses some spe­cial voodoo to make sure only a spe­cif­ic num­ber of threads can run at once — if one thread blocks in ker­nel-​mode, it will au­to­mat­i­cal­ly start up an­oth­er one."(2)
There are other advantages also: "in addition to the nonblocking advantage of the overlapped socket I/O, the other advantage is better performance because you save a buffer copy between the TCP stack buffer and the user buffer for each I/O call." (3)
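For contrast, the blocking thread-per-connection model described above looks roughly like this; a plain C sketch of the general pattern (not the poster's .NET code), just to show where the one-thread-per-client cost comes from:

/* The classic blocking thread-per-connection model: every accepted client
   costs one dedicated thread for the whole lifetime of the connection. */
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

static void *handle_client(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[4096];
    ssize_t n;
    /* the thread spends most of its life blocked here, but it still costs
       a stack and shows up in the scheduler's bookkeeping */
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        send(fd, buf, n, 0);                  /* trivial echo "processing" */
    close(fd);
    return NULL;
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(6666),
                                .sin_addr.s_addr = htonl(INADDR_ANY) };
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);    /* blocking accept */
        if (cfd < 0) continue;
        /* 1000 concurrent clients -> 1000 threads to create, schedule and
           eventually clean up; that is the overhead discussed above */
        pthread_t t;
        pthread_create(&t, NULL, handle_client, (void *)(intptr_t)cfd);
        pthread_detach(t);
    }
}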
I have been using the Indy and Synapse TCP libraries with good results for some years now, and did not find any showstoppers in them. I use the libraries in threads, on both the client and server side; stability and performance have not been a problem. (Six thousand request and response messages per second and more, with the server running on the same system, are typical.)
Blocking sockets are very useful if the protocol is more advanced than a simple 'send a string / receive a string'. Non-blocking sockets cause a higher coupling of message protocol handlers with the socket read / write logic, so I quickly moved away from non-blocking code.
No library can overcome the limitations of the TCP/IP protocol regarding detection of connection loss. Only trying to read or send data can tell whether the connection is still present.
In Windows, there is a third option, which is overlapped I/O. Non-blocking sockets are essentially a model using Windows messages, developed to keep single-threaded GUI apps from becoming "blocked" while waiting for data. A modern application IMHO would be better designed using threads and overlapped I/O.
See for example http://support.microsoft.com/kb/181611
Aahhrrgghh - the myth of being able to always detect "dropped" connections. If you pull the power on a machine with a client connection, then the server cannot tell, without sending data, that the connection is "dead". This is by design in the TCP protocol. Don't take my word for it - read this article (Detection of Half-Open (Dropped) TCP/IP Socket Connections).
This article explains the main differences between blocking and non-blocking:
Introduction to Indy, by Chad Z. Hower
Pros of Blocking
Easy to program - Blocking is very easy to program. All user code can exist in one place, and in a sequential order.
Easy to port to Unix - Since Unix uses blocking sockets, portable code can be written easily. Indy uses this fact to achieve its single source solution.
Work well in threads - Since blocking sockets are sequential they are inherently encapsulated and therefore very easily used in threads.
Cons of Blocking
User Interface "Freeze" with clients - Blocking socket calls do not return until they have accomplished their task. When such calls are made in the main thread of an application, the application cannot process the user interface messages. This causes the User Interface to "freeze" because the update, repaint and other messages cannot be processed until the blocking socket calls return control to the application's message processing loop.
He also wrote:
Blocking is NOT Evil
Blocking sockets have been repeatedly attacked without warrant. Contrary to popular belief, blocking sockets are not evil.
It is not an issue of all blocking socket components that they are unable to detect a client disconnect. There is no technical advantage on the side of non-blocking components in this area.

Erlang and Thrift

I want to make a Windows Service using Erlang and Thrift.
The service will have a single thread listening on a port (socket communication) and sending requests to worker threads. The Windows service has to respond quickly (within milliseconds) and throughput (requests per second) is mandatory.
The worker threads will communicate with each other. I am thinking of Erlang to solve this.
So I think Erlang + Thrift will work well. Am I right? Any suggestions?
Your solution is reasonable. To bring you up to speed I would suggest reading up on gen_server, supervisor and application.
Thrift will generate stub files which, once compiled, give you a transport/acceptor. It's up to you to provide both the Thrift API and the handler for that API.
Moreover, be advised not to synchronize a lot between processes if you need fast response times (i.e. don't design your solution around synchronizing calls).
