What is the most common approach for designing large scale server programs?

What is the most common approach for designing large scale server programs? - scalability

Ok I know this is pretty broad, but let me narrow it down a bit. I've done a little bit of client-server programming but nothing that would need to handle more than just a couple clients at a time. So I was wondering design-wise what the most mainstream approach to these servers is. And if people could reference either tutorials, books, or ebooks.
Haha ok. didn't really narrow it down. I guess what I'm looking for is a simple but literal example of how the server side program is setup.
The way I see it: client sends command: server receives command and puts into queue, server has either a single dedicated thread or a thread pool that constantly polls this queue, then sends the appropriate response back to the client. Is non-blocking I/O often used?
I suppose just tutorials, time and practice are really what I need.
*EDIT: Thanks for your responses! Here is a little more of what I'm trying to do I suppose.
This is mainly for the purpose of learning so I'd rather steer away from use of frameworks or libraries as much as I can. Take for example this somewhat made up idea:
There is a client program it does some function and constantly streams the output to a server(there can be many of these clients), the server then creates statistics and stores most of the data. And lets say there is an admin client that can log into the server and if any clients are streaming data to the server it in turn would stream that data to each of the admin clients connected.
This is how I envision the server program logic:
The server would have 3 Threads for managing incoming connections(one for each port listening on) then spawning a thread to manage each connection:
1)ClientConnection which would basically just receive output, which we'll just say is text
2)AdminConnection which would be for sending commands between server and admin client
3)AdminDataConnection which would basically be for streaming client output to the admin client
When data comes in from a client to the server the server parses what is relevant and puts that data in a queue lets say adminDataQueue. In turn there is a Thread that watches this queue and every 200ms(or whatever) would check the queue to see if there is data, if there is, then cycle through the AdminDataConnections and send it to each.
Now for the AdminConnection, this would be for any commands or direct requests of data. So you could request for statistics, the server-side would receive the command for statistics then send a command saying incoming statistics, then immediately after that send a statistics object or data.
As for the AdminDataConnection, it is just the output from the clients with maybe a few simple commands intertwined.
Aside from the bandwidth concerns of the logical problem of all the client data being funneled together to each of the admin clients. What sort of problems would arise from this design due to scaling issues(again neglecting bandwidth between clients and server; and admin clients and server.

There are a couple of basic approaches to doing this.
Worker threads or processes. Apache does this in most of its multiprocessing modes. In some versions of this, a thread or process is spawned for each request when the request arrives; in other versions, there's a pool of waiting threads which are assigned work as it arrives (avoiding the fork/thread create overhead when the request arrives).
Asynchronous (non-blocking) I/O and an event loop. This is basically using the UNIX select call (although both FreeBSD and Linux provide more optimized alternatives such as kqueue). lighttpd uses this approach and is able to achieve very high scalability, but any in-server computation blocks all other requests. Concurrent dynamic request handling is passed on to separate processes (via CGI) or waiting processes (via FastCGI or its equivalent).
I don't have any particular references handy to point you to, but if you look at the web sites for open source projects using the different approaches for information on their design wouldn't be a bad start.
In my experience, building a worker thread/process setup is easier when working from the ground up. If you have a good asynchronous framework that integrates fully with your other communications tasks (such as database queries), however, it can be very powerful and frees you from some (but not all) thread locking concerns. If you're working in Python, Twisted is one such framework. I've also been using Lwt for OCaml lately with good success.

Related

How to use Kubernetes to do multiplayer online game with websocket?

If develop a online real time game with websocket, multiplayers running on the different containers, how to sync data when add or reduce containers if they are playing?
Does kubernetes has any good feature on this case?

ThatBrianDude already gave an awesome answer, and mine will not be that good. But I think your last comment gave us more hints about the architecture you have in mind. I hope my humble answer will shed a light on more ideas to your game. Here are some suggestions:
First, avoid keeping any state in the websocket apps.
The basic idea with containers is that they should be stateless.
ThatBrianDude
So, why not use caches and a messaging layer to help you with that. Imagine the following examples:
Situation 1: if the client sends an action to the websocket server, the server should put it in a queue/topic (some other service will process it later on).
Situation 2: The server might also listen to a(some) topic(s) for some types of messages, and send them back to the clients that need that information.
Situation 3: when the client asks for information or if the websocket server needs some information to send to the client, the server must read it from a cache, as reading from DB might be slow for a multiplayer game.
Situation 4: eventually a container is killed. The clients connected to that server will receive a connection error, and should reconnect. That means another handshake, and the player might feel it, depending on what the game was doing, so killing a container should not happen that often. But that would be just it, no information is lost.
This way, the websocket server containers are totally stateless, and the messaging topics and caches will help you to: provide all the information needed to containers, and; keep websockets, persistance and processing isolated and scalable.
Summing up, the information would flow like this:
clients are showering the websocket server containers with actions
websocket servers just send them to the messaging layer
processing containers (which can be scalled too!) receive those messages, process them, save to the database and/or to a cache and eventually send more messages to other topics
(optional) websocket servers receive those messages and send them to the clients.
Or like this:
clients ask for information or websocket servers periodically need to send the world state to clients
websocket servers look up the information in the cache
and send it to the clients.
Or even like this:
Some processing servers are independent of messages, they just read the game/world state (from the cache?) periodically
they process the physics and mechanics of the game
and save the result back in the cache, which will be sent to the clients by the websocket servers periodically, or send it in a topic so the websocket server can listen to it and send it to the clients.
Lastly, don't forget the suggestion to have one machine responsible for one game/world. It would be nice if each processing server (or each thread of a server) works with one game/world. That would make it easier to persist things without the need to sync stuff.

The basic idea with containers is that they should be stateless.
This means that any persistant data your game might have (highscores etc.) must be saved to a persistant DB whereas other temporary data like current ingame score or nickname etc. can stay inside the memory of the container and be gone once the container dies.
how to sync data when add or reduce containers if they are playing?
This sounds like you want to use multiple containers computing one game world?
Thats a whole other beast on its own but you might want to take a look at SpatialOS which pretty much allows for massive multiplayer worlds and is designed for games that require more than one machine per world.
If thats not what you are looking for I would recommend you to keep one machine responsible for one game/world as you will avoid high complexity when you try to sync stuff later on.

Cowboy web server - improve performance

Cowboy is webserver written in erlang. It spawns new process for each request and than using that process for subsequent requests if HTTP pipelining (sending multiple requests on same socket one after the other without waiting for the response and assuming that responses will be send back in same order as requests was sent) is used by client.
This is fine, but if you want to use that webserver for building realtime web app, it has one problem and that is when socket is closed for instance because of client network problems, the process representing that socket on the server is terminated. That means you can`t use that process for storing some session data (because in realtime web app you probably want to go behind the end of the http request (if long polling is used for instance) and have some state associated to the connected client and think about him as "he is online" even if the http request was ended.
In sock.js, it is solved by spawning one more process for each client (each session id).
So if you have 2000 clients using websockets, you will have around 4k processes (one process from cowboy that represents that socket and one more for keeping the session state alive for case that cowboy process will be terminated (for instance because of network problems).
THE QUESTION IS: i am relative new in erlang so i don`t know if it does make sense much in question of performance improvement, but i am thinking about rewriting that Cowboy webserver a bit so the process representing realtime connection will not ends until i want it (the process will be alive even when the underlying websocket socket will be terminated).
This will eliminate the needs to have one more session process for each client. So instead of 4000 processes you will have just 2000. Can it be huge performance booster in erlang?

Erlang is pretty good with processes, but, too much of anything ain't good. Using processes as direct mappings to sessions is not a good idea. Why not do it logically ? I assume you can have some IN-MEMORY storage, say, ETS, or even mnesia.
If am using Web Sockets to communicate, each user is connected via one such process, however, you simply map a certain random unique Session Key to each individual Process, hence to each individual user.
-record(client,{web_sock_pid, session_key,username}).
If the process exits, and the client end has a way pf reconnecting, once it re-identifies itself as the same user, then , the session key still holds, but the pid of the attached process has changed. it does not matter.
If it is NOT web sockets, and it is just HTTP REST/JSON/JSONP/XML services , then it is even very easy. Use ETS tables in RAM. A new session is stored and the parameters defining that session are store in RAM, then for each request, the session key can come along plus other parameters. Message delivery is by comet or frequent checks by the client end.

Sounds like you are doing some premature optimizations if you ask me.
Erlang processes are very inexpensive. You shouldn't really have to worry about spawning too manny processes.
Write it with two processes per websocket, then do some measurements to see where it is using the most memory and wasting the most cpu cycles.

Websocket scalability, broadcasting concerns

If you have a complex requirement set with many users(&servers) how will your websocket infrastructure (server[s]) will scale, especially with broadcasting?
Of course, broadcasting is not part of the any websocket spec but it's there even in basic chat examples (a.k.a. hello world for websocket).
Client side (asking for new data) solution still seems more scalable than server side (broadcasting) solution with websockets' low latency and relatively cheap (http headerless) nature.
Edit:
OK, just think that you want to replace all your ajax code with websocket implementations which may mean that so many connections within so many different contexts. This adds enormous complexity to your system if you want to keep track of every possible scenario for broadcasting.
Low (network/thread etc) level implementation suggestions are also part of the problem not the solution, because this means you have to code a special server unlike general http servers.
Moreover, broadcasting brings some sort of stateful nature to the table which can't easily scale. Think about adding more servers and load balancing.

Scaling realtime web solutions can be a complex problem but one that services like Pusher (who I work for) have solved, and one that there are most definitely solutions defined for self hosted realtime web solutions - the PubSub paradigm is well understood and has been solved many times and in order to solve the problem there needs to be some state (who is subscribing to what). This paradigm is used in broadcasting the the types of scenarios that you are talking about.
Realtime web technologies have been built with large amounts of simultaneous connections in mind - many from the ground up. If you wanted to create a scalable solution you would most likely use an existing realtime web server that supports WebSockets, in the same way that it's highly unlikely that you would decide to implement your own HTTP Server you are unlikely to want to implement your own server which supports WebSockets from scratch.
Dedicated Realtime web servers also let you separate your application logic from your realtime communication mechanism (separation of concerns). Your application might need to maintain some state but the realtime technology deals with managing subscriptions and connections. How communication between the application and the realtime web technology is achieved is up to you but frequently messages queues are used and specifically redis is very popular in this space.
HTTP polling may conceptually be easier to understand - you can maintain statelessness and with each HTTP poll request you specify exactly what you are looking for. But it most definitely means that you will need to start scaling much sooner (adding more resource to handle the load).
WebSocket polling is something I've not considered before and I don't think I've seen it suggested anywhere before either; the idea that the client should say "I'm ready for my next set of data and here's what I want" is an interesting one. WebSockets have generally taken a leap away from the request/response paradigm but there may be scenarios where the increased efficiency of WebSockets and request/response using them may have some benefits. The SocketStream application framework might be worth a look as it might be relevant; after the initial application load all communication is performed over WebSockets which means that event basic request/response functionality uses WebSockets.
However, since we are talking about broadcasting data we need to go back to the PubSub paradigm where it makes much more sense to have active subscriptions and when new data is available that new data is distributed to those active subscriptions (pushed). All your application needs to know is if there are any active subscriptions or not in order to decide whether to publish the data or not. That problem has been solved.

The idea of websockets is that you keep a persistent connection with each client. When there is new data that you want to send to every client, you already know who all the clients are so you should just send it.
It sound like you want each client to constantly be sending requests to the server for new data. Why? It seems like that would waste everyone's bandwidth and I don't know why you think it will be more scalable. Maybe you could add more detail to your question like what kind of information you are broadcasting, how often, how many bytes, how many clients, etc.
Why not just consider an open websocket connection to be like a standing request from the client for more data?

Erlang web-distribution

On non-web based chat system the server distinguishes its clients by their PIDs, right? And what should be used to distinguish the clients on web-based chat system?
Thnx in advance

The fact that you're using a web server shouldn't change much about your model. You're still building chat. You also don't want to make your chats tied too deeply to the process that is managing their HTTP connection. HTTP connections are ephemeral, even if everything is going well and you're using long polling there's no guarantee that the connection will be re-used with Keep-Alive for the next long poll. The user might also want to open up the same chat in multiple browser windows, multiple computers, whatever.
I haven't looked closely at any of these but you're not the first person that has built web chat with Erlang:
http://chrismoos.com/2009/09/28/building-an-erlang-chat-server-with-comet-part-1/
http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf
http://yoan.dosimple.ch/blog/2008/05/15/
https://github.com/yrashk/socket.io-erlang (more of a general tool for this sort of thing, not chat specifically)
https://github.com/rvirding/chat_demo (as seen above)

I think the confusion comes from the notion that a Erlang server process must stay alive for every individual client. It can, but Mochiweb doesn't do that by default if I'm not mistaken. It just spawns a new process for every request. If you would like to have a long lived bidirectional client <-> server process connection you can do that for example by;
sending a client identifier with every request and map that to a long-lived process on the server. The process will maintain servers state and you can call methods on it. It's still pull and not push though.
use the web socket implementations. Not sure if Mochiweb has one, but other Erlang HTTP servers like Misultin and Yaws provide one. For a web based chat system I believe web sockets would be a great fit.

For a very trivial example of a web-based chat system using websockets and Misultin you can check out this chat demo. It was written to demonstrate an idea and is not very elegant, but it does work.

What do a benefit from changing from blocking to non-blocking sockets?

We have an application server developed with Delphi 2010 and Indy 10. This server receives more than 50 requests per second and it works well. But in some cases, it seems to me that Indy is very obscure. Their components are good, but sometimes I found myself digging into the source code only to understand a simple thing. Indy lacks on good documentation and good support.
The last thing that i came across was a big problem for me: I must detect when a client disconnects non gracefully (When the the client crashes or shutdown, for instance. Not telling the server that it will disconnect) and indy was not able to do that. If I want that, I will have to develop a algorithm like heartbeat, pooling or TCP keep-alive. I do not want to spend more time doing a, at least I think, component job. After a few study, I found out that this is not Indy's fault, but this is an issue of all blocking sockets components.
Now I am really thinking of changing the core of the Server to another good suite. I must admit I am tending to use a non-blocking socket. Based on that, I have some questions:
What do a benefit from changing from blocking to non-blocking sockets?
Will I be able to detect client disconnects (non gracefully)?
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
I know this must be a subjective question, but I really want to hear that from you. My first question is the one I care most. I do not care if I have to pay 100, 500, 1000, 10000 dollars, but I want a complete solution. For now, I am thinking about Ip*works .
EDIT
I think some guys are not understand what I want. I don't want to create my own socket. I have been working with sockets for a long time and I am getting tired of it. Really.
And non-blocking sockets CAN detect client disconnects. That is a fact and it has good documentation all over the internet. A non-blocking socket checks the socket state for new incoming data all the time, and it makes possible to detect that the socket is not valid. This is not a heartbeat algorithm. A heartbeat algorithm is used on client side and it sends periodically packets (aka keep-alive) to the server to tells it is still alive.
EDIT
I am not make myself clear. Maybe because English is not my main language. I am not saying that it is possible to detect a dropped connection without trying to send or receiving data from a socket. What I am saying is that every non-blocking socket is able to do that because they constantly tries to read from the socket for new incoming data. Why is that so hard to understand? If you guys download and run ip*works demos, in special, the echoserver and echoclient ones (both use TCP) you can test by yourselves. I already tested it, and it works like I expected to do. Even if you use the old TCPSocketServer and TCPSocketClient in a non-blocking mode you will see what I meant.

"What do a benefit from changing from blocking to non-blocking sockets? Will I be able to detect client disconnects (non gracefully)?"
Just my two cents to get the ball rolling on this question - I'm not a socket EXPERT, but I do have a good deal of experience with them. If I'm mistaken, I'm sure someone will correct me... :-)
I assume that since you're running a server using blocking sockets with 50 connections per second, you have a threading mechanism in place to handle client requests. If so, you don't really stand to gain anything from non-blocking sockets. On the contrary - you will have to change your server logic to be event driven- based on events fired in your main thread from the non-blocking sockets, or use constant polling to know what your sockets are up to.
Non-blocking sockets can't detect clients disconnecting without notification any more than blocking sockets can - they don't have telepathic powers... The nature of the TCP/IP 'conversation' between client and server is the same - blocking and non-blocking is only with respect to your application's interaction with the socket connection conducting the 'conversation'.
If you need to purge dead connections, you need to implement a heartbeat or timeout mechanism on your socket (I've never seen a modern socket implementation that didn't support timeouts).

What do a benefit from changing from blocking to non-blocking sockets?
Increased speed, availability, and throughput (from my experience). I had an IndySockets client that was getting about 15 requests per second and when I went directly to asynchronous sockets the throughput increased to about 90 requests per second (on the same machine). In a separate benchmark test on a server at a data-center with a 30 Mbit connection I was able to get more than 300 requests per second.
Will I be able to detect client disconnects (non gracefully)?
That's one thing I haven't had to try yet, since all of my code has been on the client side.
What component suite has the best product? By best product I mean: fast, good support, good tools and easy to implement.
You can build your own socket client in a couple of days and it can be very robust and fast... much faster than most of the stuff I've seen "off the shelf". Feel free to take a look at my asynchronous socket client: http://codesprout.blogspot.com/2011/04/asynchronous-http-client.html
Update:
(Per Mikey's comments)
I'm asking you for a generic, technical explanation of how NBS increase throughput as opposed to a properly designed BS server.
Let's take a high load server as an example: say your server is supposed to handle 1000 connections at any given time, with blocking sockets you would have to create 1000 threads and even if they're mostly idle, the CPU will still spend a lot of time context switching. As the number of clients increases you will have to increase the number of threads in order to keep up and the CPU will inevitably increase the context switching. For every connection you establish with a blocking socket, you will incur the overhead of spawning of a new thread and you eventually you will incur the overhead of cleaning up after the thread. Of course, the first thing that comes to mind is: why not use the ThreadPool, you can reuse the threads and reduce the overhead of creating/cleaning-up of threads.
Here is how this is handled on Windows (hence the .NET connection): sure you could, but the first thing you'll notice with the .NET ThreadPool is that it has two types of threads and it's not a coincidence: user threads and I/O completion port threads. Asynchronous sockets use the IO completion ports which "allows a single thread to perform simultaneous I/O operations on different handles, or even simultaneous read and write operations on the same handle."(1) The I/O completion port threads are specifically designed to handle I/O in a much more efficient way than you would ever be able to achieve if you used the user threads in ThreadPool, unless you wrote your own kernel-mode driver.
"The completion port uses some special voodoo to make sure only a specific number of threads can run at once — if one thread blocks in kernel-mode, it will automatically start up another one."(2)
There are other advantages also: "in addition to the nonblocking advantage of the overlapped socket I/O, the other advantage is better performance because you save a buffer copy between the TCP stack buffer and the user buffer for each I/O call." (3)

I am using Indy and Synapse TCP libraries with good results for some years now, and did not find any showstoppers in them. I use the libraries in threads - client and server side, stability and performance was not a problem. (Six thousand request and response messages per second and more with the server running on the same system are typical.)
Blocking sockets are very useful if the protocol is more advanced than a simple 'send a string / receive a string'. Non-blocking sockets cause a higher coupling of message protocol handlers with the socket read / write logic, so I quickly moved away from non-blocking code.
No library can overcome the limitations of the TCP/IP protocol regarding detection of connection loss. Only trying to read or send data can tell wether the connection is still present.

In Windows, there is a third option which is overlapped I/O. Non-blocking sockets are essential a model using Windows messages developed to avoid single-threaded GUI apps to become "blocked" while waiting for data. A modern application IMHO would be better designed using threads and overlapped I/O.
See for example http://support.microsoft.com/kb/181611

Aahhrrgghh - the myth of being able to always detect "dropped" connections. If you pull the power on a machine with a client connection then the server cannot tell, without sending data, that the connection is "dead". The is through the design of the TCP protocol. Don't take my word for it - read this article (Detection of Half-Open (Dropped) TCP/IP Socket Connections).

This article explains the main differences between blocking and non-blocking:
Introduction to Indy, by Chad Z. Hower
Pros of Blocking
Easy to program - Blocking is very easy to program. All user code can
exist in one place, and in a
sequential order.
Easy to port to Unix - Since Unix uses blocking sockets, portable code
can be written easily. Indy uses this
fact to achieve its single source
solution.
Work well in threads - Since blocking sockets are sequential they
are inherently encapsulated and
therefore very easily used in threads.
Cons of Blocking
User Interface "Freeze" with clients - Blocking socket calls do not
return until they have accomplished
their task. When such calls are made
in the main thread of an application,
the application cannot process the
user interface messages. This causes
the User Interface to "freeze" because
the update, repaint and other messages
cannot be processed until the blocking
socket calls return control to the
applications message processing loop.
He also wrote:
Blocking is NOT Evil
Blocking sockets have been repeatedly
attacked with out warrant. Contrary to
popular belief, blocking sockets are
not evil.
It is not is an issue of all blocking sockets components that they are unable to detect a client disconnect. There is no technical advantage on the side of non-blocking components in this area.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart