Are there some general Network programming best practices? - network-programming

I am implementing some networking stuff in our project. It has been decided that the communication is very important, and we want to do it synchronously: the client sends something, and the server acknowledges it.
Are there some general best practices for the interaction between the client and the server? For instance, if there is no answer from the server, should the client automatically retry? Should there be a timeout period before it retries? What happens if the acknowledgement fails? At what point do we break the connection and reconnect? Is there some material on this? I have done searches but nothing is really coming up.
I am looking for best practices in general. I am implementing this in C# (probably with sockets), so if there is anything .NET-specific, please let me know too.

First rule of networking - you are sending messages, you are not calling functions.
If you approach networking that way, and don't pretend that you can call functions remotely or have "remote objects", you'll be fine. You never have an actual "thing" on the other side of the network connection - what you have is basically a picture of that thing.
Everything you get from the network is old data; you are never up to date. Because of this, you need to make sure that your messages carry the correct semantics - for instance, you may increment or decrement something by a value, but you should not set it to its current value plus or minus another, as the current value may change by the time your message gets there.
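A minimal sketch of that idea in C# (the asker's language); the message type and field names are hypothetical:

```csharp
// Encode the operation itself, not a precomputed result.
public sealed class AdjustStockMessage
{
    public string ItemId { get; init; }
    public int Delta { get; init; }   // e.g. +5 or -3
}

// Server side, on arrival: apply the delta to the authoritative value.
//   stock[msg.ItemId] += msg.Delta;
//
// By contrast, a client that sends "set to currentValue + 5" races with
// every other client, because its copy of currentValue is already stale
// by the time the message arrives.
```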

If both the client and the server are written in .NET/C#, I would recommend WCF instead of raw sockets, as it saves you from a lot of plumbing code for serialization and deserialization, synchronization of messages, etc.
That maybe doesn't really answer your question about best practices though ;-)
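For what it's worth, a WCF contract for a synchronous send/acknowledge exchange is only a few lines. A minimal sketch, with illustrative names:

```csharp
using System.ServiceModel;   // WCF (.NET Framework)

[ServiceContract]
public interface IAckService
{
    // IsOneWay defaults to false, so the call blocks until the server
    // returns - which matches the synchronous request/acknowledge design.
    [OperationContract]
    void Submit(string payload);
}

public class AckService : IAckService
{
    public void Submit(string payload)
    {
        // Process the message; returning without a fault is the acknowledgement.
    }
}
```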

The first thing to do is to characterize your specific network in terms of speed, probability of lost messages, nominal and peak traffic, bottlenecks, client and server MTBF, ...
Then, and only then, decide what you need from your protocol. In many cases you don't need sophisticated error-handling mechanisms and can reliably implement a service with plain UDP.
In a few cases, you will need to build something much more robust in order to maintain a consistent global state among several machines connected through a network that you cannot trust.
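If plain UDP does turn out to be enough, the retry/timeout questions from the original post come down to a few lines. A sketch in C#; the ack format and retry parameters are illustrative:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class UdpRequestAck
{
    // Sends a datagram and waits for an "ACK" reply, retrying a bounded
    // number of times with a receive timeout.
    static bool SendWithRetry(string host, int port, byte[] payload,
                              int maxAttempts = 3, int timeoutMs = 500)
    {
        using var udp = new UdpClient();
        udp.Client.ReceiveTimeout = timeoutMs;
        udp.Connect(host, port);

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            udp.Send(payload, payload.Length);
            try
            {
                var remote = new IPEndPoint(IPAddress.Any, 0);
                byte[] reply = udp.Receive(ref remote);   // blocks until reply or timeout
                if (Encoding.ASCII.GetString(reply) == "ACK")
                    return true;
            }
            catch (SocketException)
            {
                // timed out - fall through and retry
            }
        }
        return false;   // give up; the caller decides whether to reconnect or alert
    }
}
```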

The most important thing I have found is that messages should always be stateless (read up on REST if this means nothing to you).
For example, if your application monitors the number of shipments over a network, do not send incremental updates (+x); always send the new total.
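A sketch of what such a self-contained message might look like in C#; the names are hypothetical:

```csharp
// The payload carries the full state, so a lost or reordered update is
// harmless as soon as a newer total arrives.
public sealed class ShipmentTotalMessage
{
    public string DepotId { get; init; }
    public int Total { get; init; }            // absolute total, not +x
    public long SequenceNumber { get; init; }  // lets the receiver drop stale totals
}
```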

For network programming in general, I think you should learn about:
1. Sockets (of course).
2. Forking and threading.
3. Locking (using a mutex, a semaphore, or similar), since concurrent handlers share state - see the sketch below.
Hope this helps.
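A minimal illustration of point 3 in C#: shared state touched by several connection-handling threads goes behind a lock.

```csharp
// C#'s lock statement is built on a monitor, the same mutual-exclusion
// idea as a mutex; a counter shared by socket-handling threads is a
// typical example.
class ConnectionCounter
{
    private readonly object _gate = new object();
    private int _active;

    public void OnConnect()    { lock (_gate) { _active++; } }
    public void OnDisconnect() { lock (_gate) { _active--; } }
}
```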

Related

Distributing an Erlang Chat system

I just finished Erlang in Practice screencasts (code here), and have some questions about distribution.
Here's the overall architecture:
Here is what the supervision tree looks like:
Reading Distributed Applications leads me to believe that one of the primary motivations is for failover/takeover.
However, is it possible, for example, for the Message Router supervisor and its workers to be on one node, and the rest of the system on another, without many changes to the code?
Or should there be 3 different OTP applications?
Also, how can this system be made to scale horizontally? For example if I realize now that my system can handle 100 users, and that I've identified the Message Router as the main bottleneck, how can I 'just add another node' where now it can handle 200 users?
I've developed Erlang apps only during my studies, but generally we had many small processes doing only one thing and sending messages to other processes. And the beauty of Erlang is that it doesn't matter whether you send a message within the same Erlang VM, within the same computer, the same LAN, or over the Internet - the call and the pointer to the other process always look the same to the developer.
So you really want to have one application for every small part of the system.
That being said, it doesn't make it any simpler to construct an application which can scale out. A rule of thumb says that if you want an application to work on 10 times as many nodes, you need to rewrite it, since otherwise the messaging overhead would be too large. And obviously, when you go from 1 node to 2 you already need to consider it.
So if you have found the bottleneck - the application which is particularly slow when handling too many clients - you will want to run it a second time, and you need to have some additional load balancing in place before you start the second instance.
Let's assume the supervisor checks the message content for inappropriate material and is therefore slow. In this case the node everyone talks to would be a simple router application, which forwards messages to the different instances of the supervisor application in a round-robin manner. If 1 or 2 instances are not enough, you could write the router so that the number of instances can be changed by sending it control messages.
However, for this to work automatically, you would need another process monitoring the servers and discovering that they are overloaded or underutilized.
I know that dynamically adding and removing resources always sounds great when you hear about it, but as you can see it is a lot of work, and you need to have built both a messaging system which allows it and a monitoring system which can detect the need.
Hope this gives you some idea of how it could be done. Unfortunately it's been over a year since I wrote my last Erlang application, and I didn't want to provide code which might be wrong.
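For illustration only - and in C# rather than Erlang, since the thread's top question is .NET - the router idea above boils down to something like this; the worker representation is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Forwards each incoming message to one of N worker instances in turn.
// (A sketch: worker registration is assumed to finish before routing starts.)
class RoundRobinRouter<TMessage>
{
    private readonly List<Action<TMessage>> _workers = new List<Action<TMessage>>();
    private int _next = -1;

    // A runtime "control message" could simply call this to scale out.
    public void AddWorker(Action<TMessage> worker) => _workers.Add(worker);

    public void Route(TMessage message)
    {
        // Interlocked keeps the counter safe across concurrent callers;
        // masking keeps the index non-negative after wraparound.
        int i = Interlocked.Increment(ref _next) & int.MaxValue;
        _workers[i % _workers.Count](message);
    }
}
```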

Large number of WebSocket connections

I am writing an application that keeps track of content pushed around between users working on a certain task. I am thinking of using WebSockets to send down new content as it becomes available to all users who are currently using the app for that given task.
I am writing this on Rails, and the client-side app is on iOS (probably going to be on Android too). I'm afraid that this WebSocket solution might not scale well. I am after some advice and things to consider while making the decision to go with WebSockets vs. some kind of polling solution.
Would Ruby on Rails servers (like Heroku) support a large number of WebSockets open at the same time? Let's say a million connections, for argument's sake. Can anyone point me to material on this?
Would it cost a lot more on server hosting if I architect it this way?
Is it even possible to maintain millions of WebSockets simultaneously? I feel like this may not be the best design decision.
This is my first try at a proper Rails API. Any advice is greatly appreciated. Thx.
A million connections over WebSockets using Ruby - I can't see that being realistic unless you use clustering to spread the connections between different instances that handle all the data processing.
The problem here is serializing and deserializing data.
You also have to research how often you will need to pull data from the server to the client, and whether it is worth just making periodic checks using AJAX rather than holding a connection open the whole time, because if you hold a connection open and then don't use it, that is a waste of resources. WebSockets are built on top of the TCP layer, and the connections are not "cheap"; going through the OS and asking it again whether data is available is not a simple process either. With millions of connections it is close to impossible without using the most advanced technologies in the world.
I have heard that Erlang is able to handle millions of connections, but I don't have details on it. Also, the connection is one thing; processing data and interaction between connections is another. You might want to check this, because if you have heavy processing algorithms, then you definitely need to look into horizontal scaling options over clustering solutions.
If you are implementing chat, use WebSockets.
If you are implementing one-way messages in realtime, use server-sent events.
If you are implementing one-way messages sent every few hours or so, use APNS.
The saying goes: phone in hand, use WebSockets / server-sent events; phone in pocket, use APNS.
APNS will alleviate wifi dips, TCP/IP socket hangs, and many other issues. Really useful. There is the chance that it may take a little time to get through - but then again, there is the chance that WebSockets will take time too.
Recent versions of iOS let you send an APNS push to the client without a popup message, so the app can ask the server for more information. That, along with some backgrounding implementations, really improves things.
If possible, do not implement totally anonymous clients. It is very tricky to detect whether a client has reinstalled the app, so you may end up sending duplicates to the client. You need to take that into account.
APNS looks trivial to implement in Ruby, but I'd suggest resisting the urge and using an existing gem/service that supports both Google and Apple. It is much trickier to implement than it may seem at first.
If you decide to stick with WebSockets, it may make sense to leverage WebSockets in nginx, e.g. https://github.com/wandenberg/nginx-push-stream-module
ASIDE:
Using SMS where speed is critical is very expensive. Each phone number costs $1/month and sends at a maximum rate of 1 message per second, so a capacity of 100 messages per second costs $100/month plus per-message fees. Note that 50 numbers ($50/month) give you 50 messages per second - but then sending 1,000 messages takes 20 seconds.
Good luck

What is the performance overhead of Apache ActiveMQ vs. raw sockets?

We're looking to implement ActiveMQ to handle messaging between two of our servers, over a geographically diverse environment (Australia to the UK and back, via the internet).
I've been looking for some vague indicators of performance round the net but so far have had no luck.
My question: compared to a DIY TCP/SSL implementation of basic messaging, how would ActiveMQ perform? Similar systems of our own can send and receive messages across Australia in 100-150ms, over an SSL layer with an already-established connection.
Also, does ActiveMQ persist its TLS/SSL connections, thus saving a substantial amount of time that would already be used in connection creation/teardown?
What I am hoping is that it will at least perform better than HTTPS, at a per-request level.
I am aware that performance can vary remarkably, depending on hardware, networks, code and so on. I'm just after something to start with.
I know the above is a little fuzzy - if you need any clarification please let me know and I will only be too happy to oblige.
Thank you.
What Tim means is that this is not an apples-to-apples comparison. If you are solely concerned with the performance of a single point-to-point connection to transfer data, a direct link will give you a good result (although DIY is still a dubious design decision). If you are building a system that requires the transfer of data and has more complex functional requirements, then a broker-based messaging platform like ActiveMQ comes into play.
You should consider broker-based messaging if you want:
- a post-office style system, where a producer sends a message and knows that it will be consumed at some point, even if there is no consumer there at that time
- to not care where the consumer of a message is, or how many of them there are
- a guarantee that a message will be consumed, even if the consumer that first handles it dies midway through the process (transactions, redelivery)
- many consumers, with a guarantee that a message will only be consumed once (queues)
- many consumers that will each react to a single message (topics)
These patterns are pretty standard, and apply to all off-the-shelf messaging products. As a general rule, DIY in this domain is a bad idea, as messaging is complex (see http://www.ohloh.net/p/activemq/estimated_cost for an estimate of how long it would take you to do the same) and there are many existing implementations of various flavours (some without a broker) that are all well used, commercially supported, and don't require you to maintain them. I would think very hard before going down to the TCP level for any sort of data transfer, as there is so much prior art.
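To give a feel for the client side, here is a minimal ActiveMQ producer using the Apache.NMS .NET client (C#, matching the thread's top question); the broker URI and queue name are illustrative:

```csharp
using Apache.NMS;                // NuGet: Apache.NMS
using Apache.NMS.ActiveMQ;       // NuGet: Apache.NMS.ActiveMQ

class QueueProducer
{
    static void Main()
    {
        IConnectionFactory factory = new ConnectionFactory("tcp://localhost:61616");
        using (IConnection connection = factory.CreateConnection())
        using (ISession session = connection.CreateSession(AcknowledgementMode.AutoAcknowledge))
        {
            connection.Start();
            IDestination queue = session.GetQueue("orders");
            using (IMessageProducer producer = session.CreateProducer(queue))
            {
                // Persistent delivery: the broker stores the message until
                // a consumer takes it, even if no consumer is online now.
                producer.DeliveryMode = MsgDeliveryMode.Persistent;
                producer.Send(session.CreateTextMessage("order #42 shipped"));
            }
        }
    }
}
```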

Websocket scalability, broadcasting concerns

If you have a complex requirement set with many users (and servers), how will your WebSocket infrastructure (server[s]) scale, especially with broadcasting?
Of course, broadcasting is not part of any WebSocket spec, but it's there even in basic chat examples (a.k.a. hello world for WebSockets).
A client-side solution (asking for new data) still seems more scalable than a server-side (broadcasting) solution, even given WebSockets' low latency and relatively cheap (HTTP-headerless) nature.
Edit:
OK, just imagine that you want to replace all your AJAX code with WebSocket implementations, which may mean a great many connections across many different contexts. This adds enormous complexity to your system if you want to keep track of every possible scenario for broadcasting.
Low-level (network/thread, etc.) implementation suggestions are also part of the problem, not the solution, because they mean you have to code a special server, unlike general HTTP servers.
Moreover, broadcasting brings some sort of stateful nature to the table, which can't easily scale. Think about adding more servers and load balancing.
Scaling realtime web solutions can be a complex problem, but it is one that services like Pusher (who I work for) have solved, and one for which there are definitely well-defined solutions for self-hosted realtime web stacks - the PubSub paradigm is well understood and has been solved many times. In order to solve the problem there needs to be some state (who is subscribing to what), and this paradigm is used for broadcasting in the types of scenarios that you are talking about.
Realtime web technologies have been built with large numbers of simultaneous connections in mind - many from the ground up. If you wanted to create a scalable solution you would most likely use an existing realtime web server that supports WebSockets; in the same way that it's highly unlikely you would decide to implement your own HTTP server, you are unlikely to want to implement your own server supporting WebSockets from scratch.
Dedicated realtime web servers also let you separate your application logic from your realtime communication mechanism (separation of concerns). Your application might need to maintain some state, but the realtime technology deals with managing subscriptions and connections. How communication between the application and the realtime web technology is achieved is up to you, but frequently message queues are used, and Redis specifically is very popular in this space.
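As a sketch of that wiring, here is Redis pub/sub from C# with StackExchange.Redis; the channel name and payload are illustrative:

```csharp
using System;
using StackExchange.Redis;   // NuGet: StackExchange.Redis

class PubSubSketch
{
    static void Main()
    {
        var redis = ConnectionMultiplexer.Connect("localhost");
        ISubscriber sub = redis.GetSubscriber();
        var channel = RedisChannel.Literal("task:42:updates");

        // Realtime tier: on each published event, push to connected sockets.
        sub.Subscribe(channel, (ch, message) =>
            Console.WriteLine($"push to subscribed WebSocket clients: {message}"));

        // Application tier: publish whenever new data is available.
        sub.Publish(channel, "new content for task 42");

        Console.ReadLine();   // keep the process alive so the handler can fire
    }
}
```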
HTTP polling may conceptually be easier to understand - you can maintain statelessness, and with each HTTP poll request you specify exactly what you are looking for. But it most definitely means that you will need to start scaling much sooner (adding more resources to handle the load).
WebSocket polling is something I've not considered before, and I don't think I've seen it suggested anywhere either; the idea that the client should say "I'm ready for my next set of data and here's what I want" is an interesting one. WebSockets have generally taken a leap away from the request/response paradigm, but there may be scenarios where the increased efficiency of WebSockets combined with request/response has benefits. The SocketStream application framework might be worth a look as it might be relevant; after the initial application load, all communication is performed over WebSockets, which means that even basic request/response functionality uses them.
However, since we are talking about broadcasting data, we need to go back to the PubSub paradigm, where it makes much more sense to have active subscriptions and, when new data is available, to push that data out to those active subscriptions. All your application needs to know is whether there are any active subscriptions, in order to decide whether to publish the data or not. That problem has been solved.
The idea of websockets is that you keep a persistent connection with each client. When there is new data that you want to send to every client, you already know who all the clients are so you should just send it.
It sounds like you want each client to constantly send requests to the server for new data. Why? It seems like that would waste everyone's bandwidth, and I don't know why you think it would be more scalable. Maybe you could add more detail to your question, like what kind of information you are broadcasting, how often, how many bytes, how many clients, etc.
Why not just consider an open websocket connection to be like a standing request from the client for more data?

Sharing data system wide

Good evening.
I'm looking for a method to share data from my application system-wide, so that other applications could read that data and then do whatever they want with it (e.g. format it for display, use it for logging, etc). The data needs to be updated dynamically in the method itself.
WMI came to mind first, but then you've got the issue of applications pausing while reading from WMI. Additionally, I've no real idea how to set up my own namespace or classes, if that's even possible in Delphi.
Using files is another idea, but that could get disk-heavy, and it's a really awful method for realtime data.
Using a driver would probably be the best option, but that's a little too intrusive on the user's end for my liking, and I've no idea where to even start with it.
WM_COPYDATA would be great, but I'm not sure whether it's dynamic enough, and whether it'll be heavy on resources or not.
Using TCP/IP would be the best choice for over the network, but it is obviously of little use when running on a single system with no networking requirement.
As you can see, I'm struggling to figure out where to go with this. I don't want to commit to one method only to find that it's not going to work out in the end. Essentially, I want something like a service or background process to record data and then allow other applications to read that data; I'm just unsure about the methods. I'd prefer NOT to need elevation/UAC to do this, but if need be, I'll settle for it.
I'm using Delphi 2010 for this exercise.
Any ideas?
You want to create some client-server architecture, which is also called IPC (inter-process communication).
Using WM_COPYDATA is a very good idea. I found it to be very fast, lightweight, and efficient on a local machine. And it can be broadcast across the system, to all applications at once (to be used with care if some application does not handle it correctly).
You can also share memory, using memory-mapped files. This is perhaps the fastest IPC option around for huge amounts of data, but synchronization is a bit complex (if you want to share more than one buffer at once).
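The asker is on Delphi 2010 (where you would call CreateFileMapping/MapViewOfFile directly), but the idea is compact to show in C#, the language of the thread's top question; the map and mutex names are illustrative:

```csharp
using System.IO.MemoryMappedFiles;
using System.Threading;

class SharedValue
{
    static void Main()
    {
        // Every participating process opens the same named mapping and mutex.
        using var map   = MemoryMappedFile.CreateOrOpen(@"Local\MyAppSharedData", 4096);
        using var view  = map.CreateViewAccessor();
        using var mutex = new Mutex(initiallyOwned: false, @"Local\MyAppSharedMutex");

        mutex.WaitOne();   // the cross-process synchronization mentioned above
        try
        {
            int value = view.ReadInt32(0);
            view.Write(0, value + 1);
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}
```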
Named pipes are a good candidate for local use. They tend to be difficult to implement/configure over a network, due to security issues on modern Windows versions (and they use TCP/IP for network communication anyway - so you'd be better off using TCP/IP directly instead).
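The local named-pipe variant, again sketched in C# (the Delphi equivalents are CreateNamedPipe/ConnectNamedPipe); the pipe name is illustrative:

```csharp
using System.IO;
using System.IO.Pipes;

class PipeServer
{
    static void Main()
    {
        using var server = new NamedPipeServerStream("MyAppDataPipe");
        server.WaitForConnection();
        using var writer = new StreamWriter(server) { AutoFlush = true };
        writer.WriteLine("{\"shipments\":113}");   // push current data to the reader
    }
}

// A reading client connects with:
//   using var client = new NamedPipeClientStream(".", "MyAppDataPipe", PipeDirection.In);
//   client.Connect();
//   string line = new StreamReader(client).ReadLine();
```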
My personal advice is that you should implement your data sharing behind abstract classes able to provide several implementations. You may use WM_COPYDATA first, then switch to named pipes, TCP/IP, or HTTP in order to spread your application over a network.
For our Open Source client-server ORM, we implemented several protocols, including WM_COPYDATA, named pipes, HTTP, and direct in-process access. You can take a look at the source code provided for implementation patterns. Here are some benchmarks, to give you data from real implementations:
Client server access:
- Http client keep alive: 3001 assertions passed
first in 7.87ms, done in 153.37ms i.e. 6520/s, average 153us
- Http client multi connect: 3001 assertions passed
first in 151us, done in 305.98ms i.e. 3268/s, average 305us
- Named pipe access: 3003 assertions passed
first in 78.67ms, done in 187.15ms i.e. 5343/s, average 187us
- Local window messages: 3002 assertions passed
first in 148us, done in 112.90ms i.e. 8857/s, average 112us
- Direct in process access: 3001 assertions passed
first in 44us, done in 41.69ms i.e. 23981/s, average 41us
Total failed: 0 / 15014 - Client server access PASSED
As you can see, the fastest is direct access, then WM_COPYDATA, then named pipes, then HTTP (i.e. TCP/IP). The message was around 5 KB of JSON data containing 113 rows, retrieved from the server, then parsed on the client, 100 times (yes, our framework is fast :) ). For huge blocks of data (like 4 MB), WM_COPYDATA is slower than named pipes or HTTP over TCP/IP.
There are several IPC (inter-process communication) methods in Windows. Your question is rather general, so I can suggest memory-mapped files to store your shared data, plus message broadcasting via PostMessage to inform other applications that the shared data has changed.
If you don't mind running another process, you could use one of the NoSQL databases.
I'm pretty sure that a lot of them won't have Delphi drivers, but some of them have REST interfaces and hence can be driven from pretty much anything.
Memcached is an easy way to share data between applications. Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects).
A Delphi 2010 client for Memcached can be found on google code:
http://code.google.com/p/delphimemcache/
related question:
Are there any Caching Frameworks for Delphi?
Googling for 'delphi interprocess communication' will give you lots of pointers.
I suggest you take a look at http://madshi.net/, especially MadCodeHook (http://help.madshi.net/madCodeHook.htm)
I have good experience with the product.
