Scalability of LongPolling in .Net :Grand Finale - asp.net-mvc

I did my research on how to implement comet like chat on asp.net / MVC.
what i found was it can be done by Long Polling..
about long polling , because it keeps the threads open so many concurrent connections will be made making its porformance poor (or flat zero), cuz IIS aint meant for many concurrent connections
Now the Tools For Business :Pokein, SignalR, SocketIO, Now.Js (Skipping paid tools, Free is pretty :) )
as far as i know all these use long polling ,then what do they actually du to improve performance in IIS (All these can be used with asp.net)..
I also found Facebook uses Erlang (Dunno how to use it) to make it happen & ofcourse $100 million worth of hardware(balancing 70 million user). and FB uses long polling not some comet server( as far as my research goes).
I want to implement scalable long polling on asp.net MVC 3
the two finalsit i found are Here and here
All i want to know which is better and why..
and also which tool is best among the given ones

My opinion would be that SignalR would be the better choice, if not only because if you use SignalR.WebSockets, it will automatically upgrade the connection to web sockets if the user's browser supports it. This way, over time, as users begin to upgrade the browsers and away from the long-polling scheme, the scalability of your chat application will actually get better.
Moreover, there is an awesome code example called JabbR, created by the very people that created SignalR. (who also happen to be developers on the ASP .NET team)
http://jabbr.net/ - an example of SignalR in action.
https://github.com/davidfowl/JabbR - JabbR source.

Though your marked the answer, I am tempted to give this answer as I'd been through this and for long time as well.
I have used the solutions from two big COMET players. One is websync and the other is PokeIn. Web sync was good but expensive. I had lots of problems with PokeIn in terms of successfully using it. I actually did not use this for Chat server but for a push live update where some external program sends/pushes the updates to the subscribed clients.
I suggest you try using IHttpAsyncHandler based logic. This is again a long-polling sort of technique, but the client returns after sending the request and the server can send the response asynchronously.
Sorry for the self-publicity. I have a sample implementation of this in a project named flycomet in codeplex. This simply has a handler which receives requests and based on the type of request responds with the replies if any.
Currently the implementation is not given as a chat server but as a Windows Console Push Client Application and the subscribers can be from asp .net or MVC or silverlight. The advantage is you can tune the code to scale for yourself.
If you want to modify this as a chat application, it is quite easy to push the data through jQuery.

Related

Websocket scalability, broadcasting concerns

If you have a complex requirement set with many users(&servers) how will your websocket infrastructure (server[s]) will scale, especially with broadcasting?
Of course, broadcasting is not part of the any websocket spec but it's there even in basic chat examples (a.k.a. hello world for websocket).
Client side (asking for new data) solution still seems more scalable than server side (broadcasting) solution with websockets' low latency and relatively cheap (http headerless) nature.
Edit:
OK, just think that you want to replace all your ajax code with websocket implementations which may mean that so many connections within so many different contexts. This adds enormous complexity to your system if you want to keep track of every possible scenario for broadcasting.
Low (network/thread etc) level implementation suggestions are also part of the problem not the solution, because this means you have to code a special server unlike general http servers.
Moreover, broadcasting brings some sort of stateful nature to the table which can't easily scale. Think about adding more servers and load balancing.
Scaling realtime web solutions can be a complex problem but one that services like Pusher (who I work for) have solved, and one that there are most definitely solutions defined for self hosted realtime web solutions - the PubSub paradigm is well understood and has been solved many times and in order to solve the problem there needs to be some state (who is subscribing to what). This paradigm is used in broadcasting the the types of scenarios that you are talking about.
Realtime web technologies have been built with large amounts of simultaneous connections in mind - many from the ground up. If you wanted to create a scalable solution you would most likely use an existing realtime web server that supports WebSockets, in the same way that it's highly unlikely that you would decide to implement your own HTTP Server you are unlikely to want to implement your own server which supports WebSockets from scratch.
Dedicated Realtime web servers also let you separate your application logic from your realtime communication mechanism (separation of concerns). Your application might need to maintain some state but the realtime technology deals with managing subscriptions and connections. How communication between the application and the realtime web technology is achieved is up to you but frequently messages queues are used and specifically redis is very popular in this space.
HTTP polling may conceptually be easier to understand - you can maintain statelessness and with each HTTP poll request you specify exactly what you are looking for. But it most definitely means that you will need to start scaling much sooner (adding more resource to handle the load).
WebSocket polling is something I've not considered before and I don't think I've seen it suggested anywhere before either; the idea that the client should say "I'm ready for my next set of data and here's what I want" is an interesting one. WebSockets have generally taken a leap away from the request/response paradigm but there may be scenarios where the increased efficiency of WebSockets and request/response using them may have some benefits. The SocketStream application framework might be worth a look as it might be relevant; after the initial application load all communication is performed over WebSockets which means that event basic request/response functionality uses WebSockets.
However, since we are talking about broadcasting data we need to go back to the PubSub paradigm where it makes much more sense to have active subscriptions and when new data is available that new data is distributed to those active subscriptions (pushed). All your application needs to know is if there are any active subscriptions or not in order to decide whether to publish the data or not. That problem has been solved.
The idea of websockets is that you keep a persistent connection with each client. When there is new data that you want to send to every client, you already know who all the clients are so you should just send it.
It sound like you want each client to constantly be sending requests to the server for new data. Why? It seems like that would waste everyone's bandwidth and I don't know why you think it will be more scalable. Maybe you could add more detail to your question like what kind of information you are broadcasting, how often, how many bytes, how many clients, etc.
Why not just consider an open websocket connection to be like a standing request from the client for more data?

Erlang web-distribution

On non-web based chat system the server distinguishes its clients by their PIDs, right? And what should be used to distinguish the clients on web-based chat system?
Thnx in advance
The fact that you're using a web server shouldn't change much about your model. You're still building chat. You also don't want to make your chats tied too deeply to the process that is managing their HTTP connection. HTTP connections are ephemeral, even if everything is going well and you're using long polling there's no guarantee that the connection will be re-used with Keep-Alive for the next long poll. The user might also want to open up the same chat in multiple browser windows, multiple computers, whatever.
I haven't looked closely at any of these but you're not the first person that has built web chat with Erlang:
http://chrismoos.com/2009/09/28/building-an-erlang-chat-server-with-comet-part-1/
http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf
http://yoan.dosimple.ch/blog/2008/05/15/
https://github.com/yrashk/socket.io-erlang (more of a general tool for this sort of thing, not chat specifically)
https://github.com/rvirding/chat_demo (as seen above)
I think the confusion comes from the notion that a Erlang server process must stay alive for every individual client. It can, but Mochiweb doesn't do that by default if I'm not mistaken. It just spawns a new process for every request. If you would like to have a long lived bidirectional client <-> server process connection you can do that for example by;
sending a client identifier with every request and map that to a long-lived process on the server. The process will maintain servers state and you can call methods on it. It's still pull and not push though.
use the web socket implementations. Not sure if Mochiweb has one, but other Erlang HTTP servers like Misultin and Yaws provide one. For a web based chat system I believe web sockets would be a great fit.
For a very trivial example of a web-based chat system using websockets and Misultin you can check out this chat demo. It was written to demonstrate an idea and is not very elegant, but it does work.

Real-time ASP.NET MVC Web Application

I need to add a "real-time" element to my web application. Basically, I need to detect "changes" which are stored in a SQL Server table, and update various parts of the UI when a change has occured.
I'm currently doing this by polling. I send an ajax request to the server every 3 seconds asking for any new changes - these are then returned and processed. It works, but I don't like it - it means that for each browser I'll be issuing these requests frequently, and the server will always be busy processing them. In short, it doesn't scale well.
Is there any clever alternative that avoids polling overhead?
Edit
In the interests of completeness, I'm updating this to mention the solution we eventually went with - SignalR. It's OS and comes from Microsoft. It's risen in popularity, and I can heartily recommend this, or indeed WebSync which we also looked at.
Check out WebSync, a comet server designed for ASP.NET/IIS.
In particular, what I would do is use the SQL Dependency class, and when you detect a change, use RequestHandler.Publish("/channel", data); to send out the info to the appropriate listening clients.
Should work pretty nicely.
taken directly from the link refernced by Jakub (i.e.):
Reverse AJAX with IIS/ASP.NET
PokeIn on codeplex gives you an enhanced JSON functionality to make your server side objects available in client side. Simply, it is a Reverse Ajax library which makes it easy to call JavaScript functions from C#/VB.NET and to call C#/VB.NET functions from JavaScript. It has numerous features like event ordering, resource management, exception handling, marshaling, Ajax upload control, mono compatibility, WCF & .NET Remoting integration and scalable server push.
There is a free community license option for this library and the licensing option is quite cost effective in comparison to others.
I've actually used this and the community edition is pretty special. well worth a look as this type of tech will begin to dominate the landscape in the coming months/years. the codeplex site comes complete with asp.net mvc samples.
No matter what: you will always be limited to the fact that HTTP is (mostly) a one-way street. Unless you implement some sensible code on the client (ie. to listen to incoming network requests) anything else will involve polling the server for updates, no-matter what others will tell you.
We had a similar requirement: to have very fast response time in one of our real-time web applications, serving about 400 - 500 clients per web server. Server would need to notify the clients almost within 0.1 of a second (telephony & VoIP).
In the end we implemented an Async Handler. On each polling request we put the request to sleep for 5 seconds, waiting for a semaphore pulse signal to respond to the client. If the 5 seconds are up, we respond with a "no event" and the client will post the request again (immediately). This resulted in very fast response times, and we never had any problems with up to 500 clients per machine.. no idea how many more we could add before the polling requests might create a problem.
take a look at this article
I've read somewhere (didn't remember where) that using this WCF feature make the host process handle requests in a way that didn't consume blocked threads.
Depending on the restrictions on you application you can use Silverlight to do this connection. You don't need to have any UI for Silverlight, but you can use Sockets have a connection that accepts server side pushes of data.

Is it possible to build a web-based chat client without a socket-based framework?

I have heard that web-based chat clients tend to use networking frameworks such as the twisted framework.
But would it be possible to build a web-based chat client without a networking framework - using only ajax connections?
I would like to build a session-based one-to-one web chat client that uses sessions to indicate when a chat has ended. Would this be possible in Rails using only ajax and without a networking framework?
What effect does it have to use a networking framework and what impact would it have on my app to not use one? Also any general recommendations for approaching this project would be appreciated.
If i understand you correctly, you want to have to clients connect to you server and send messaged to each other to each other through ajax, via the server.
This is possible, there are two approaches to do this.
The easy approach is to have both client poll every few seconds to check for new messages posted by the other. Drawback is that the messages are not instantly delivered. I think this is an example found in the rails book.
The more complex approach is to keep an open connection and sent the messages to the client as soon as they are received by the server. To do this you can use something like Juggernaut
I would like to add that though the latter works, it is not something http was meant for and it a bit of hack, but hey, whatever gets the job done. A working example of this is the rails chat project which uses a juggernaut derivative.
Technically speaking every network based application has a networking framework under it and, therefore, is socket based...
The only real question here is whether you want to have all that chatter go through your server or allow point to point communication. If the former, you can use the ajax framework to talk to your web server. This means that all of your clients will be constantly polling the web server for updates.
If the later, then you have to allow direct tcp connections between the two clients and need to get a little closer to the metal so to speak.
So, ask yourself this: Do you want to pay for the traffic costs AND have potential liability over divulging whatever it is that people might be typing into their client; or, would you rather just build a chat program that people can use to talk to each other?
Of course, before even going that far, do you really want to build yet another chat client? That space is already pretty crowded.

How should I monitor potential threats to my site?

By looking at our DB's error log, we found that there was a constant stream of almost successful SQL injection attacks. Some quick coding avoided that, but how could I have setup a monitor for both the DB and Web server (including POST requests) to check for this? By this I mean if there are off the shelf tools for script-kiddies, are there off the shelf tools that will alert you to their sudden random interest in your site?
Funnily enough, Scott Hanselman had a post on UrlScan today which is one thing you could do to help monitor and minimize potential threats. It's a pretty interesting read.
UrlScan does seem like a nice option for iis6 and 7; I also found: dotDefender for pay which also covers Apache or IIS 5-7, and I had found an SQL Injection sanitation ISAPI
It is also worth noting in light of a recent wide spread SQL Injection attempt that dissallowing your webapp's db user account from querying the system tables (in MS SQL Server it's sysobjects and syscolumns) is a good idea.
I think this thread warrants more free solutions for Apache and other web servers.
Unfortunately intrusion detection was not what I had in mind, so sgfree isn't exactly a web site attack monitor, unless I'm not understanding how it works.
If you could go back and modify your app code, I'd suggest getting log4j/log4net integrated into the application. From there you could write code that would check a form field or URL (say at the global.asax level for .NET apps) and make a log entry when malicious code is detected.
The nice thing about log4j/log4net is that you can configure an e-mail/pager/SMS type appender so as soon as the malicious attempt was caught, you would be notified.
I'm in the process of merging some log4net code into our CMS system we have and I'm looking to do just this in light of the influx of ASPRox attacks that have been coming our way.
Monitoring web and DB access logs should alert you to things like this, but if you want a more fully featured alert system I would suggest some kind of IDS/IPS. You'll need a spare machine though, and a switch that can do port mirroring.
If you have those then an IDS is a cheap way of monitoring your traffic for many intrusion attempts (there will be lots). Snort (www.snort.org) based IDSes are excellent, and there are some free fully packaged versions available. One I have used is StrataGuard (http://sgfree.stillsecure.com/), and it can be configured as an IDS (Intrusion Detection System) or as an IPS (Intrusion Prevention System). It's free to use if your traffic does not exceed 5Mbps.
If you do go with an IDS/IPS I'd advise you to let it run as a simple IDS for a month or so, before you allow it to prevent attacks.
This may be overkill, but if you have a spare machine lying around it can't hurt to have an IDS running passively.
You can set up your system to kick out some error message that then makes a JSON or http call to a system that will monitor, report (log) and send out any kind of alert such as SMS/email or a phone call.
Check out developer.alertcaster.com
Especially if you need to monitor multiple simultaneous events, which it sounds like you have going on, this might be a good fix.

Resources