Large number of WebSocket connections - ruby-on-rails

I am writing an application that keeps track of content pushed around between users of a certain task. I am thinking of using WebSockets to send down new content as they are available to all users who are currently using the app for that given task.
I am writing this on Rails and the client side app is on iOS (probably going to be in Android too). I'm afraid that this WebSocket solution might not scale well. I am after some advice and things to consider while making the decision to go with WebSockets vs. some kind of polling solution.
Would Ruby on Rails servers (like Heroku) support large number of WebSockets open at the same time? Let's say a million connections for argument sake. Any material anyone can provide me of such stuff?
Would it cost a lot more on server hosting if I architect it this way?
Is it even possible to maintain millions of WebSockets simultaneously? I feel like this may not be the best design decision.
This is my first try at a proper Rails API. Any advice is greatly appreciated. Thx.

Million connections over WebSockets, using Ruby, I can't see its real if you not using clustering to spread connections between different instances to handle all the data processing.
The problem here is serializing and deserializing data.
As well you have to research of how often you will need to pull data to client from server, and if it worth to have just periodical checks using AJAX, then handling connection for whole time. Because if you do handle connection and then you not using it - it is waste of resources. WebSockets are build on top of TCP layer, and all connections are not "cheap" as well going through for OS and asking them for data available again is not the simple process, with millions connections it is something really almost impossible without using most advanced technologies in the world.
I head that Erlang is able to handle millions of connections, but I don't have details over it. As well connection is one thing, another is processing data and interaction between connections - this you might want to check, because if you have heavy processing algorithms, then you definitely need to look into horizontal scaling options over clustering solutions.

If you are implementing chat, use websockets.
If you are implementing 1 way messages in realtime use server sent events.
If you are implementing 1 way messages sent every few hours or so, use APNS.
The saying goes phone in hand, use websockets / server sent events.
Phone in pocket, use APNS.
APNS will alleviate wifi dips, tcp/ip socket hangs and many other issues. Really useful. There is the chance that it may take a little time to get through. But then again, there is the chance that websockets will take
Recent versions of iOS let you send APNS to the client without a popup message to the client so it can ask the server for more information. That along with some backgrounding implementations really improves things.
If possible, do not implement totally anonymous clients. It is very tricky to detect if a client reinstalls the app. So you'll end up sending duplicates to the client. Need to take that into account.
APNS looks trivial to implement in ruby, but I'd suggest avoiding the urge and going to using an existing gem/service out there that supports both google and apple. It is much trickier to implement than it may seem at first.
If you decide to stick with websockets, it may make sense to just leverage websockets in nginx like https://github.com/wandenberg/nginx-push-stream-module
ASIDE:
Using SMS where speed is critical is very expensive. $1/month per phone number only sends a max rate of 1 message per second. So sending 100 messages per second = $100/month plus message fees. Do note that 100 messages at a rate of 50 messages/second = $50/month. But if you want to send 1k messages, that takes 20 seconds.
Good luck

Related

Differences between websockets and long polling for turn based game server

I am writing a server for an iOS game. The game is turn-based and the only time the server needs to push information to the client is to notify of the opponent's move.
I am curious if anyone could comment on the performance and ease of implementation differences between using WebSockets and long polling. Also, if I used WebSockets, should I only use it to receive information and send POST requests for everything else, or should all communication be through the WebSocket?
Additionally, is there anything extra to consider between WebSockets and long polling if I am interested in also making a web client?
For anyone else who may be wondering, it could depends on how long typical interactions go between events?
Websocket: Anything more than a few tens of seconds, I don't think keeping a websocket open is particularly efficient (not to mention that IIRC it would disconnect anyway if the app loses focus)
Long polling: This forces a trade-off between server load (anything new now? how about now? ...) and speediness of knowing a change has occurred.
Push notifications: While this may be technically more complex to implement, it really would be the best solution IMO, since:
the notification can be sent (and delivered) almost immediately after an event occurs
there is no standby server load (either from open websockets, or "how about now?" queries) - which is especially important as your use-base grows
you can override what happens if a notification comes in while the user is in-app
should I only use it to receive information and send POST requests for
everything else
Yes, you should use WebSockets to fetch real-time updates only, and REST APIs to do BREAD stuff.
should all communication be through the WebSocket?
Short answer: No,
Check this article from PieSocket for more information about the best use cases for WebSockets. What Is WebSocket: Introduction And Usage

What technology do apps use for real time syncing of data?

A lot of apps such as Uber, Lyft, and GroupMe seem like they have real-time data getting pushed down from the server. Obviously they could be faking it by refreshing every n seconds. Another thought was that they could be opening TCP sockets? Or potentially other technologies that I am unaware of.
If programming an iOS app what is the industry standard for syncing data between the client and the server in real time, without user interaction such as swiping up?
WebSockets or polling are the general solutions. Push Notifications can also be used to trigger a poll in some cases.
In addition to Neal's answer, check out the Rocket technique, which "leverages web standards like Server-Sent Events and JSON Patch."
AFRocketClient uses AFNetworking to support this on iOS/Mac, as long as the server supports these technologies.

What is the performance overhead of Apache ActiveMQ vs. raw sockets?

We're looking to implement ActiveMQ to handle messaging between two of our servers, over a geographically diverse environment (Australia to the UK and back, via the internet).
I've been looking for some vague indicators of performance round the net but so far have had no luck.
My question: compared to a DIY TCP/SSL implementation of basic messaging, how would ActiveMQ perform? Similar systems of our own can send and receive messages across Australia in 100-150ms, over a SSL layer with an already established connection.
Also, does ActiveMQ persist its TLS/SSL connections, thus saving a substantial amount of time that would already be used in connection creation/teardown?
What I am hoping is that it will at least perform better than HTTPS, at a per-request level.
I am aware that performance can vary remarkably, depending on hardware, networks, code and so on. I'm just after something to start with.
I know the above is a little fuzzy - if you need any clarification please let me know and I will only be too happy to oblige.
Thank you.
What Tim means is that this is not an apples to apples comparison. If you are solely concerned with the performance of a single point to point connection to transfer data, a direct link will give you a good result (although DIY is still a dubious design decision). If you are building a system that requires the transfer of data and you have more complex functional requirements, then a broker-based messaging platform like ActiveMQ will come into play.
You should consider broker-based messaging if you want:
a post-office style system where a producer sends a message, and knows that it will be consumed at some point, even if there is no consumer there at that time
to not care where the consumer of a message is, or how many of them there are
a guarantee that a message will be consumed, even if the consumer that first handle it dies mid-way through the process (transactions, redelivery)
many consumers, with a guarantee that a message will only be consumed once - queues
many consumers that will each react to a single message - topics
These patterns are pretty standard, and apply to all off the shelf messaging products. As a general rule, DIY in this domain is a bad idea, as messaging is complex (see http://www.ohloh.net/p/activemq/estimated_cost for an estimate of how long it would take you do do same); and has many existing implementations of various flavours (some without a broker) that are all well used, commercially supported and don't require you to maintain them. I would think very hard before going down to the TCP level for any sort of data transfer as there is so much prior art.

Websocket scalability, broadcasting concerns

If you have a complex requirement set with many users(&servers) how will your websocket infrastructure (server[s]) will scale, especially with broadcasting?
Of course, broadcasting is not part of the any websocket spec but it's there even in basic chat examples (a.k.a. hello world for websocket).
Client side (asking for new data) solution still seems more scalable than server side (broadcasting) solution with websockets' low latency and relatively cheap (http headerless) nature.
Edit:
OK, just think that you want to replace all your ajax code with websocket implementations which may mean that so many connections within so many different contexts. This adds enormous complexity to your system if you want to keep track of every possible scenario for broadcasting.
Low (network/thread etc) level implementation suggestions are also part of the problem not the solution, because this means you have to code a special server unlike general http servers.
Moreover, broadcasting brings some sort of stateful nature to the table which can't easily scale. Think about adding more servers and load balancing.
Scaling realtime web solutions can be a complex problem but one that services like Pusher (who I work for) have solved, and one that there are most definitely solutions defined for self hosted realtime web solutions - the PubSub paradigm is well understood and has been solved many times and in order to solve the problem there needs to be some state (who is subscribing to what). This paradigm is used in broadcasting the the types of scenarios that you are talking about.
Realtime web technologies have been built with large amounts of simultaneous connections in mind - many from the ground up. If you wanted to create a scalable solution you would most likely use an existing realtime web server that supports WebSockets, in the same way that it's highly unlikely that you would decide to implement your own HTTP Server you are unlikely to want to implement your own server which supports WebSockets from scratch.
Dedicated Realtime web servers also let you separate your application logic from your realtime communication mechanism (separation of concerns). Your application might need to maintain some state but the realtime technology deals with managing subscriptions and connections. How communication between the application and the realtime web technology is achieved is up to you but frequently messages queues are used and specifically redis is very popular in this space.
HTTP polling may conceptually be easier to understand - you can maintain statelessness and with each HTTP poll request you specify exactly what you are looking for. But it most definitely means that you will need to start scaling much sooner (adding more resource to handle the load).
WebSocket polling is something I've not considered before and I don't think I've seen it suggested anywhere before either; the idea that the client should say "I'm ready for my next set of data and here's what I want" is an interesting one. WebSockets have generally taken a leap away from the request/response paradigm but there may be scenarios where the increased efficiency of WebSockets and request/response using them may have some benefits. The SocketStream application framework might be worth a look as it might be relevant; after the initial application load all communication is performed over WebSockets which means that event basic request/response functionality uses WebSockets.
However, since we are talking about broadcasting data we need to go back to the PubSub paradigm where it makes much more sense to have active subscriptions and when new data is available that new data is distributed to those active subscriptions (pushed). All your application needs to know is if there are any active subscriptions or not in order to decide whether to publish the data or not. That problem has been solved.
The idea of websockets is that you keep a persistent connection with each client. When there is new data that you want to send to every client, you already know who all the clients are so you should just send it.
It sound like you want each client to constantly be sending requests to the server for new data. Why? It seems like that would waste everyone's bandwidth and I don't know why you think it will be more scalable. Maybe you could add more detail to your question like what kind of information you are broadcasting, how often, how many bytes, how many clients, etc.
Why not just consider an open websocket connection to be like a standing request from the client for more data?

Is it possible to build a web-based chat client without a socket-based framework?

I have heard that web-based chat clients tend to use networking frameworks such as the twisted framework.
But would it be possible to build a web-based chat client without a networking framework - using only ajax connections?
I would like to build a session-based one-to-one web chat client that uses sessions to indicate when a chat has ended. Would this be possible in Rails using only ajax and without a networking framework?
What effect does it have to use a networking framework and what impact would it have on my app to not use one? Also any general recommendations for approaching this project would be appreciated.
If i understand you correctly, you want to have to clients connect to you server and send messaged to each other to each other through ajax, via the server.
This is possible, there are two approaches to do this.
The easy approach is to have both client poll every few seconds to check for new messages posted by the other. Drawback is that the messages are not instantly delivered. I think this is an example found in the rails book.
The more complex approach is to keep an open connection and sent the messages to the client as soon as they are received by the server. To do this you can use something like Juggernaut
I would like to add that though the latter works, it is not something http was meant for and it a bit of hack, but hey, whatever gets the job done. A working example of this is the rails chat project which uses a juggernaut derivative.
Technically speaking every network based application has a networking framework under it and, therefore, is socket based...
The only real question here is whether you want to have all that chatter go through your server or allow point to point communication. If the former, you can use the ajax framework to talk to your web server. This means that all of your clients will be constantly polling the web server for updates.
If the later, then you have to allow direct tcp connections between the two clients and need to get a little closer to the metal so to speak.
So, ask yourself this: Do you want to pay for the traffic costs AND have potential liability over divulging whatever it is that people might be typing into their client; or, would you rather just build a chat program that people can use to talk to each other?
Of course, before even going that far, do you really want to build yet another chat client? That space is already pretty crowded.

Resources