Architecture of a simple chat web app at scale - firebase-realtime-database

I need to add a chat in my application to let users contact each others.
Requirements
only 1 to 1 communication customer 2 customer (no group or chat room)
essentially text, image upload is a bonus (probably as a second step)
message must be delivered in a reasonable delay (maybe ... 20 sec max)
max load: 3M chat msg / days,
Protocol / API
I only have way back memories from the university and the TCP sockets, a recent trial to gRPC & protocol buffers but none of these looks like a good fit.
Web Socket ?
Then, I've found some articles about the Web Socket protocol and an implementation in Go from the Gorilla team and the Web Socket API from MDN
HTTP/3 ?
WebTransport session, bidirectional stream
Caddy HTTP/3 server or an implementation of Web Transport from Marten Seemann based on quic-go
I also take a look at nsq but it looks like a Rube Goldberg machine in this context.
PubSub API
Google provides a PubSub API that can be connected to BigQuery and Dataflow
Persistence ... ?
Cassandra or MongoDB look like good options here...
Firebase Realtime database / Firestore? Advantage here is to abstract a lot of complexity, we just send data and subscribe to updates. Not sure about the cost efficiency vs the other solutions.
A complete solution has been given by minghsu0107 here: go-random-chat...I'm not skilled enough to have any thought on this architecture. The only thing I'm thinking about is that this solution is maintained by a single person ... which mean if I chose to use it, I must be able to understand every single piece of it. So if someone could just set me on the "right" path, or at least to move me away from the "wrong" ones before I spend weeks on these concepts, that'd be awesome :D

Given the background you've supplied, I think you'd be better off integrating an open-source, off-the-shelf messaging plugin, rather than trying to build your own.
Something like Rocket or Zulip might suit (and save you a lot of grief).

I am the author of go-random-chat. The main difference between 1-to-1 and group chat is that the message status in 1-to-1 chat is either seen or unseen, whereas in group chat it's a counter to record the number of users that have seen the message. Therefore, you can still refer to those Slack-like implementations besides go-random-chat to see how messaging works under the hood.

Related

Large number of WebSocket connections

I am writing an application that keeps track of content pushed around between users of a certain task. I am thinking of using WebSockets to send down new content as they are available to all users who are currently using the app for that given task.
I am writing this on Rails and the client side app is on iOS (probably going to be in Android too). I'm afraid that this WebSocket solution might not scale well. I am after some advice and things to consider while making the decision to go with WebSockets vs. some kind of polling solution.
Would Ruby on Rails servers (like Heroku) support large number of WebSockets open at the same time? Let's say a million connections for argument sake. Any material anyone can provide me of such stuff?
Would it cost a lot more on server hosting if I architect it this way?
Is it even possible to maintain millions of WebSockets simultaneously? I feel like this may not be the best design decision.
This is my first try at a proper Rails API. Any advice is greatly appreciated. Thx.
Million connections over WebSockets, using Ruby, I can't see its real if you not using clustering to spread connections between different instances to handle all the data processing.
The problem here is serializing and deserializing data.
As well you have to research of how often you will need to pull data to client from server, and if it worth to have just periodical checks using AJAX, then handling connection for whole time. Because if you do handle connection and then you not using it - it is waste of resources. WebSockets are build on top of TCP layer, and all connections are not "cheap" as well going through for OS and asking them for data available again is not the simple process, with millions connections it is something really almost impossible without using most advanced technologies in the world.
I head that Erlang is able to handle millions of connections, but I don't have details over it. As well connection is one thing, another is processing data and interaction between connections - this you might want to check, because if you have heavy processing algorithms, then you definitely need to look into horizontal scaling options over clustering solutions.
If you are implementing chat, use websockets.
If you are implementing 1 way messages in realtime use server sent events.
If you are implementing 1 way messages sent every few hours or so, use APNS.
The saying goes phone in hand, use websockets / server sent events.
Phone in pocket, use APNS.
APNS will alleviate wifi dips, tcp/ip socket hangs and many other issues. Really useful. There is the chance that it may take a little time to get through. But then again, there is the chance that websockets will take
Recent versions of iOS let you send APNS to the client without a popup message to the client so it can ask the server for more information. That along with some backgrounding implementations really improves things.
If possible, do not implement totally anonymous clients. It is very tricky to detect if a client reinstalls the app. So you'll end up sending duplicates to the client. Need to take that into account.
APNS looks trivial to implement in ruby, but I'd suggest avoiding the urge and going to using an existing gem/service out there that supports both google and apple. It is much trickier to implement than it may seem at first.
If you decide to stick with websockets, it may make sense to just leverage websockets in nginx like https://github.com/wandenberg/nginx-push-stream-module
ASIDE:
Using SMS where speed is critical is very expensive. $1/month per phone number only sends a max rate of 1 message per second. So sending 100 messages per second = $100/month plus message fees. Do note that 100 messages at a rate of 50 messages/second = $50/month. But if you want to send 1k messages, that takes 20 seconds.
Good luck

What is the performance overhead of Apache ActiveMQ vs. raw sockets?

We're looking to implement ActiveMQ to handle messaging between two of our servers, over a geographically diverse environment (Australia to the UK and back, via the internet).
I've been looking for some vague indicators of performance round the net but so far have had no luck.
My question: compared to a DIY TCP/SSL implementation of basic messaging, how would ActiveMQ perform? Similar systems of our own can send and receive messages across Australia in 100-150ms, over a SSL layer with an already established connection.
Also, does ActiveMQ persist its TLS/SSL connections, thus saving a substantial amount of time that would already be used in connection creation/teardown?
What I am hoping is that it will at least perform better than HTTPS, at a per-request level.
I am aware that performance can vary remarkably, depending on hardware, networks, code and so on. I'm just after something to start with.
I know the above is a little fuzzy - if you need any clarification please let me know and I will only be too happy to oblige.
Thank you.
What Tim means is that this is not an apples to apples comparison. If you are solely concerned with the performance of a single point to point connection to transfer data, a direct link will give you a good result (although DIY is still a dubious design decision). If you are building a system that requires the transfer of data and you have more complex functional requirements, then a broker-based messaging platform like ActiveMQ will come into play.
You should consider broker-based messaging if you want:
a post-office style system where a producer sends a message, and knows that it will be consumed at some point, even if there is no consumer there at that time
to not care where the consumer of a message is, or how many of them there are
a guarantee that a message will be consumed, even if the consumer that first handle it dies mid-way through the process (transactions, redelivery)
many consumers, with a guarantee that a message will only be consumed once - queues
many consumers that will each react to a single message - topics
These patterns are pretty standard, and apply to all off the shelf messaging products. As a general rule, DIY in this domain is a bad idea, as messaging is complex (see http://www.ohloh.net/p/activemq/estimated_cost for an estimate of how long it would take you do do same); and has many existing implementations of various flavours (some without a broker) that are all well used, commercially supported and don't require you to maintain them. I would think very hard before going down to the TCP level for any sort of data transfer as there is so much prior art.

Websocket scalability, broadcasting concerns

If you have a complex requirement set with many users(&servers) how will your websocket infrastructure (server[s]) will scale, especially with broadcasting?
Of course, broadcasting is not part of the any websocket spec but it's there even in basic chat examples (a.k.a. hello world for websocket).
Client side (asking for new data) solution still seems more scalable than server side (broadcasting) solution with websockets' low latency and relatively cheap (http headerless) nature.
Edit:
OK, just think that you want to replace all your ajax code with websocket implementations which may mean that so many connections within so many different contexts. This adds enormous complexity to your system if you want to keep track of every possible scenario for broadcasting.
Low (network/thread etc) level implementation suggestions are also part of the problem not the solution, because this means you have to code a special server unlike general http servers.
Moreover, broadcasting brings some sort of stateful nature to the table which can't easily scale. Think about adding more servers and load balancing.
Scaling realtime web solutions can be a complex problem but one that services like Pusher (who I work for) have solved, and one that there are most definitely solutions defined for self hosted realtime web solutions - the PubSub paradigm is well understood and has been solved many times and in order to solve the problem there needs to be some state (who is subscribing to what). This paradigm is used in broadcasting the the types of scenarios that you are talking about.
Realtime web technologies have been built with large amounts of simultaneous connections in mind - many from the ground up. If you wanted to create a scalable solution you would most likely use an existing realtime web server that supports WebSockets, in the same way that it's highly unlikely that you would decide to implement your own HTTP Server you are unlikely to want to implement your own server which supports WebSockets from scratch.
Dedicated Realtime web servers also let you separate your application logic from your realtime communication mechanism (separation of concerns). Your application might need to maintain some state but the realtime technology deals with managing subscriptions and connections. How communication between the application and the realtime web technology is achieved is up to you but frequently messages queues are used and specifically redis is very popular in this space.
HTTP polling may conceptually be easier to understand - you can maintain statelessness and with each HTTP poll request you specify exactly what you are looking for. But it most definitely means that you will need to start scaling much sooner (adding more resource to handle the load).
WebSocket polling is something I've not considered before and I don't think I've seen it suggested anywhere before either; the idea that the client should say "I'm ready for my next set of data and here's what I want" is an interesting one. WebSockets have generally taken a leap away from the request/response paradigm but there may be scenarios where the increased efficiency of WebSockets and request/response using them may have some benefits. The SocketStream application framework might be worth a look as it might be relevant; after the initial application load all communication is performed over WebSockets which means that event basic request/response functionality uses WebSockets.
However, since we are talking about broadcasting data we need to go back to the PubSub paradigm where it makes much more sense to have active subscriptions and when new data is available that new data is distributed to those active subscriptions (pushed). All your application needs to know is if there are any active subscriptions or not in order to decide whether to publish the data or not. That problem has been solved.
The idea of websockets is that you keep a persistent connection with each client. When there is new data that you want to send to every client, you already know who all the clients are so you should just send it.
It sound like you want each client to constantly be sending requests to the server for new data. Why? It seems like that would waste everyone's bandwidth and I don't know why you think it will be more scalable. Maybe you could add more detail to your question like what kind of information you are broadcasting, how often, how many bytes, how many clients, etc.
Why not just consider an open websocket connection to be like a standing request from the client for more data?

Scalable Pub/Sub engine for realtime

I'm looking for a pub/sub engine, with the following requirements:
Very low latency < 0.5 sec
Scalable
Shardable (based on geo localisation)
I'd like to be able to have multiple pub/sub servers and be able to publish or subscribe to channels from any server, no matter on which server the channel is declared.
For example:
If user A is connected to server SRV1 and user B connected to server SRV2, If user B subscribe to "MyChannel" and user A publish something on channel "MyChannel", user B will get the message even if he's not connected to the same server.
I don't know if Redis is able to do that. I didn't find anything about the subject.
We've been using ZeroMQ and it's pub/sub features for while now and we're very happy with what we're seeing.
It's also worth looking at what's coming up in the next version (reducing network bandwidth by pushing subscription requests upstream)
I suggest you look at Data Distribution Service for Real Time Systems (DDS) standard. It's specifically designed to be a scalable pub/sub middleware both for real-time and non-real-time systems.
It has a few mature implementations all of which has it's own strength points, but generally the implementations are scalable and low latency.
These are the implementations I would suggest you look at (If you need them to work on a WAN environment, I guess the first two ones have great support for that):
RTI DDS
OpenSplice DDS
OpenDDS
CoreDX DDS
It seems you are looking for some sort of messaging. Try RabbitMQ, with shovel or federation plugins
nanomsg is a successor to ZeroMQ, written by the same author, and with many language-bindings.
It is written in C and uses zero-copy mechanisms. If you're looking for exceptional latency, and are willing to get your hands dirty ( which you should if you aim for something extreme ), I would recommend one of those two.
If you're looking for exceptional throughput, go with Kafka.
Note, none of those solutions implement geolocation out of the box, Redis 3.2 will have something : http://antirez.com/news/89
If you are looking for an easy, "good enough" solution, I'd go with Redis ( be sure to read this blog post from aphyr first ).

How to communicate within this system?

We intend to design a system with three "tiers".
HQ, with a single server
lots of "nodes" on a regional basis
users, with iPads.
HQ communicates 2-way with the nodes which communciate 2-way with the users. Users never communicate with HQ nor vice-versa.
The powers that be decree a Windows app from HQ (using Delphi) and a native desktop app for the users' iPads. They have no opinion on the nodes.
If there are compelling technical arguments, I might be able to beat them down from "decree" to "prefer" on the Windows program (and, for isntance, make it browser based). The nodes have no GUI, they just sit there playing middle-man.
What's the best way for these things to communicate (SOAP/HTTP/AJAX/jQuery/home-brewed-protocol-on-top-of-TCP/something-else?) Is it best to use the same protocol end to end, or different protocols for hq<-->node and node<-->iPad?
Both ends of each of those two interfaces might wish to initiate a transaction (which I can easily do if I roll my own protocol), so should I use push/pull/long-poll or what?
I hope that this description makes sense. Please ask questions if it does not. Thanks.
Update:
File size is typcially below 1MB with nothing likely to be above 10MB or even 5MB. No second file will be sent before a first file is acknowledged.
Files flow "downhill" from HQ to node to iPad. Files will never flow "uphill", but there will be some small packets of data (in addition to acks) which are initiated by user action on the iPad. These will go to the local node and then to the HQ. We are probably talking <128 bytes.
I suppose there will also be general control & maintenance traffic at a low rate, in all directions.
For push / pull (publish / subscribe or peer to peer communication), cross-platform message brokers could be used. I am not sure if there are (iOS) client libraries for Microsoft Message Queue (MSMQ), but I would also evaluate open source solutions like HornetQ, Apache ActiveMQ, Apollo, OpenMQ, Apache QPid or RabbitMQ.
All these solutions provide a reliable foundation for distributed messaging, like failover, clustering, persistence, with high performance and many clients attached. On this infrastructure message with any content type (JSON, binary, plain text) can be exchanged, and on top messages can contain routing and priority information. They also support transacted messaging.
There are Delphi and Free Pascal client libraries available for many enterprise quality open source messaging products. (I am am the author of some of them, supporting ActiveMQ, Apollo, HornetQ, OpenMQ and RabbitMQ)
Check out MessagePack: http://msgpack.org/
Also, here's more RPC discussion on SO:
RPC frameworks available?
MessagePack: fast cross-platform serializer and RPC - please share experience
ICE might be of interest to you: http://zeroc.com/index.html
They have an iOS layer: http://zeroc.com/icetouch/index.html
IMHO there are too little requisites to decide what technology to use. What data are exchanged, how often, what size? Are there request/response time constraints? etc. etc. Never start selecting a technology before you understand your needs deeply.

Resources