Best practice for rate limiting users of a REST API? - ruby-on-rails

I am putting together a REST API and as I'm unsure how it will scale or what the demand for it will be, I'd like to be able to rate limit uses of it as well as to be able to temporarily refuse requests when the box is over capacity or if there is some kind of slashdotted scenario.
I'd also like to be able to gracefully bring the service down temporarily (while giving clients results that indicate the main service is offline for a bit) when/if I need to scale the service by adding more capacity.
Are there any best practices for this kind of thing? Implementation is Rails with MySQL.

This is all done by the outer web server, the one that listens to the world (I recommend nginx or lighttpd).
Regarding rate limits, nginx can limit requests to, e.g., 50 req/minute per IP; everything over the limit gets a 503 page, which you can customize.
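To make the policy concrete, here is a minimal sketch of a per-IP limit like the 50 req/minute one mentioned above, written as a fixed-window counter. This is only an illustration of the idea, not how nginx implements its limiter; the class name, the method name and the injectable clock are all made up for the example.

```python
import time


class PerIpRateLimiter:
    """Fixed-window counter: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit=50, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable so the limiter can be tested
        self._buckets = {}          # ip -> (window_start, request_count)

    def status_for(self, ip):
        """Return the HTTP status the proxy would answer with: 200 or 503."""
        now = self.clock()
        start, count = self._buckets.get(ip, (now, 0))
        if now - start >= self.window:
            # The window has expired, so start counting again for this IP.
            start, count = now, 0
        count += 1
        self._buckets[ip] = (start, count)
        return 200 if count <= self.limit else 503
```

In a real deployment this logic would live in the front proxy rather than in the Rails app, so rejected requests never reach the app servers.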
Regarding planned temporary downtime, in the Rails world this is done via a special maintenance.html page. There is some kind of automation that creates or symlinks that file when the Rails app servers go down. I'd recommend relying not on the file's presence but on the actual availability of the app server.
But really, you can start and stop services without losing any connections at all. For example, you can run a separate instance of the app server on a different UNIX socket or IP port and have the balancer (nginx/lighty/haproxy) use that new instance too. Then you shut down the old instance, and all clients are served by the new one only. No connection is lost. Of course, this scenario is not always possible; it depends on the type of change you introduced in the new version.
haproxy is a balancer-only solution. It balances requests across the app servers in your farm extremely efficiently.
For a fairly big service you end up with something like:
api.domain resolving round-robin to N balancers
each balancer proxying requests to M web servers for static content and P app servers for dynamic content. Oh well, your REST API doesn't have static files, does it?
For a fairly small service (under 2K rps) all balancing is done inside one or two web servers.
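The instance swap described above (start the new app server next to the old one, then take the old one out) can be sketched as a toy round-robin balancer. This is a simplified model, not the configuration of nginx, lighttpd or haproxy; the class and the backend names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer whose backend list can change while it runs."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._next = 0

    def add(self, backend):
        # Bring a freshly started app server instance into rotation.
        self._backends.append(backend)

    def remove(self, backend):
        # Take an instance out of rotation before shutting it down.
        self._backends.remove(backend)

    def pick(self):
        if not self._backends:
            # A real balancer would answer 503 here.
            raise RuntimeError("no backends available")
        backend = self._backends[self._next % len(self._backends)]
        self._next += 1
        return backend


lb = RoundRobinBalancer(["app_v1"])
lb.add("app_v2")      # new version starts on its own socket/port
lb.remove("app_v1")   # old version leaves rotation, then can be stopped
```

A real balancer would also let in-flight requests on the old instance finish before it is stopped; the sketch leaves that part out.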

Good answers already. If you don't want to implement the limiter yourself, there are also solutions like 3scale (http://www.3scale.net), which does rate limiting, analytics, etc. for APIs. It works using a plugin (see here for the Ruby API plugin) that hooks into the 3scale architecture. You can also use it via Varnish and have Varnish act as a rate-limiting proxy.

I'd recommend implementing the rate limits outside of your application, since otherwise the high traffic will still reach your app and can kill it. One good solution is to implement it as part of your Apache proxy, with something like mod_evasive.

Related

Comparison between service worker and AppCache

What are the core differences between Service Worker and AppCache? What are the pros and cons of each, and when should one be preferred over the other?
The primary difference is that AppCache is a high-level, declarative API, with which you specify the set of resources you'd like the browser to cache; whereas Service Worker is a low-level, imperative, event-driven API with which you write a script that can intercept fetch events and cache their responses along with doing other things (like displaying push notifications).
The pros and cons are largely a function of API design: theoretically, AppCache is easier to use, while having more limited use cases; whereas Service Worker is harder to use, but is more flexible.
Nevertheless, AppCache is considered hard to use in practice due to poor design (see Application Cache Is A Douchebag for a list of design issues). And it has been deprecated, so it is being removed from browsers (per Using the application cache).
Thus the only reason to prefer AppCache is to offline an app on browsers that don't yet support Service Worker, as Kenneth Ormandy recommends in Don’t Wait for ServiceWorker: Adding Offline Support with One-Line.
Compare Can I use Service Workers? to Can I use Offline web applications? to see the differences in browser support. But note that browsers that support Service Worker, like Chrome and Firefox, are removing support for AppCache, so you'll need to implement both to offline your app across all browsers that support either standard.
In addition to what Myk Melez said, one of the main benefits of Service Workers over Application Cache is that Application Cache only works when the user is disconnected from the network, so you cannot handle situations such as:
1- "Slow network": your connection signal is strong, but some external entity (server, routes, etc.) is delaying the transmission to your specific application.
2- "Lie-fi": your phone shows it is connected to a Wi-Fi or cell network, but the signal is so weak that it is effectively not connected.
A Service Worker acts like middleware that gives you control over the requests the browser is making: you can intercept a request and respond however you want, whether you are connected or not. So you can implement the "offline first" principle.
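The "offline first" idea can be modelled independently of the browser. The sketch below is plain Python, not the Service Worker API: it only shows the cache-first decision a fetch handler would make. The class name and the `network_fetch` callable are assumptions made for the example.

```python
class OfflineFirstFetcher:
    """Cache-first strategy: answer from the cache, fall back to the network."""

    def __init__(self, network_fetch):
        self._network_fetch = network_fetch  # callable(url) -> body, may raise OSError
        self._cache = {}

    def fetch(self, url):
        # Cached responses never wait on the network, so a slow network
        # or lie-fi does not delay them.
        if url in self._cache:
            return self._cache[url], "cache"
        try:
            body = self._network_fetch(url)
        except OSError:
            # Not cached and the network failed: nothing to serve.
            return None, "unavailable"
        self._cache[url] = body
        return body, "network"
```

In an actual Service Worker the same decision would be made inside the fetch event handler, with the browser's cache storage in place of the dictionary.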

How is SIP scaled for high load?

Basically, I want to implement a VoIP system with SIP on a VPS. But it seems that it would not be able to handle more than ~20 simultaneous calls (just bare SIP). What are the workarounds for this problem? Can the SIP server just be used as a database that tells the clients where to find their intended targets, like P2P? I am quite new to SIP. Additional info is appreciated.
Your VPS looks pretty low-end, and when you say it can't handle more than ~20 simultaneous calls, that seems to indicate it topped out on CPU. Correct me if that's not the case.
Options to Scale SIP
Off-the-shelf SIP load balancer: available in virtual, hardware and open-source flavors. It hides the farm of SIP servers that you have and can be configured to spread the load across them.
Unless the nature of the SIP server is defined, it is difficult to understand the bottlenecks you face, and without that it's difficult to give a simple solution.
SIP scalability comes from delegating as much work as possible to the endpoints and doing as little as possible on the servers.
What you describe is a "redirect server": it accepts and stores registrations from the endpoints (softphones, hardphones, etc.), responds with a "3xx redirect" to incoming calls and forgets about them immediately.
This is probably the most extreme example of server minimization. SIP is a very versatile protocol: it lets you set up your server infrastructure in many different ways, with varying degrees of control over calls. It lets you trade off features for performance.
Even the flimsiest VPS should be able to handle the signalling for way more than 20 parallel calls even in full "stateful proxy" mode.
Just make sure media (the RTP streams) is not routed through your server. Set up STUN to help firewalled endpoints send media to each other directly.
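To show how little state a redirect server keeps, here is a toy model of the behaviour described above: it stores registrations and answers an incoming call with a 3xx response pointing at the callee's current contact. It does no SIP parsing or networking, and the class name, method names and addresses are hypothetical.

```python
class RedirectServer:
    """Toy model of a SIP redirect server: a location table and nothing else."""

    def __init__(self):
        self._locations = {}  # address-of-record -> current contact URI

    def register(self, aor, contact):
        # An endpoint tells the server where it can currently be reached.
        self._locations[aor] = contact

    def invite(self, target_aor):
        # Answer with a redirect and keep no per-call state afterwards.
        contact = self._locations.get(target_aor)
        if contact is None:
            return "SIP/2.0 404 Not Found", None
        return "SIP/2.0 302 Moved Temporarily", contact


server = RedirectServer()
server.register("sip:bob@example.com", "sip:bob@192.0.2.10:5060")
print(server.invite("sip:bob@example.com"))
```

The caller then contacts the returned address directly, so neither the rest of the signalling nor the media has to pass through the server.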

MassTransit in ASP.NET MVC site?

I'd like to decouple a number of business objects that my website is using to support actions of the users.
My website is a SaaS/B2B site and I do not anticipate a need for "mega scale". My primary issue is the need to decouple business objects from each other and to perform occasional longer-running operations asynchronously, outside of the threads that handle user traffic.
Having said that, I really do not want a separate set of servers to process my messages, and would prefer that the web servers just host MassTransit (or other bus software) internally, in memory. Assured message delivery is also not yet my most important concern. I plan to "outsource" a number of supporting business actions to the bus so that they do not pollute my main business services/objects.
Is this possible? Do I need Loopback as a transport for now, or do I need full RabbitMQ? Will RabbitMQ require me to install yet another set of servers to host it?
TIA
Loopback is just for testing. Installing RMQ is the right path. You don't NEED different servers for it, but I would suggest it: if you offload work to a bus, you don't really want that work contending with the website for resources. That said, you can run RMQ locally without any issue. If message volume is low, so is resource usage in RMQ. When you reach higher volumes, IO can become a problem with RabbitMQ (or any MQ).

Should an iphone app communicate directly with a cassandra backend?

Obviously there are multiple steps and phases of implementing such a thing.
I was thinking I would eventually have a web server that takes HTTP JSON requests from the iOS app, queries the Cassandra backend and sends the results back. I could still load balance and do all that fancy stuff, provide a logic layer on the server side, and keep the client app lightweight.
I'm not sure I understand how Cassandra clients fit in, though. It seems like the Cassandra Objective-C client could eliminate the need for the above approach.
I saw another question and answer, but it wasn't clear, perhaps because it varies with the need.
An iPhone app should not directly connect to a Cassandra backend or any other DB store.
First of all, talking to a database often requires adapting a very specific binary protocol (for Cassandra in particular, binary CQL or Thrift). Writing an adapter that would let your Objective-C app communicate in this binary protocol is a major piece of work, and could easily cost more than the rest of your app in effort. If you hide the DB behind a web-server, however, you will be able to select from a variety of existing adapters available in different server-side languages, meaning that you don't need to redo all that low-level work. You'll only be responsible for a relatively small piece of server-side code that would translate your REST queries and forward them to one of the Cassandra adapters (which expose easy-to-use interfaces).
Secondly, if you wanted to connect to a remote database from the phone, your database server would have to open its ports to the internet at large, which is a very bad security practice, even if you use SSL and user credentials. Again, if you hide behind a web server, you will be putting in a layer of technology that has evolved for decades to remain secure on the public internet.
Finally, having your phone talk to Cassandra directly is a poor architectural pattern. When you write apps that communicate on the internet, you want them to know as little as possible about each other, only how to talk to each other (preferably in a standard protocol). That way you can replace or upgrade individual components while keeping everything else the same. This may not sound like a lot, but is actually the main reason why phones, or web browsers, don't directly talk to databases. (If this setup were a good idea in principle, the first two problems could be easily solved given enough engineering effort.)
The approach you first suggested with JSON and the web server is the only correct way to go.
Use something like a RESTful API; there are many reasons for that.
If your servers' IP addresses change, you have to update all clients. If you add more nodes, you need to update all clients. If you upgrade Cassandra and some functions change, your clients will break and you need to update all of them.
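As a rough illustration of the thin server-side layer the answers recommend, the sketch below turns a JSON request into one parameterized query and returns a JSON reply. Everything specific here is an assumption for the example: the `users` table, its columns, the `user_id` field, and `run_query`, which stands in for whatever Cassandra adapter the server would actually use.

```python
import json


def handle_request(body, run_query):
    """Translate one JSON request into a query; return (status, JSON reply)."""
    try:
        params = json.loads(body)
        user_id = str(params["user_id"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON or a missing field never reaches the database.
        return 400, json.dumps({"error": "bad request"})
    # The phone only knows the JSON contract. The query, table layout and
    # node addresses stay on the server and can change without an app update.
    rows = run_query("SELECT name, email FROM users WHERE user_id = %s", (user_id,))
    return 200, json.dumps({"results": rows})
```

Because the database access is passed in as a callable, the handler can be tested with a fake and the real adapter can be swapped without touching the client.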

How to run two grails apps on the same machine and have them not share a rabbitMQ

I have a grails app running with a single rabbit node. It is great. I want to fire up the same app a second time on the same machine on a different port. Currently, both apps answer jobs from both apps. I want their rabbits to be independent. What is the easiest way to ensure that each app only responds to the messages it sends? Multiple rabbit queues?
You can provide a virtualhost entry in the grails configuration:
rabbitmq.connectionfactory.virtualHost: the name of the virtual host to connect to
Define two different vhosts in RabbitMQ, and each grails app will have their very own configured area to use. Messages sent through one vhost will only be available on that vhost, effectively separating the two grails apps without having to change queue setup or other internal parts of each app - just the configuration of the connection.
Remember that access control is performed on a per vhost basis, so you'll have to give your user access to each vhost in rabbitmq.
As @fiskfisk said, multiple vhosts is an option, and would work particularly well if you have a complex set of queues, exchanges, and bindings. There are some downsides to using a new vhost for the second application, including duplication of access control management, as well as some minor performance overhead.
If you have a fairly simple queue/exchange/binding setup, I would suggest pointing the second app at a queue with a different name, or giving your app the ability to be runtime-configured to either use a different queue, or to leverage the topic-based routing within RabbitMQ and have each app flag their messages with an app-specific prefix (or something similar).
One advantage of using topic routing to differentiate apps is that you can easily dip into the full stream of messages and do other things with that stream that you didn't foresee initially, including things like archival logging or audit logging, as well as other metrics collection or analysis.
tl;dr;
For long-term flexibility, have each instance of your application send messages to queues based on topic-routing.
For quick-and-dirty / get-it-working-yesterday, use a separate vhost for each instance of your application.
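For the topic-routing option, here is a small sketch of how topic patterns decide which messages a binding receives: `*` matches exactly one dot-separated word and `#` matches zero or more. It is a standalone re-implementation for illustration, not RabbitMQ's code, and the routing keys such as `app1.jobs.resize` are hypothetical.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches the routing key."""

    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


# Each app instance binds its queue with its own prefix...
print(topic_matches("app1.#", "app1.jobs.resize"))
print(topic_matches("app1.#", "app2.jobs.resize"))
# ...while an audit logger can bind with '#' and see the full stream.
print(topic_matches("#", "app2.jobs.resize"))
```

With this scheme the two app instances share one exchange but only consume their own messages, and the audit or metrics consumers mentioned above need nothing more than a wider binding.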
