Nginx rate limiting for unique IPs - ruby-on-rails

we've been dealing with constant attacks on our authentication url, we're talking millions of requests per day, my guess is they are trying to brute force passwords.
Whenever we would block the IP with the server firewall, few seconds later the attacks would start again from a different IP.
we ended up implementing a combination of throttling through rack-attack plus custom code to dynamically block the IPs in the firewall. But as we improved our software's security, so did the attackers, and now we are seeing every request they make is done from a different IP, one call per IP, still several per seconds, not as many but still an issue.
Now i'm trying to figure out what else can i do to prevent this, we tried recaptcha but quickly ran out of the monthly quota and then nobody can login.
I'm looking into Nginx rate limiter but from what I can see it also uses the IP, considering they now rotate IPs for each request, is there a way that this would work?
Any other suggestions on how to handle this, maybe one of you went through the same thing?
Stack: Nginx and Rails 4, Ubuntu 16.

For your condition, the most effective way to prevent this attack is using captcha. You mentioned that you have used recaptcha, and that can be run out of soon, but if you develeop the captcha yourself and you would have unlimited captcha images.
As for the other prevent method, like lock the IPs, this is always useless when the attackers use IP pool, there are so many IPs(including some IoT devices' IPs) that you can not identify/lock them all even if you use the commercial Threat Intelligence Pool.
So the suggestion like this
Develop the captcha yourself,and implement this on your api,
Identify and lock the IPs that you think malicious
Set some rules to identify the UA and Cookie of the http request (usually the normal request is deferent from the attack)
Use WAF (if you have enough budget)

Related

Is the Broker able to Block unwanted topic spammers?

I have a MQTT environment like this:
there is One (gray) sensor and one Observer that are related by the topic room/temp, so far so good, sensor can publish and the Observer can get the info as it should.
the Issue I have is now: I need to block IN THE BROKER that a 2nd undesired client comes(the orange one),and start to publish into the same topic, as far as I know, MQTT is loose coupled so that observer doesn't care who is pushing the temp values, but I find a security flawless when someone hack my environment and publish non sense triggering my alarms...
any suggestion?
am using eMQTTd by the way and according to this there is nothing in the etc/emqttd.config file I can do to avoid that...
Thanks!
I only have experience with Mosquitto but, from a quick read of the document linked, it looks like there are several ways you could achieve this.
I am unclear if you are talking about an incidental problem here--i.e. bad information is being accidentally sent--or if you are protecting against an active threat.
If you are concerned with incidental overwriting of a value, then the simple clientid solution on (pg. 38) would work.
But my impression is that it would still be transmitted in the clear and thus be of little use to you if you are facing an actual adversary (hacker etc.). If that is your concern simply setup SSL and remove all non-SSL listeners. (See pg. 24). That should limit all traffic to an encrypted channel. Then if you wish add password / user authentication (pg. 38) to complete the security.
Alternatively, depending on your configuration, you could block unapproved ip addresses at the firewall level (i.e. block access to the port that your broker is listening on to all addresses except for the temperature sensor) or using eMQTTd's built in ACL facility (pg. 25). That would be less secure than a full SSL setup but depending upon your needs it might be enough.

So many persistent connections to the server. Is that the right way?

I would like to understand networking services with a large user base a bit better so that I know how to approach a project I am busy with.
The following statements that I make may be incorrect but they still lead to the question that I want to ask...
Please consider Skype and TeamViewer clients. It seems that both keep persistent network connections open to their respective servers. They use these persistent connections to initiate additional connections. Some of these connections are created by means of Hole Punching if the clients are behind NATs. They are then used for direct Peer-to-Peer communications.
Now according to http://expandedramblings.com/index.php/skype-statistics/ there are 300 million users using Skype and 4.9 million daily active users. I would assume that most of that 4.9 million users will most probably have their client apps running most of the day. That is a lot of connections to the Skype servers that are open at any given time.
So to my question; Is this feasible or at least acceptable? I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once? The only reason I can think is that it would be the only way to properly do Hole Punching. Techically, how is this achieved on the server side?
Is this feasible or at least acceptable?
Feasible it certainly is, you mention already two popular apps that do it, so it is very doable in practice.
As for acceptable, to start no internet authority (e.g. IETF) has ever said it is unacceptable to have long-lived connections even with low traffic.
Furthermore, the only components for which this matters are network elements that keep connection/flow state. These are for sure the endpoints and so-called middleboxes like NAT and firewalls. For the client this is only one connection, the server is usually fine tuned by the application developers (who made this choice) themselves, so for these it is acceptable. For middleboxes it's simple: they have no choice, they're designed to just work with all kind of flows, including long-lived persistent connections.
I mean, wouldn't it be better to not have a network connection open while idle and aspecially when there are so many connections open to the servers at once?
Not at all. First of all, that could be 'much' slower as you'd need to set up a full connection before each control-plane call. This is especially noticeable if your RTT is big or if the servers do some complicated connection proxying/redirection for load-balancing/localization purposes.
Next to that this would historically make incoming calls difficult for a huge amount of users. Many ISP's block/blocked unknown incoming connections from the internet by means of a firewall. Similar, if you are behind a NAT device that does not support UPnP or PCP you can't open a port to listen on for your public IP address. So you need it even aside from hole-punching.
The only reason I can think is that it would be the only way to
properly do Hole Punching. Techically, how is this achieved on the
server side?
Technically you can't do proper hole-punching as soon as the NAT devices maintain a full <src-ip,src-port,dest-ip,dest-port,protocol> (classical 5-tuple) flow match. Then the best you can do with 'hole punching' is set up a proxy between peers.
What hole-punching relies on is that the NAT flow lookup is only looking at <src-ip,src-port,protocol> upstream and <dest-ip,dest-port,protocol> downstream to do the translation. In that case both clients just set up a connection to the server, their ip and port gets translated and the server passes this to the other client. The other client can now start sending packets to that translated <ip,port> combination which should work because NAT ignores the server's ip/port. But even if the particular NAT would work like this, some security device (e.g. stateful firewall) might detect session hi-jacking and drop this anyway.
Nowadays you rather use UPnP to open up a port to listen on your public IP which is much easier if supported.

How is SIP scaled for high load?

Basically, I want to implement a VoIP system with sip in a vps server. But it seems that it would not be able to handle more than ~20 simultaneous calls(just bare sip). What are the workarounds to this problem? Can the sip server be just used as a database to tell the clients where to find their intended targets..? Like p2p? I am quite new to sip. Additional info is appreciated.
Your VPS server looks to pretty low-key and when you say it cant handle more than 20 Cps that seems to indicate it topped out on CPU. Correct me if thats not the case.
Options to Scale SIP
Of the Shelf SIP Load balancer - Available in Virtual / Hardware / Opensource and every flavor that you want. It hides a farm of SIP Servers that you have and it can be managed to spread the load accordingly.
Unless the nature of SIP server is defined, it can be difficult to understand the bottlenecks you face and without that its difficult to give a simple solution.
SIP scalability comes from delegating as much work to the endpoints and doing as little on the servers as possible.
What you describe is a "redirect server": it accepts and stores registrations from the endpoints (softphones, hardphones, etc), and responds with "3xx redirect" to incoming calls and forgets about them immediately.
This is probably the most extreme example of server minimization. SIP is a very versatile protocol, it lets you set up your server infrastructure in many different ways with varying degree of control over calls. It lets you trade off features for performance.
Even the flimsiest VPS should be able to handle the signalling for way more than 20 parallel calls even in full "stateful proxy" mode.
Just make sure media (the RTP streams) is not routed through your server. Set up STUN to help firewalled endpoints send media to each other directly.

reading from multiple imap.gmail.com from the same fetchmail client

For my portfolio software I have been using fetchmail to read from a Google email account over IMAP and life has been great. Thanks to the miracle of idle connection supported by imap3, my triggers fire in near-realtime due to server push, much sooner than periodic polling would allow otherwise.
In my basic .fetchmailrc setup, in which a brokerage customer's account emails trade notifications to a dedicated Gmail/Google Apps box, I've had
poll imap.gmail.com proto imap user "youraddress#yourdomain-OR-gmail.com" pass "yoMama" keep nofetchall ssl idle mimedecode limit 29000 no rewrite mda "myCustomSpecialMDAhandler.sh %F %T"
Trouble is, now I need to support reading from multiple email boxes, and hand off the emails to other specialized MDA scripts I wrote. No problem, just add more poll lines to .fetchmailrc, right? Well that doesn't work when the other accounts also use imap.gmail.com. What ends up happening is that while one account reads fine (and not necessary the first one listed, though usually yes), the other is getting "socket error" all day and the emails remain untouched, unread. I can't figure out why and not even sure if it's some mechanism at imap.gmail.com or not, eg. limiting to one IMAP connection from a host. That doesn't seem right since I have kept IMAP connections to many separate Gmail & Google Apps accounts from the same client for years (like Thunderbird) and never noticed this exclusivity problem.
I haven't tried launching multiple fetchmail daemons using separate -f config files (assuming they wouldn't conflict), or deploying one or more getmail and other similar email fetchers in addition. Still trying to avoid that kind of mess--unscaleable the more boxes I have to monitor.
Don't have the reference offhand but somewhere in fetchmail's docs I recall reading that idle is not so much an imap feature as a fetchmail optional trick, which has a (nasty for me) side effect of choking off all other defined accounts from polling until the connection is cut off by some external event or timeout. So at least that would vindicate Google.
Credit to Carl's Whine Rack blog for some tips.
For now I use killall fetchmail; fetchmail -f fetcher.$[$RANDOM % $numaccounts].rc periodically from crontab to cycle reading accounts each defined individually in fetcher.1.rc, fetcher.2.rc, etc. Acceptable while email events are relatively infrequent.

Best practice for rate limiting users of a REST API?

I am putting together a REST API and as I'm unsure how it will scale or what the demand for it will be, I'd like to be able to rate limit uses of it as well as to be able to temporarily refuse requests when the box is over capacity or if there is some kind of slashdotted scenario.
I'd also like to be able to gracefully bring the service down temporarily (while giving clients results that indicate the main service is offline for a bit) when/if I need to scale the service by adding more capacity.
Are there any best practices for this kind of thing? Implementation is Rails with mysql.
This is all done with outer webserver, which listens to the world (i recommend nginx or lighttpd).
Regarding rate limits, nginx is able to limit, i.e. 50 req/minute per each IP, all over get 503 page, which you can customize.
Regarding expected temporary down, in rails world this is done via special maintainance.html page. There is some kind of automation that creates or symlinks that file when rails app servers go down. I'd recommend relying not on file presence, but on actual availability of app server.
But really you are able to start/stop services without losing any connections at all. I.e. you can run separate instance of app server on different UNIX socket/IP port and have balancer (nginx/lighty/haproxy) use that new instance too. Then you shut down old instance and all clients are served with only new one. No connection lost. Of course this scenario is not always possible, depends on type of change you introduced in new version.
haproxy is a balancer-only solution. It can extremely efficiently balance requests to app servers in your farm.
For quite big service you end-up with something like:
api.domain resolving to round-robin N balancers
each balancer proxies requests to M webservers for static and P app servers for dynamic content. Oh well your REST API don't have static files, does it?
For quite small service (under 2K rps) all balancing is done inside one-two webservers.
Good answers already - if you don't want to implement the limiter yourself, there are also solutions like 3scale (http://www.3scale.net) which does rate limiting, analytics etc. for APIs. It works using a plugin (see here for the ruby api plugin) which hooks into the 3scale architecture. You can also use it via varnish and have varnish act as a rate limiting proxy.
I'd recommend implementing the rate limits outside of your application since otherwise the high traffic will still have the effect of killing your app. One good solution is to implement it as part of your apache proxy, with something like mod_evasive

Resources