In my Ruby on Rails application I am facing certain performance issues. On certain forms, more than 2,500 requests came from the same IP address at a time.
So I used https://github.com/kickstarter/rack-attack to add a rate limiter, and I track all the requests from each IP address by storing them in a dynamic table. But how can I track them over a certain interval, i.e. how many requests came from the same IP address within 5 seconds?
To limit the number to 10 requests every 5 seconds on a per IP basis, you'd use:
# config/initializers/rack_attack.rb
Rack::Attack.throttle('ip limit', limit: 10, period: 5) do |request|
  request.ip
end
If a single IP makes more than 10 requests within 5 seconds, it gets a "429 Too Many Requests" response.
Note that Rack Attack uses a "fixed window" approach which allows up to twice as many requests for the given duration. For example, with the above settings you could make 10 requests at the end of one window and another 10 at the beginning of the next, all within 5 seconds (or even less).
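To see why that happens, here is a minimal, purely illustrative sketch of fixed-window counting (in Python for neutrality; this is not Rack::Attack's actual implementation):
import time

counters = {}  # (window_id, ip) -> request count within that window

def allow(ip, limit=10, period=5):
    """Fixed-window counting: the window id changes every `period` seconds,
    so a client can use its full limit at the end of one window and again
    at the start of the next."""
    window_id = int(time.time()) // period
    key = (window_id, ip)
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= limit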
You may use Rack::Attack.track and configure it to log the IP address only when a certain number of requests have been made.
# track supports an optional limit and period; the notification below fires
# only when 10 requests are made within 5 seconds from the same IP (configurable).
Rack::Attack.track("Log request", limit: 10, period: 5.seconds) do |req|
  req.ip
end
# Subscribe to the track event using ActiveSupport::Notifications
ActiveSupport::Notifications.subscribe("track.rack_attack") do |name, start, finish, request_id, payload|
  req = payload[:request]
  Rails.logger.info "special_agent: #{req.path}"
end
I previously asked this question about using Locust as the means of delivering a static, repeatable request load to the target server (n requests per second for five minutes, where n is predetermined for each second), and it was determined that it's not readily achievable.
So, I took a step back and reformulated the problem into something that you probably could do using a custom load shape, but I'm not sure how – hence this question.
As in the previous question, we have a 5-minute period of extracted Apache logs, where each second, anywhere from 1 to 36 GET requests were made to an Apache server. From those logs, I can get a distribution of how many times a certain requests-per-second rate appeared; e.g. there's a 1/4000 chance of 36 requests being processed on any given second, 1/50 for 18 requests to be processed on any given second, etc.
I can model the distribution of request rates as a simple Python list: each number between 1 and 36 appears in it as many times as that requests-per-second rate occurred in the 5-minute period captured in the Apache logs. Then, in the tick() method of a custom load shape, I randomly pick an element from the list to inform the (user count, spawn rate) calculation.
Additionally, by using a predetermined random seed, I can make the test runs repeatable to within an acceptable level of variation to be useful in testing my API server configuration changes, since the same random list elements should be retrieved each time.
The problem is that I'm not yet able to "think in Locust", to think in terms of user counts and spawn rates instead of rates of requests received by the server.
The question becomes this:
How do you implement the tick() method of a custom load shape in such a way that the (user count, spawn rate) tuple results in a roughly known distribution of requests per second to be sent, possibly with the help of other configuration options and plugins?
You need to create a Locust User with the tasks you want it to run (e.g. making your HTTP calls). You can define the time between tasks to roughly control the requests per second: if you have a task that makes a single HTTP call and define wait_time = constant(1), you get roughly 1 request per second per user. Locust's spawn_rate is a per-second unit. Since you already have the data you want to reproduce, in 1-second intervals, you can create a LoadTestShape class with a tick() method somewhat like this:
from locust import LoadTestShape

class MyShape(LoadTestShape):
    repro_data = [...]  # the requests-per-second values extracted from the logs
    last_user_count = 0

    def tick(self):
        if len(self.repro_data) > 0:
            requests_per_second = self.repro_data.pop(0)
            # spawn rate must be positive; abs() also covers ramping users down
            requests_per_second_diff = abs(self.last_user_count - requests_per_second)
            self.last_user_count = requests_per_second
            return (requests_per_second, requests_per_second_diff)
        return None  # returning None ends the test
If your first data point is 10 requests, you'd need requests_per_second=10 and requests_per_second_diff=10 to make Locust spin up all 10 users in a single second. If the next second is 25, you'd have requests_per_second=25 and requests_per_second_diff=15. In a Load Shape, spawn_rate also works for decreasing the number of users. So if next is 16, requests_per_second=16 and requests_per_second_diff=9.
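On the User side, here's a minimal sketch that makes each user issue roughly one request per second. Note it swaps the constant(1) mentioned above for constant_pacing(1), so the request time itself counts toward the 1-second spacing; the endpoint path is a placeholder:
from locust import HttpUser, constant_pacing, task

class ApiUser(HttpUser):
    # constant_pacing(1) schedules the task once per second per user,
    # so user count maps roughly to requests per second.
    wait_time = constant_pacing(1)

    @task
    def replay_request(self):
        self.client.get("/")  # placeholder; replay your logged GET paths here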
I am using the Google Ads REST API to pull Ads data. I am not using a client library.
One question: how do you programmatically check current API usage when making requests, so you can stop and wait before continuing? Other APIs, like the Facebook Marketing API, have a header in the result that tells you how many requests you have left, so I could stop and wait. Is there similar info in the Google Ads REST API?
Thank you for reading this.
I've seen nothing in the documentation so far to suggest that there is :(
(There is, separately, a RateExceeded error, which includes a retryAfterSeconds field, if you're going too fast / the API is overloaded.)
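If you do hit that error, a defensive sketch might look like the following. Hedged assumptions: nothing here is a documented contract, so instead of hard-coding a path into the error JSON, the code searches the decoded body for retryAfterSeconds and falls back to a fixed delay:
import re
import time
import requests

def find_key(obj, key):
    """Recursively search a decoded JSON structure for the first `key` value."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        obj = list(obj.values())
    if isinstance(obj, list):
        for item in obj:
            found = find_key(item, key)
            if found is not None:
                return found
    return None

def post_with_backoff(url, headers, payload, max_attempts=3):
    """Retry when the response mentions RateExceeded, honoring retryAfterSeconds."""
    resp = None
    for _ in range(max_attempts):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.ok or "RateExceeded" not in resp.text:
            return resp
        retry_after = find_key(resp.json(), "retryAfterSeconds") or 30
        # Durations may arrive as strings like "30s"; keep only the digits
        time.sleep(int(re.sub(r"\D", "", str(retry_after)) or 30))
    return resp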
Ultimately, I tried the following method, and so far I haven't hit the limit with it:
The basic developer token for the Google Ads API allows 15,000 requests per day as of this answer (link: https://developers.google.com/google-ads/api/docs/access-levels). That's 15,000 / 24 = 625 requests every hour.
Dividing further, that's 625 / 60 ≈ 10.4 requests every minute, so 1 request every 6 seconds ensures I won't hit the rate limit.
So my solution is:
Measure the time it takes to complete a request call and subsequent processing
If the total time is over 6 seconds, perform the next request immediately. Otherwise, wait until 6 seconds have passed in total, then perform the next request.
The below code is what I used to perform this. Hope it helps you guys.
import time
from math import ceil

waiting_seconds = 6

start_time = time.time()

############### PERFORM API REQUEST HERE ###############

# Measure how long the call took; it should total at least 6 seconds
# to stay under the API limit
end_time = time.time()
elapsed = end_time - start_time
if elapsed < waiting_seconds:
    remaining = ceil(waiting_seconds - elapsed)
    time.sleep(remaining)
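A minimal sketch wrapping the same pacing logic in a reusable loop; make_request below is a hypothetical stand-in for whatever performs the API call and processing:
import time
from math import ceil

MIN_INTERVAL = 6  # seconds between request starts, per the math above

def run_paced(request_fns):
    """Call each function in turn, spacing starts at least MIN_INTERVAL apart."""
    results = []
    for make_request in request_fns:  # each item is a hypothetical callable
        start = time.time()
        results.append(make_request())  # the API call plus any processing
        elapsed = time.time() - start
        if elapsed < MIN_INTERVAL:
            time.sleep(ceil(MIN_INTERVAL - elapsed))
    return results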
I have a SQS consumer running in EventConsumerService that needs to handle up to 3K TPS successfully, sometimes upwards of 20K TPS (or 1.2 million messages per minute). For each message processed, I make a REST call to DataService's TCP VIP. I'm trying to perform a load test to find the max TPS that one host can handle in EventConsumerService without overstraining:
Request volume on dependencies, DynamoDB storage, etc
CPU utilization in both EventConsumerService and DataService
Network connections per host
IO stats due to overlogging
DLQ size must be minimal; currently I am seeing my DLQ grow to 500K messages due to 500 Service Unavailable exceptions thrown from DataService, so something must be wrong.
Approximate age of oldest message. I do not want a message sitting in the queue for over X minutes.
Fatals and latency of the REST call to DataService
Active threads
This is how I am performing the performance test:
I set up both my consumer and the other service on one host, the reason being I want to understand the load on both services per host.
I use a TPS generator to fill the SQS queue with a million messages
The EventConsumerService service is already running in production. Once messages started filling the SQS queue, I immediately could see requests being sent to DataService.
Here are the parameters I am tuning to find messagesPolledPerSecond:
messagesPolledPerSecond = (numberOfHosts * numberOfPollers * messageFetchSize) * (1000/(sleepTimeBetweenPollsPerMs+receiveMessageTimePerMs))
messagesInSurge / messagesPolledPerSecond = ageOfOldestMessageSLA
ageOfOldestMessage + settingsUpdatedLatency < latencySLA
The variables for SqsConsumer which I kept constant are:
numberOfHosts = 1
receiveMessageTimePerMs = 60 ms? It's out of my control
Max thread pool size: 300
Other factors are all fair game:
Number of pollers (default 1), I set to 150
Sleep time between polls (default 100 ms), I set to 0 ms
Sleep time when no messages (default 1000 ms), ???
message fetch size (default 1), I set to 10
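Plugging these settings into the messagesPolledPerSecond formula above gives a rough upper bound (a back-of-the-envelope check, not a measurement):
# Back-of-the-envelope check using the settings listed above
number_of_hosts = 1
number_of_pollers = 150
message_fetch_size = 10
sleep_time_between_polls_ms = 0
receive_message_time_ms = 60  # the assumed value noted above

polled_per_second = (number_of_hosts * number_of_pollers * message_fetch_size) * (
    1000 / (sleep_time_between_polls_ms + receive_message_time_ms)
)
print(polled_per_second)  # => 25000.0 messages/s, far above the 3K TPS target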
However, with the above parameters, I am seeing a high number of messages being sent to the DLQ due to server errors, so clearly I have set the values too high. This testing methodology seems highly inefficient, and I am unable to find the optimal TPS that avoids both the tremendous number of messages sent to the DLQ and the high approximate age of the oldest message.
Any guidance on how best to test would be appreciated. It'd be very helpful if we could set up a time to chat. PM me directly.
I am working on an Uber-like cab booking app, using Action Cable for this purpose. After a new order is created, the server gets a list of the 10 nearest drivers and sends each of them the order details in turn (with a 40-second pause between drivers).
Thread.new do
  nearest_drivers.each do |id|
    order_data_for_driver = { ... }
    ActionCable.server.broadcast("driver_#{id}", order_data_for_driver)
    sleep 40
    # Stop notifying further drivers if the user canceled or a driver accepted
    Thread.exit if order.reload.canceled_by_user || order.trip
  end

  cancel_data = { canceled_by_timeout: true }
  ActionCable.server.broadcast("order_#{order.id}", cancel_data)
end
Is there a limit to the number of threads that Rails in production mode can run at the same time, for example, if 100 users create new orders at once? What more elegant solution could be used?
Usually this kind of task is described as needing back pressure. The maximum number of threads on UNIX systems may vary from circa 10K up to an allowed maximum of 500K.
The most common way to handle back pressure is to plug a fast queue in between (like RabbitMQ or similar) and increase the number of queue consumers as the load of requests to process increases.
100/s is nothing, but if you plan to handle thousands of concurrent connections, I strongly encourage you to think twice about the language of choice. Rails is not software created for this kind of task. Neither is Ruby.
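As a language-agnostic illustration of that bounded-queue idea (sketched in Python; notify_drivers is a hypothetical stand-in for the per-order work):
import queue
import threading

orders = queue.Queue(maxsize=1000)  # bounded: enqueuing blocks when full

def notify_drivers(order):
    """Hypothetical stand-in for the per-order driver notification loop."""
    pass

def consumer():
    while True:
        order = orders.get()  # blocks until an order is available
        notify_drivers(order)
        orders.task_done()

# Scale the number of consumers with load instead of one thread per order
for _ in range(10):
    threading.Thread(target=consumer, daemon=True).start()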
I am trying to monitor a post, and plot the number of ups and downs over a 24 hour period (at 5 minute intervals). The core of the code looks like this:
import time

while True:
    post = r.get_submission(submission_id='23a1zz')
    time.sleep(5)
    post.refresh()
    print post.ups
    time.sleep(5 * 60)
However, it does not reflect the true ups and downs. It's stuck at the same number even though the actual post is pretty dynamic.
The API Guidelines state that the same resource shouldn't be requested more often than every 30 seconds. This guideline is backed by a cache, on both Reddit's and PRAW's ends, that will return the same content if it is requested again within a short while. See http://praw.readthedocs.org/en/latest/pages/faq.html#i-made-a-change-but-it-doesn-t-seem-to-have-an-effect
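Given that, a minimal sketch of the loop drops the redundant mid-loop sleep/refresh and simply re-fetches the submission each pass, well outside the 30-second cache window (praw 2.x-era calls, matching the question; the user_agent string is a placeholder):
import time
import praw  # praw 2.x-era API, matching the question's code

r = praw.Reddit(user_agent='vote-tracker (placeholder user agent)')

while True:
    # Re-fetching each pass at a 5-minute spacing stays well clear of the
    # 30-second cache window, so the counts should update.
    post = r.get_submission(submission_id='23a1zz')
    print(post.ups)
    time.sleep(5 * 60)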