Locust RPS does not match user count - load-testing

Why does Locust not report RPS greater than or equal to the user count? As you can see from the images below, despite having 100 users, RPS never gets close to 100.
Furthermore, there seem to be dips in the graph when running with a high user count (1 million).

You can reach RPS equal to the user count only if the response time is exactly 1 second.
If the response time is 500 ms, you will get 200 RPS.
If the response time is 2000 ms, you will get 50 RPS.
And so on.
Check out How do I Correlate the Number of (Concurrent) Users with Hits Per Second for a more comprehensive explanation if needed.
If you want to generate a load of 100 RPS, take a look at Locust issue 646 and choose the workaround you like most.
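As a rough illustration of the arithmetic (a back-of-the-envelope sketch, not something Locust reports; the function and numbers are just for demonstration):

```python
# Rough estimate of the RPS a closed-loop load test can generate.
# Assumes each simulated user runs one request at a time.
def estimated_rps(users: int, response_time_s: float, wait_time_s: float = 0.0) -> float:
    """Each user completes one request every (response_time + wait_time) seconds."""
    return users / (response_time_s + wait_time_s)

print(estimated_rps(100, 1.0))  # 100 RPS when responses take exactly 1 s
print(estimated_rps(100, 0.5))  # 200 RPS at 500 ms
print(estimated_rps(100, 2.0))  # 50 RPS at 2000 ms
```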

In addition to response time Dmitri mentioned, your code will also play a factor in the RPS you'll be able to hit. wait_time in particular can limit RPS by increasing the amount of time between one user finishing their tasks and another one being spawned to replace it.
This answer has more details about wait_time's effect on response time, and most of it also applies to trying to hit an RPS target.
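For example, a minimal Locust user class (the host and endpoint below are placeholders, not from the question) shows how wait_time caps throughput:

```python
from locust import HttpUser, task, between

class ProfileUser(HttpUser):
    # Each user pauses 1-3 s between tasks, so on top of the response time
    # a user averages roughly one request every ~2 s or more. With 100 users
    # that tops out well below 100 RPS; use constant(0) to remove the pause.
    wait_time = between(1, 3)
    host = "https://example.com"  # placeholder target

    @task
    def view_page(self):
        self.client.get("/")  # placeholder endpoint
```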
For your second graph, the dips you mentioned, the wild swings in RPS, the general downward trend in RPS, and the upward trend in response time are most likely due to the system you're testing being unable to consistently handle the load you're throwing at it, with a bit of worker overload thrown in for good measure, especially at the higher end of the user count. Depending on your code, Locust may not be able to generate the 250,000 users you want; it looks like it started falling behind after you hit 50,000 users. Each worker may only be able to comfortably maintain around 10,000 users, so you may need to change your code or increase the number of workers to get better performance. See the Locust FAQ for more details.

Related

How can I solve the "Read Request Limit Error"?

I receive this error while trying to export from my datagrid to Google Sheets. How can I solve it?
Don't make too many requests too quickly.
You are either exceeding your quota or you are making too many requests too quickly.
Also, look into batch requests:
https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/batchUpdate
You may be making a call to the API for every single cell you update, which is an easy way to run into the above error.
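As a rough sketch of what a batched write looks like with the Python client (the `service` object, spreadsheet ID, ranges, and values are placeholders, not taken from the question):

```python
# Assumes `service` was built with googleapiclient.discovery.build("sheets", "v4", ...)
# and SPREADSHEET_ID points at your sheet; both are placeholders.
body = {
    "valueInputOption": "USER_ENTERED",
    "data": [
        {"range": "Sheet1!A1:C1", "values": [["id", "name", "total"]]},
        {"range": "Sheet1!A2:C2", "values": [[1, "Acme", 42]]},
    ],
}
# One request updates many cells instead of one request per cell.
service.spreadsheets().values().batchUpdate(
    spreadsheetId=SPREADSHEET_ID, body=body
).execute()
```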
If you must do it on a cell by cell basis, you would have to insert a small delay between requests. Bear in mind that although the usage page says:
This version of the Google Sheets API has a limit of 500 requests per 100 seconds per project, and 100 requests per 100 seconds per user. Limits for reads and writes are tracked separately. There is no daily usage limit.
This does not mean that you can make 100 requests in 1 second and then wait 99 seconds. This will give you a quota error like what you are running into. You would have to put in a one second delay between requests, for example.
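If you do stay cell by cell, pacing is simple to sketch (the iterable and the per-cell write function below are placeholders for your own code):

```python
import time

for cell_update in pending_updates:   # placeholder iterable of pending writes
    write_single_cell(cell_update)    # placeholder for your per-cell API call
    time.sleep(1)                     # ~1 request/second stays inside 100 per 100 s
```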

What's the most efficient way to handle quota for the YouTube Data API when developing a chat bot?

I'm currently developing a chat bot for one specific YouTube channel, which can already fetch messages from the currently active live chat. However, I noticed my quota usage shooting up, so I took the liberty of calculating my quota cost.
My API call currently looks like this: https://www.googleapis.com/youtube/v3/liveChat/messages?liveChatId=some_livechat_id&part=snippet,authorDetails&pageToken=pageTokenIfProvided, which uses up 5 units. I checked this by running one API call and comparing the quota usage before and after (so apologies if this is inaccurate). The response contains pollingIntervalMillis set to 5086 milliseconds. Currently, my bot adds that interval to the current datetime and schedules the next fetch at that time (using Celery), so it fetches messages once every 4-6 seconds. For the calculation I'll assume it always waits 6 seconds.
Calculating my API quota would result in a usage of 72,000 units per day:
10 requests per minute * 60 minutes * 24 hours = 14,400 requests per day
14,400 requests * 5 units per request = 72,000 units per day
This means that if I used pollingIntervalMillis as a guideline for how often to request, I'd reach the maximum quota of 10,000 units after running the bot for just 3 hours and 20 minutes. In order not to use up the quota just by fetching chat messages, I could only run about 1 API call per minute (approximately 1.39). This is very unfeasible for a chat bot, since that covers only fetching messages, not sending any messages to the chat.
So my question is: Is there maybe a more efficient way to fetch chat messages which won't use up the quota so much? Or will I only get this resolved by applying for a quota extension? And if this is only resolved by a quota extension, how much would I need to ask for reliably? Around 100k units? Even more?
I am also asking myself how something like Streamlabs Chatbot (previously known as AnkhBot) accomplishes this without hitting the quota limit despite thousands of users using their API client; their quota must be in the millions or billions.
And another question would be how I'd actually fill out the form, if the bot is still in this "early" state of development?
You pretty much hit the nail on the head. Services like Streamlabs are owned by larger companies, in their case Logitech. They not only have the money to throw around for things like increasing their API quota, but they also have professional relationships with companies like Google to decrease their per unit cost.
As for efficiency, the API costs are easily found in the documentation, but for live chat, as you've found, each call costs 5 units. The only way to lower your overall daily cost is to call the API less frequently. While once per minute is clearly too infrequent, polling once every 15-18 seconds could shrink the quota increase you'd need while keeping the chat bot adequately responsive.
Of course, that all depends on how you intend to use the data, but it's still a reasonable recommendation if the bot remains in the realm of hobbyist usage.
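As a rough sanity check (a sketch reusing the 5-units-per-call figure measured in the question, not an official quota calculator), you can plug any polling interval into the same arithmetic:

```python
# Back-of-the-envelope daily quota usage for polling live chat messages.
# 5 units per liveChat/messages call is the figure measured in the question.
UNITS_PER_CALL = 5
SECONDS_PER_DAY = 24 * 60 * 60

def daily_units(poll_interval_s: float) -> float:
    return SECONDS_PER_DAY / poll_interval_s * UNITS_PER_CALL

print(daily_units(6))    # ~72,000 units: the original 6-second polling
print(daily_units(18))   # ~24,000 units: still above the 10,000 default quota
print(daily_units(45))   #  ~9,600 units: just under the default quota
```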

Cannot scale Azure App Service by number of requests

I'd like to scale my Azure App Service by the average number of requests (over all instances).
The configuration in the image below should increase the instance count by 1 if there are more than 200 requests per minute.
However,
It doesn't work (it is not scaling)
The text says 1 minute, but the graph is grouped into 5-minute intervals?
Does anyone know where I'm going wrong?
If you want to see what is going on behind the scenes, you can turn on logging. It will output data about all the metrics being evaluated and give you some good information on why it isn't deciding to scale.
That being said, it would be good to know how the different settings work so that you know what you are looking at in the metrics. I'm going to go in a slightly different order from the portal as I think it will make more sense.
Time Grain (in mins): We need an amount of time that will count as a "bin" for our metric. It wouldn't make sense to evaluate the data every second or millisecond since it would require more overhead than necessary. In this case, we will pull a value every minute. Since you want to look at the number of requests per minute, a one minute bin makes sense.
Time Grain Statistic: How we evaluate the data inside that bin based on the actual sampling interval. If you want to increase the count when there are more than 200 requests a minute, this needs to be set to Sum.
Duration: How far back we are going to look. You have this set to five minutes, so there will be five values to evaluate, one for each minute. This lets you smooth out sudden spikes.
Time Aggregation: This is closer to the top, but it is similar to the Time Grain Statistic in that it defines how to evaluate the five values pulled from the "bins". You don't want Total or Count for this type of rule, but Average, Minimum, Maximum, and Last are all good choices depending on how aggressively you want to scale.
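To make those four settings concrete, here is a small conceptual sketch (plain Python and made-up numbers, not the Azure API) of how a rule like yours gets evaluated:

```python
# Conceptual model of the autoscale rule; the sample data is invented.
def one_minute_bins(samples_per_minute):
    """Time grain = 1 minute, time grain statistic = Sum."""
    return [sum(minute) for minute in samples_per_minute]

def should_scale_out(samples_per_minute, threshold=200):
    bins = one_minute_bins(samples_per_minute)   # Duration = last 5 bins
    average = sum(bins) / len(bins)              # Time aggregation = Average
    return average > threshold

# Five minutes of per-15-second request samples (placeholder numbers):
last_five_minutes = [[60, 55, 70, 65]] * 5       # 250 requests in each minute
print(should_scale_out(last_five_minutes))       # True -> increase count by 1
```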

How does Twitter Search API rate limit work?

I am not clear about what the Twitter rate limit, "350 requests per hour per access token/user", means. How are the requests being limited? How much data can I get in one request?
The rate limits are based on request, not the amount of data (e.g. bytes) you receive. With that in mind, you can maximize requests by using the available parameters of the particular endpoint you're calling. I'll give you a couple examples to explain what I mean:
One way is to set count, if supported, to the highest available value. On statuses/home_timeline, you can max out count at 200. If you aren't using it now, you're getting the default of 20, which means you would (theoretically) need around 10 queries to get the same amount of data. More queries eat into your rate limit.
Using statuses/home_timeline again, notice that you can page through data using since_id and max_id, as described in Working with Timelines. Essentially, you keep track of the tweets you already requested so you can save on requests by only getting the newest tweets.
Rate limits are in 15-minute windows, so you can pace your requests to minimize the chance of running out in any given window.
Use a combination of Streams and requests, which increases your rate limit.
There are more optimizations like this that help you save request limit, some more subtle than others. Looking at the rate limits per API, studying parameters, and thinking about how the API is used can help you minimize rate limit usage.
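For instance, here is a hedged sketch of paging a timeline with count and max_id (assuming the v1.1 REST endpoint; authentication setup is omitted and `auth` is a placeholder for your OAuth handler):

```python
import requests

URL = "https://api.twitter.com/1.1/statuses/home_timeline.json"

def fetch_recent_tweets(auth, pages=3):
    """Page backwards through the home timeline, 200 tweets per request."""
    params = {"count": 200}  # max out count: ~1 request instead of 10 at the default of 20
    tweets = []
    for _ in range(pages):
        batch = requests.get(URL, params=params, auth=auth).json()
        if not batch:
            break
        tweets.extend(batch)
        # Ask only for tweets older than the oldest one we already have.
        params["max_id"] = batch[-1]["id"] - 1
    return tweets
```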

Tracking impressions/visits per web page

I have a site with several pages for each company, and I want to show how their page is performing in terms of the number of people visiting their profile.
We have already made sure that bots are excluded.
Currently, we are recording each hit in a DB with either an insert (for the first request in a day to a profile) or an update (for subsequent requests in a day to the same profile). But given that requests have gone from a few thousand per day to tens of thousands per day, these inserts/updates are causing major performance issues.
Assuming no JS solution, what would be the best way to handle this?
I am using Ruby on Rails, MySQL, Memcache, Apache, and HAProxy to run the overall show.
Any help will be much appreciated.
Thx
http://www.scribd.com/doc/49575/Scaling-Rails-Presentation-From-Scribd-Launch
You should start reading from slide 17.
I don't think performance is a problem, given that it was possible to build a solution like this for a website as big as Scribd.
Here are 4 ways to address this, from easy estimates to complex and accurate:
Track only a percentage (10% or 1%) of users, then multiply to get an estimate of the count.
After the first 50 counts for a given page, start updating the count only 1/13th of the time, incrementing by 13 each time (sketched below). This helps when a few pages account for most of the hits, while keeping small counts exact. (13 is used because it's hard to notice that the increment isn't 1.)
Save exact counts in a cache layer like memcache or local server memory and save them all to disk when they hit 10 counts or have been in the cache for a certain amount of time.
Build a separate counting layer that 1) always has the current count available in memory, 2) persists the count to its own tables/database, and 3) exposes calls that update both places.
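Here is a rough sketch of the second approach (written in Python for illustration even though the question's stack is Rails; the same logic ports directly to Ruby):

```python
import random

THRESHOLD = 50  # count exactly for the first 50 hits
STEP = 13       # afterwards, increment by 13 roughly 1/13th of the time

def record_hit(current_count: int) -> int:
    """Return the new stored count after one page view."""
    if current_count < THRESHOLD:
        return current_count + 1      # small counts stay exact
    if random.random() < 1.0 / STEP:
        return current_count + STEP   # on average still +1 per hit
    return current_count              # most hits skip the DB write entirely
```

For hot pages this cuts database writes by roughly a factor of 13 while leaving the expected count unchanged.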
