I'd like to scale my Azure App Service by the average number of requests (over all instances).
The configuration in the image below should increase the instance count by 1 if there are more than 200 requests per minute.
However:
* It doesn't work (it is not scaling).
* The text says 1 minute, but the graph is grouped into 5-minute intervals?
Does anyone know where I'm going wrong?
If you want to see what is going on behind the scenes, you can turn on logging. It will output data about all the metrics being evaluated and give you some good information on why it isn't deciding to scale.
That being said, it would be good to know how the different settings work so that you know what you are looking at in the metrics. I'm going to go in a slightly different order from the portal, as I think it will make more sense.
Time Grain (in mins): We need an amount of time that will count as a "bin" for our metric. It wouldn't make sense to evaluate the data every second or millisecond since it would require more overhead than necessary. In this case, we will pull a value every minute. Since you want to look at the number of requests per minute, a one minute bin makes sense.
Time Grain Statistic: How we evaluate the data inside that bin, based on the actual sampling interval. If you want to increase the instance count when there are more than 200 requests a minute, this needs to be set to Sum.
Duration: How far back we are going to look. You have this set to five minutes, so there will be five values to evaluate, one for each minute. This lets you smooth out sudden spikes.
Time Aggregation: This is closer to the top, but it is similar to the Time Grain Statistic in that it defines how to evaluate the five values pulled from the "bins". You don't want Total or Count for this type of rule, but Average, Minimum, Maximum, and Last are all good choices depending on how aggressively you want to scale.
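To make the interaction between these settings concrete, here is a rough sketch in Python of how a rule like yours would be evaluated. This is purely illustrative (it is not how Azure implements it), and the numbers are made up.

```python
# One value per Time Grain (1-minute bin) across the 5-minute Duration window.
# Each value is the Time Grain Statistic (Sum) of the requests in that bin.
per_minute_requests = [180, 220, 240, 150, 260]

# Time Aggregation = Average: combine the five 1-minute values into one number.
aggregated = sum(per_minute_requests) / len(per_minute_requests)  # 210.0

# The rule only fires if the aggregated value crosses the threshold.
if aggregated > 200:
    print("Scale out: add 1 instance")
else:
    print("No scaling action")
```

This also shows why a single busy minute may not trigger the rule: the value aggregated over the whole Duration has to cross the threshold, not just one bin.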
Why does Locust not report RPS greater than or equal to the user count? As you can see from the images below, despite having 100 users, RPS never gets close to 100.
Furthermore, there seem to be dips in the graph when running with a high user count (1 million).
You can reach an RPS equal to the user count only if the response time is exactly 1 second.
* If the response time is 500 ms, you will get 200 RPS.
* If the response time is 2000 ms, you will get 50 RPS.
* And so on.
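In other words, with no wait time between requests, RPS is roughly the user count divided by the response time in seconds. A quick illustration in Python:

```python
users = 100

for response_time_s in (0.5, 1.0, 2.0):
    rps = users / response_time_s
    print(f"{users} users, {response_time_s} s response time -> {rps:.0f} RPS")

# 100 users, 0.5 s response time -> 200 RPS
# 100 users, 1.0 s response time -> 100 RPS
# 100 users, 2.0 s response time -> 50 RPS
```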
Check out How do I Correlate the Number of (Concurrent) Users with Hits Per Second for a more comprehensive explanation if needed.
If you want to generate a load of 100 RPS, take a look at Locust issue 646 and choose the workaround you like the most.
In addition to the response time Dmitri mentioned, your code will also play a factor in the RPS you'll be able to hit. wait_time in particular can limit RPS by increasing the amount of time between one user finishing its tasks and another one being spawned to replace it.
This answer has more details about wait_time's effect on response time, but most of that also applies here when trying to hit an RPS target.
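As a rough illustration (assuming a recent Locust version; the class name and endpoint are made up), a non-zero wait_time caps how many requests each simulated user can contribute:

```python
from locust import HttpUser, task, between

class ProfileUser(HttpUser):
    # Each user pauses 1-2 seconds between tasks, on top of the response
    # time, so a single user can never exceed about one request per second.
    wait_time = between(1, 2)

    @task
    def view_page(self):
        self.client.get("/")  # placeholder endpoint
```

With wait_time = between(1, 2) and, say, a 0.5 s response time, each user averages roughly one request every 2 seconds, so 100 users would top out around 50 RPS rather than 200.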
For your second graph, the dips you mentioned, the wild swings and general downward trend in RPS, and the upward trend in response time are most likely due to the system you're testing being unable to consistently handle the load you're throwing at it, with a bit of worker overload thrown in for good measure, especially at the higher end of the user count. Depending on your code, Locust may not be able to generate the 250,000 users you're wanting; it looks like it may have started falling behind after you hit 50,000 users. Each worker may only be able to easily maintain 10,000 users, so you may need to make some changes to your code or increase the number of workers to get better performance. See the Locust FAQ for more details.
I'm having some difficulty thinking of a good way to assess the scalability of my school project. The assignment is, simply put, a Twitter-like service. Very bare bones. The main goal is to make it as scalable as possible. However, it's equally important to know how and why the assignment would scale. So the point is not really to create a very scalable project, but mainly to create a few different architectures for the server and see which one outperforms the others and why.
So far I have the basic server architecture like this:
* 1 main process which holds the data
* 1 unique process server-side per user
A user sends messages to his own server-side process, which then simply delegates those messages to the central process.
To test this I would spawn 1 or more processes which would act as clients. I would spam the server with tweets and then assess how well it can withstand a certain load.
Now, to assess the scalability I came up with the following metrics:
First off, a process is a bottleneck if its message queue is piling up. So I would store the queue length every time a tweet is processed (or every N tweets) and at the end calculate the average. If I run it on more cores and the average queue length goes down, it scales better.
Second, if I create N users to spam the server (on N processes or fewer), I simply time how long it takes for the server to process all these tweets.
Is there a better way to do this? I can't stop thinking that there should be better metrics...
Edit:
So far I have tried fprof and eprof. These tools, however, show me how much time is spent in certain functions. While this is a good indicator of where I can improve my code, it's not really a good indicator of scalability. It would be better if it showed, for example, the time spent per process.
Look at percept and percept2 if you are really interested in this.
I am creating an app which graphs the total number of accepted points on an iteration-by-iteration basis, compared to all points accepted within that iteration (regardless of project). Currently, I am using a WsapiDataStore call with filters to pull only from the chosen iterations. However, this requires pulling all user stories within the iteration and then summing the Plan Estimate fields of each. It works, but it takes a pretty long time (about 20-30 seconds) to pull data that I would assume could be queried in a single call. Am I correct in my thinking, or is this really the easiest way?
Rally's API does not support server side aggregations. Unfortunately pulling that data into local memory is the only way to do calculations like this.
How would you update attributes in your database based on the time of day or what day it is? I have three attributes (energy, hunger, and happiness) that I want to decrease by ten every hour, but I don't quite know how to go about doing this. I know there are timestamps in the database, but I don't really know how to use them. I also want to change the player's skills every day based on their job: if you have a certain job, add 2 to intelligence every day. But I don't know how to add that 2 every day. I would greatly appreciate any help with this.
A couple of options:
cronjob: You could set up your cronjob to access the database directly through a SQL script (probably the simplest solution of all in terms of setup) or go through your Rails application first (e.g. in case you need to run additional business logic before updating the database - you mentioned something about updating the database based on the user's job). See this post for the latter approach. A rough sketch of the direct-SQL route follows after this list.
Background task: Take a look at Starling/Workling or Backgroundrb. You can use either of these to run a background task that could update your database at regular intervals.
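For the direct-SQL cron route, something like the following could run once an hour (shown as a standalone Python script purely for illustration; the table and column names are made up, and in a Rails app this would more likely be a rake task or a plain .sql file):

```python
#!/usr/bin/env python
# hourly_decay.py -- run from cron, e.g.: 0 * * * * /usr/bin/python hourly_decay.py
import MySQLdb  # assumes the MySQL-python driver is installed

conn = MySQLdb.connect(host="localhost", user="game", passwd="secret", db="game_db")
cur = conn.cursor()

# Knock 10 off each stat for every player, never going below zero.
cur.execute("""
    UPDATE players
       SET energy    = GREATEST(energy    - 10, 0),
           hunger    = GREATEST(hunger    - 10, 0),
           happiness = GREATEST(happiness - 10, 0)
""")

conn.commit()
conn.close()
```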
There are two common but fundamentally different ways of achieving this:
During each request, simulate the amount of time that has passed since the last request. If a user makes two requests three hours apart, simulate three hours passing by subtracting 30 happiness (10/hour times 3 hours) all at once. This is less resource intensive, but requires a little more thinking on your part. It's not difficult for something as simple as "lower a value by 10 every hour", but more complex interactions are harder to model. A sketch of this approach follows below.
Run a cron job which invokes an action in your program every hour, on the hour, to deduct 10 happiness from each account. This is easier conceptually, but involves a lot of overhead if you have many users, especially when some of them are idle for long periods.
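Here is a minimal sketch of the first approach, in Python rather than Rails and with made-up attribute names, applying the decay lazily whenever the record is loaded:

```python
from datetime import datetime, timedelta

DECAY_PER_HOUR = 10  # energy, hunger, and happiness each drop by 10 per hour

def apply_decay(player, now=None):
    """Catch a player's stats up to the current time.

    `player` is assumed to have numeric `energy`, `hunger`, and `happiness`
    attributes plus a `last_decayed_at` datetime; the names are illustrative.
    """
    now = now or datetime.utcnow()
    hours_passed = int((now - player.last_decayed_at).total_seconds() // 3600)
    if hours_passed <= 0:
        return player

    decay = DECAY_PER_HOUR * hours_passed
    player.energy = max(player.energy - decay, 0)
    player.hunger = max(player.hunger - decay, 0)
    player.happiness = max(player.happiness - decay, 0)

    # Only advance the marker by whole hours so partial hours carry over.
    player.last_decayed_at += timedelta(hours=hours_passed)
    return player
```

The daily job-based skill bonus can be handled the same way, just with a day-sized step instead of an hour-sized one.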
I have a site with several pages for each company, and I want to show how their page is performing in terms of the number of people visiting the profile.
We have already made sure that bots are excluded.
Currently, we are recording each hit in a DB with either an insert (for the first request in a day to a profile) or an update (for the following requests in a day to a profile). But, given that requests have gone from a few thousand per day to tens of thousands per day, these inserts/updates are causing major performance issues.
Assuming no JS solution, what would be the best way to handle this?
I am using Ruby on Rails, MySQL, Memcache, Apache, and HaProxy for running the overall show.
Any help will be much appreciated.
Thx
http://www.scribd.com/doc/49575/Scaling-Rails-Presentation-From-Scribd-Launch
You should start reading from slide 17.
I think performance isn't a problem if it's possible to build a solution like this for a website as big as Scribd.
Here are 4 ways to address this, from easy estimates to complex and accurate:
1. Track only a percentage (10% or 1%) of users, then multiply to get an estimate of the count.
2. After the first 50 counts for a given page, start updating the count only 1/13th of the time, by an increment of 13. This helps when a few pages account for most of the counts, while keeping small counts accurate. (Use 13 because it's hard to notice that the increment isn't 1.) A sketch of this follows after the list.
3. Keep exact counts in a cache layer like memcache or local server memory and save them all to disk when they hit 10 counts or have been in the cache for a certain amount of time.
4. Build a separate counting layer that 1) always has the current count available in memory, 2) persists the count to its own tables/database, and 3) has calls that adjust both places.
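Here is a minimal sketch of option 2 (the sampled increment), purely illustrative, with made-up names:

```python
import random

EXACT_UP_TO = 50  # count every hit until the page has 50 counts
STEP = 13         # after that, increment by 13 about 1/13th of the time

def next_count(current_count):
    """Return the new count for one page view.

    Small counts stay exact; large counts are updated with sampled writes,
    which cuts database writes roughly 13x while keeping the expected
    increment per hit equal to 1.
    """
    if current_count < EXACT_UP_TO:
        return current_count + 1     # exact counting while numbers are small
    if random.random() < 1.0 / STEP:
        return current_count + STEP  # one write stands in for thirteen hits
    return current_count             # most hits skip the write entirely
```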