Difference in Dockerized Redis performance between server and local system - docker

I ran a stress test on my server, and the latency for 1000 requests per second was 1 second. I found out that the latency problem was caused by Redis, so I checked the Dockerized Redis performance (benchmark) on CentOS 7 (CentOS runs on a VMware virtual machine, CPU: 22 cores, RAM: 30 GB) with
redis-benchmark -q -n 100000
which gave the following results:
PING_INLINE: 26420.08 requests per second
PING_BULK: 27389.76 requests per second
SET: 27144.41 requests per second
GET: 26702.27 requests per second
INCR: 27041.64 requests per second
LPUSH: 27203.48 requests per second
RPUSH: 27188.69 requests per second
LPOP: 27005.13 requests per second
RPOP: 27367.27 requests per second
SADD: 26645.35 requests per second
HSET: 26881.72 requests per second
SPOP: 27624.31 requests per second
LPUSH (needed to benchmark LRANGE): 27100.27 requests per second
LRANGE_100 (first 100 elements): 20703.93 requests per second
LRANGE_300 (first 300 elements): 11763.32 requests per second
LRANGE_500 (first 450 elements): 9627.42 requests per second
LRANGE_600 (first 600 elements): 8078.20 requests per second
MSET (10 keys): 26709.40 requests per second
But when I check the Dockerized Redis performance (benchmark) on my laptop, which runs Ubuntu 19.04 (CPU: Core i3, RAM: 12 GB), with
redis-benchmark -q -n 100000
it gives the following results:
PING_INLINE: 117096.02 requests per second
PING_BULK: 126742.72 requests per second
SET: 119904.08 requests per second
GET: 126903.55 requests per second
INCR: 127064.80 requests per second
LPUSH: 111482.72 requests per second
RPUSH: 121359.23 requests per second
LPOP: 112994.35 requests per second
RPOP: 123152.71 requests per second
SADD: 130378.09 requests per second
HSET: 130039.02 requests per second
SPOP: 103199.18 requests per second
LPUSH (needed to benchmark LRANGE): 88809.95 requests per second
LRANGE_100 (first 100 elements): 51046.45 requests per second
LRANGE_300 (first 300 elements): 17853.96 requests per second
LRANGE_500 (first 450 elements): 12784.45 requests per second
LRANGE_600 (first 600 elements): 9744.69 requests per second
MSET (10 keys): 132802.12 requests per second
Why is the performance of my local system so much better than the server's, despite the server's higher hardware capabilities?

Related

Getting Locust to send a predefined distribution of requests per second

I previously asked this question about using Locust as the means of delivering a static, repeatable request load to the target server (n requests per second for five minutes, where n is predetermined for each second), and it was determined that it's not readily achievable.
So, I took a step back and reformulated the problem into something that you probably could do using a custom load shape, but I'm not sure how – hence this question.
As in the previous question, we have a 5-minute period of extracted Apache logs, where each second, anywhere from 1 to 36 GET requests were made to an Apache server. From those logs, I can get a distribution of how many times a certain requests-per-second rate appeared; e.g. there's a 1/4000 chance of 36 requests being processed on any given second, 1/50 for 18 requests to be processed on any given second, etc.
I can model the distribution of request rates as a simple Python list: each number between 1 and 36 appears in it as many times as that requests-per-second rate occurred in the 5-minute period captured in the Apache logs, and then I just randomly pick a number from it in the tick() method of a custom load shape to get a value that informs the (user count, spawn rate) calculation.
Additionally, by using a predetermined random seed, I can make the test runs repeatable to within an acceptable level of variation to be useful in testing my API server configuration changes, since the same random list elements should be retrieved each time.
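For illustration, a minimal sketch of that sampling idea (the counts and names below are invented for illustration, not taken from the actual logs):

import random

# Hypothetical distribution: each rate appears as many times as it occurred
# in the 5-minute log window (these counts are made up).
rate_distribution = [1] * 40 + [18] * 80 + [36] * 1

random.seed(42)  # fixed seed so every test run draws the same sequence of rates

def next_requests_per_second():
    # Draw one per-second request rate according to the observed distribution.
    return random.choice(rate_distribution)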
The problem is that I'm not yet able to "think in Locust", to think in terms of user counts and spawn rates instead of rates of requests received by the server.
The question becomes this:
How do you implement the tick() method of a custom load shape in such a way that the (user count, spawn rate) tuple results in a roughly known distribution of requests per second to be sent, possibly with the help of other configuration options and plugins?
You need to create a Locust User with the tasks you want it to run (e.g. making your HTTP calls). You can define the time between tasks to roughly control the requests per second: if you have a task that makes a single HTTP call and define wait_time = constant(1), you get roughly 1 request per second per user. Locust's spawn_rate is a per-second unit. Since you already have the data you want to reproduce, and it's in 1-second intervals, you can then create a LoadTestShape class with a tick() method somewhat like this:
from locust import LoadTestShape

class MyShape(LoadTestShape):
    repro_data = […]  # per-second request rates extracted from the logs
    last_user_count = 0

    def tick(self):
        if len(self.repro_data) > 0:
            requests_per_second = self.repro_data.pop(0)
            # spawn_rate must cover the jump from the previous user count
            requests_per_second_diff = abs(self.last_user_count - requests_per_second)
            self.last_user_count = requests_per_second
            return (requests_per_second, requests_per_second_diff)
        return None
If your first data point is 10 requests, you'd need requests_per_second=10 and requests_per_second_diff=10 to make Locust spin up all 10 users in a single second. If the next second is 25, you'd have requests_per_second=25 and requests_per_second_diff=15. In a Load Shape, spawn_rate also works for decreasing the number of users. So if next is 16, requests_per_second=16 and requests_per_second_diff=9.
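For completeness, here's a minimal sketch of the kind of User described at the start of this answer (the endpoint path is a placeholder, not something from the original question):

from locust import HttpUser, constant, task

class ApiUser(HttpUser):
    wait_time = constant(1)  # roughly one request per user per second

    @task
    def single_call(self):
        # One HTTP call per iteration; replace the path with your own endpoint.
        self.client.get("/")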

Understanding how k6 manages, at a low level, a large number of API calls in a short period of time

I'm new to k6 and I'm sorry if I'm asking something naive. I'm trying to understand how the tool manages the network calls under the hood. Does it execute them at the maximum rate it can? Does it queue them based on the System Under Test's response time?
I need to understand this because I'm running a lot of tests using both k6 run and k6 cloud, but I can't make more than ~2000 requests per second (looking at the k6 results). I was wondering whether k6 implements some kind of back-pressure mechanism when it detects that my system is "slow", or whether there is some other reason why I can't overcome that limit.
I read here that it is possible to make 300,000 requests per second and that the cloud environment is already configured for that. I also tried to manually configure my machine, but nothing changed.
E.g. the following tests are identical; the only change is the number of VUs. I ran all tests on k6 cloud.
Shared parameters:
60 API calls (I have a single http.batch with 60 API calls)
Iterations: 100
Executor: per-vu-iterations
Here I got 547 reqs/s:
VUs: 10 (60,000 calls with an avg response time of 108 ms)
Here I got 1,051.67 reqs/s:
VUs: 20 (120,000 calls with an avg response time of 112 ms)
Here I got 1,794.33 reqs/s:
VUs: 40 (240,000 calls with an avg response time of 134 ms)
Here I got 2,060.33 reqs/s:
VUs: 80 (480,000 calls with an avg response time of 238 ms)
Here I got 2,223.33 reqs/s:
VUs: 160 (960,000 calls with an avg response time of 479 ms)
Here I got a peak of 2,102.83 reqs/s:
VUs: 200 (1,081,380 calls with an avg response time of 637 ms) // I reached the max duration here, that's why it stopped
What I was expecting is that if my system can't handle that many requests, I would see a lot of timeout errors, but I haven't seen any. What I'm seeing is that all the API calls are executed and no errors are returned. Can anyone help me?
As k6 - or more specifically, your VUs - execute code synchronously, the amount of throughput you can achieve is fully dependent on how quickly the system you're interacting with responds.
Let's take this script as an example:
import http from 'k6/http';
export default function() {
    http.get("https://httpbin.org/delay/1");
}
The endpoint here is purposefully designed to take 1 second to respond. There is no other code in the exported default function. Because each VU will wait for a response (or a timeout) before proceeding past the http.get statement, the maximum amount of throughput for each VU will be a very predictable 1 HTTP request/sec.
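As a rough back-of-the-envelope illustration of that per-VU ceiling (a small Python-style sketch; the VU count is hypothetical):

# Throughput ceiling for synchronous VUs, assuming each iteration issues
# one request and blocks until the ~1 s response arrives.
response_time_s = 1.0   # the /delay/1 endpoint above
vus = 50                # hypothetical number of VUs
max_rps = vus * (1 / response_time_s)
print(max_rps)          # at best 50 requests/sec, regardless of the load generator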
Often, response times (and/or errors, like timeouts) will increase as you increase the number of VUs. You will eventually reach a point where adding VUs does not result in higher throughput. In this situation, you've basically established the maximum throughput the System-Under-Test can handle. It simply can't keep up.
The only situation where that might not be the case is when the system running k6 runs out of hardware resources (usually CPU time). This is something that you must always pay attention to.
If you are using k6 OSS, you can scale to as many VUs (concurrent threads) as your system can handle. You could also use http.batch to fire off multiple requests concurrently within each VU (the statement will still block until all responses have been received). This might be slightly less overhead than spinning up additional VUs.

Apache Bench 'Time per Request' decreases with increasing concurrency

I am testing my web server using Apache Bench and I am getting the following results
Request : ab -n 1000 -c 20 https://www.my-example.com
Time per request: 16.264 [ms] (mean, across all concurrent requests)
Request : ab -n 10000 -c 100 https://www.my-example.com
Time per request: 3.587 [ms] (mean, across all concurrent requests)
Request : ab -n 10000 -c 500 https://www.my-example.com
Time per request: 1.381 [ms] (mean, across all concurrent requests)
The 'Time per request' is decreasing with increasing concurrency. May I know why? Or is this by any chance a bug?
You should be seeing two values for Time per request: one is [ms] (mean), the other is [ms] (mean, across all concurrent requests). A concurrency of 20 means that 20 simultaneous requests were sent in a single go and that concurrency was maintained for the duration of the test.
The lower value is total_time_taken / total_number_of_requests, which essentially disregards the concurrency aspect, whereas the other value is closer to the mean response time (the actual response time) you were getting for your requests. I generally visualize it as x concurrent requests being sent in a single batch, with that value being the mean time it took for a batch of concurrent requests to complete. This value will also be closer to your percentiles, which also points to it being the actual time taken by a request.
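As a rough sanity check with the numbers from the question (a sketch assuming ab's usual formulas, not ab's exact output):

# The "mean, across all concurrent requests" value is total test time / total requests;
# multiplying it by the concurrency gives roughly the plain "mean" per-request time.
concurrency = 20
across_all_ms = 16.264                       # reported for -n 1000 -c 20
approx_mean_ms = across_all_ms * concurrency
print(approx_mean_ms)                        # ~325 ms actual latency per request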

How to add a rate limiter in Ruby on Rails?

In my Ruby on Rails application I am facing certain performance issues. On certain forms, more than 2500 requests came from the same IP address at a time.
So I used https://github.com/kickstarter/rack-attack to add a rate limiter, track all the requests from an IP address, and store them in a Dynamic table. But how can I track them for a certain interval, i.e. how many requests came from the same IP address within 5 seconds?
"But how can I track them for a certain interval, i.e. how many requests came from the same IP address within 5 seconds?"
To limit the number to 10 requests every 5 seconds on a per-IP basis, you'd use:
# config/initializers/rack_attack.rb
Rack::Attack.throttle('ip limit', limit: 10, period: 5) do |request|
  request.ip
end
If a single IP makes more than 10 requests within 5 seconds, it gets a "429 Too Many Requests" response.
Note that Rack Attack uses a "fixed window" approach which allows up to twice as many requests for the given duration. For example, with the above settings you could make 10 requests at the end of one window and another 10 at the beginning of the next, all within 5 seconds (or even less).
You may use Rack::Attack.track and configure it to log the IP address only when a certain number of requests is made.
# Supports an optional limit and period; triggers the notification only when
# 10 requests are made within 5 seconds from the same IP (configurable).
Rack::Attack.track("Log request", limit: 10, period: 5.seconds) do |req|
  req.ip
end

# Track it using ActiveSupport::Notifications
ActiveSupport::Notifications.subscribe("track.rack_attack") do |name, start, finish, request_id, payload|
  req = payload[:request]
  Rails.logger.info "special_agent: #{req.path}"
end

How do I query Prometheus for the number of times a service was down?

I am trying to work with the up metric to determine the number of times the service was down for less than a minute (potentially a network hiccup) during a time range (or per hour). I am sampling at 5-second intervals.
The best I have so far is up == 0, which gives me a series with points only when the service was down, but I am not sure what to do next.
Any help with this type of query would be greatly appreciated.
Thanks.
You might try the following: calculate the average of the up metric. If the service goes down, the average (over a sliding window of 1 minute) will decrease over time.
If the job comes up again and the average is greater than 0, then the service wasn't down for more than one minute.
The following query (it works via the Prometheus web console) delivers one data point for each time the service comes back up before having been down for more than one minute.
avg_over_time(up{job="jobname"} [1m]) > 0
AND
irate(up{job="jobname"} [1m]) > 0
