I am using a JMeter plugin to show CPU and memory performance graphs. It shows the metrics based on the thread group settings and the REST client calls to the server, using the server agent. But I am not able to understand the X and Y axes of the CPU and memory graphs. Please explain how they work.
X axis: The X axis value is the elapsed time of the current test. In the graphs below, the test started 2 minutes and 5 seconds ago, so the value is 00:02:05.
Example 1: If the test started 30 minutes ago, the X axis value is 00:30:00.
Example 2: If the test started 20 minutes and 20 seconds ago, the X axis value is 00:20:20.
Y axis: This is a bit tricky. In the graph below, "localhost Memory(*100)" utilization is 9000, which means the actual utilization is only 90%; the value is multiplied by 100 (90*100=9000).
"Local host CPU (*1000)" means the plotted CPU value is the actual value (4) multiplied by 1000.
The reason Memory is multiplied by 100 and CPU by 1000 is that network utilization is 30000, so the smaller values are scaled up to be visible on the same Y axis.
In the graph above, Memory utilization is ~9000 (~90*100); the actual utilization is only ~90, as the graph below shows.
The same applies to CPU utilization; see below.
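The scaling described above can be undone with a small helper. This is a minimal sketch, assuming the multiplier is read from the series label (for example the `*100` in "localhost Memory(*100)"); `actual_value` is a hypothetical name and not part of JMeter or the plugin.

```python
def actual_value(plotted: float, multiplier: float) -> float:
    """Convert a value read off the scaled chart back to the real metric."""
    return plotted / multiplier


# Values taken from the explanation above.
memory_percent = actual_value(9000, 100)   # series labelled Memory(*100)
cpu_percent = actual_value(4000, 1000)     # series labelled CPU(*1000)

print(memory_percent, cpu_percent)
```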
I executed a job in Google Cloud Dataflow and now I'm looking at the results in Stackdriver. I don't understand the memory chart. I used only 1 worker and later 3 workers, but the scale of this chart is on the order of TB-seconds. Is that normal? Or maybe the scale is GB? Also, in the metrics of this job, at one particular instant I saw that the current memory value was 45 GB, but that value isn't in this chart and is much smaller than the values it shows. Can someone explain this chart to me?
The Total memory usage time is one of the Dataflow metrics used to measure consumption of computing capacity (system memory in this case). It is defined as:
The total GB seconds of memory allocated to this Dataflow job.
Customers are billed for the consumed resources according to the established pricing.
Memory consumption is measured in GB-seconds. 1 GB.s is 1 second of wall clock time with 1GB of memory provisioned. Compute time is measured in 100ms increments, rounded up to the nearest increment.
Since memory usage on the chart is a time-aggregated value, values expressed in TB.s can be converted into GB.h by multiplying by 1000 (TB to GB) and dividing by 3600 (seconds to hours):
1 GB.h = 3600 GB.s = 3.6 TB.s
The curve shape and Y-coordinate depend on the aggregation and alignment settings you use: max or mean, 1m or 1h alignment period, etc. For instance in case of a short peak load, the wide time window will act as a big denominator for the mean aligner.
Memory usage (measured in GB or TB) and memory usage time (typically measured in GB hr or TB s) are different measurements.
The Dataflow UI gives the following explanation for memory time: "The total running time for all memory used by all workers associated with your job. For example, if your job used 3GB of memory for 4 hours, the total memory time is 12 [GB] hours."
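The unit conversions above are easy to get wrong, so here is a minimal sketch of them. It assumes decimal units (1 TB = 1000 GB); the function names are made up for illustration and are not part of any Dataflow or Stackdriver API.

```python
SECONDS_PER_HOUR = 3600
GB_PER_TB = 1000  # assumption: decimal units, 1 TB = 1000 GB


def memory_time_gb_hours(memory_gb: float, hours: float) -> float:
    """Memory time is memory held multiplied by how long it was held."""
    return memory_gb * hours


def gb_hours_to_tb_seconds(gb_hours: float) -> float:
    return gb_hours * SECONDS_PER_HOUR / GB_PER_TB


def tb_seconds_to_gb_hours(tb_seconds: float) -> float:
    return tb_seconds * GB_PER_TB / SECONDS_PER_HOUR


# The example quoted from the Dataflow UI: 3 GB held for 4 hours.
job_gb_hours = memory_time_gb_hours(3, 4)
job_tb_seconds = gb_hours_to_tb_seconds(job_gb_hours)
print(job_gb_hours, job_tb_seconds)
```

This is why a job that never holds more than a few tens of GB at any instant can still show values in the TB.s range: the chart accumulates memory over time.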
According to the Prometheus documentation, in order to get the 95th percentile from a histogram metric I can use the following query:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
Source: https://prometheus.io/docs/practices/histograms/#quantiles
Since each bucket of a histogram is a counter, we can calculate the rate of each bucket, which the documentation defines as the:
per-second average rate of increase of the time series in the range vector.
See: https://prometheus.io/docs/prometheus/latest/querying/functions/#rate
So, for instance, if bucket value[t-5m] = 100 and bucket value[t] = 200, then bucket rate[t] = (200-100)/(5*60) = 0.333
And finally, the most confusing part is how can histogram_quantile function find 95th percentile for given metric knowing all the bucket rates?
Is there any code or algorithm I can take a look to better understand it?
A solid example will explain histogram_quantile well.
Assumptions:
ONLY ONE series for simplicity
10 buckets for metric http_request_duration_seconds.
10ms, 50ms, 100ms, 200ms, 300ms, 500ms, 1s, 2s, 3s, 5s
http_request_duration_seconds is a metric type of COUNTER
| time | value | delta | rate (quantity of items) |
|---|---|---|---|
| t-10m | 50 | N/A | N/A |
| t-5m | 100 | 50 | 50 / (5*60) |
| t | 200 | 100 | 100 / (5*60) |
| ... | ... | ... | ... |
We have at least two scrapes of the series covering 5 minutes, so rate() can calculate the quantity for each bucket.
rate_xxx(t) = (value_xxx[t]-value_xxx[t-5m]) / (5*60) is the quantity of items for [t-5m, t].
We are looking at 2 samples (value(t) and value(t-5m)) here.
Suppose 10000 HTTP request durations (items) were recorded, that is,
10000 = rate_10ms(t) + rate_50ms(t) + rate_100ms(t) + ... + rate_5s(t).
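| bucket (le) | range | rate_xxx(t) |
|---|---|---|
| 10ms | ~10ms | 3000 |
| 50ms | 10~50ms | 3000 |
| 100ms | 50~100ms | 1500 |
| 200ms | 100~200ms | 1000 |
| 300ms | 200~300ms | 800 |
| 500ms | 300~500ms | 400 |
| 1s | 500ms~1s | 200 |
| 2s | 1~2s | 40 |
| 3s | 2s~3s | 30 |
| 5s | 3~5s | 5 |
| +Inf | 5s~ | 5 |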
Buckets are the essence of a histogram. We just need the per-bucket numbers in rate_xxx(t) to do the quantile calculation.
Let's take a close look at this expression (aggregation like sum() is omitted for simplicity):
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
We are actually looking for the item at the 95th percentile in rate_xxx(t), counting from bucket=10ms to bucket=+Inf. The 95th percentile means the 9500th item here, since we have 10000 items in total (10000 * 0.95).
From the table above, there are 9300 = 3000+3000+1500+1000+800 items in the buckets before bucket=500ms.
So the 9500th item is the 200th item (9500-9300) in bucket=500ms (range=300~500ms), which holds 400 items.
Prometheus assumes that the items in a bucket are spread evenly, in a linear pattern.
So the metric value for the 200th item in bucket=500ms is 400ms = 300+(500-300)*(200/400).
That is, the 95th percentile is 400ms.
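The hand calculation above can be sketched in a few lines of Python. This is a simplified illustration of linear interpolation inside a bucket, not the actual Prometheus implementation: it takes per-range counts (like the table above) rather than the cumulative `le` buckets Prometheus stores, assumes the first bucket starts at 0, and the name `histogram_quantile` is reused here only for readability. The example input collapses the walkthrough's totals into three buckets (9300 items below 300 ms, 400 items in 300~500 ms, and the remaining 300 of the 10000 above 500 ms).

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound_seconds, count) pairs.

    `buckets` must be sorted by upper bound; each count is the number of
    items that fell inside that bucket's range (not a cumulative count).
    """
    total = sum(count for _, count in buckets)
    if total == 0:
        return math.nan
    rank = q * total      # e.g. 0.95 * 10000 = the 9500th item
    seen = 0.0            # items in all earlier buckets
    lower = 0.0           # lower bound of the current bucket
    for upper, count in buckets:
        if count > 0 and seen + count >= rank:
            if math.isinf(upper):
                # No upper bound to interpolate towards in the +Inf bucket.
                return lower
            # Assume items are spread evenly between lower and upper.
            return lower + (upper - lower) * (rank - seen) / count
        seen += count
        lower = upper
    return lower


collapsed = [(0.3, 9300), (0.5, 400), (math.inf, 300)]
p95 = histogram_quantile(0.95, collapsed)
print(round(p95, 3))  # seconds
```

Because only the position inside the bucket matters, multiplying every count by the same factor (for example, dividing by the window length as rate() does) leaves the result unchanged.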
There are a few things to bear in mind:
The metric should be a COUNTER in nature for the histogram metric type
Series used for the quantile calculation must always have the label le defined
Items (data) in a specific bucket are assumed to be spread evenly, in a linear pattern (e.g.: 300~500ms)
Prometheus makes this assumption at least
Quantile calculation requires the buckets to be sorted (defined) in ascending/descending order (e.g.: 1ms < 5ms < 10ms < ...)
Result of histogram_quantile is an approximation
P.S.:
The metric value is not always accurate, because of the assumption that items (data) in a specific bucket are spread evenly in a linear pattern.
Say the max duration in reality (e.g. from the nginx access log) in bucket=500ms (range=300~500ms) is 310ms; we will still get 400ms from histogram_quantile with the setup above, which can be quite confusing.
The smaller the bucket distance is, the more accurate the approximation is.
So set up bucket distances that fit your needs.
You can refer to my reply here
Actually, the rate() function is just used to specify the time window; the denominator has no effect on the computation of the percentile value.
I believe this is the code for it in prometheus
The general idea is that you use the data in the buckets to extrapolate / approximate the quantiles
Elasticsearch also does something similar (yet different/much simpler) in their rollup capabilities
You have to use rate because counters can be reset; rate automatically accounts for resets and gives you the right per-second count. Just remember to always apply rate to counters before using them.
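To illustrate why resets matter, here is a rough sketch of a reset-aware increase over a list of counter samples. It assumes that any drop in the value means the counter restarted from zero; the function names are hypothetical, and the real rate() in Prometheus additionally extrapolates to the window boundaries, which is left out here.

```python
def increase_with_resets(samples):
    """Total increase across consecutive samples of a counter.

    A sample lower than its predecessor is treated as a reset to zero,
    so the new value itself is counted as the increase for that step.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


def per_second_rate(samples, window_seconds):
    """Average per-second increase over the window (no extrapolation)."""
    return increase_with_resets(samples) / window_seconds


no_reset = increase_with_resets([100, 150, 200])
with_reset = increase_with_resets([100, 150, 20, 70])  # reset after 150
print(no_reset, with_reset, per_second_rate([100, 200], 5 * 60))
```

Subtracting the first sample from the last would give a negative number for the second series, which is the mistake rate() protects against.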
I'm working with a wireless sensor network and need to evaluate its performance. I want to measure the latency and the total energy consumption, in order to find the remaining energy in each node. My problem is that I have some values for tx, rx, cpu and cpu_idle, and I don't know how to use them to calculate what I need. I found some formulas for the calculation, but I don't understand exactly how to apply them in my case.
Energy consumed in communication:
Energy consumed by CPU:
What is the meaning of 32768, and why do we use this number? Is it a standard value?
The powertrace output is printed in timer ticks.
tx - the number of ticks the radio has been in transmit mode (ENERGEST_TYPE_TRANSMIT)
rx - the number of ticks the radio has been in receive mode (ENERGEST_TYPE_LISTEN)
cpu - the number of ticks the CPU has been in active mode (ENERGEST_TYPE_CPU)
cpu_idle - the number of ticks the CPU has been in idle mode (ENERGEST_TYPE_LPM)
The elements of the pair tx and rx are exclusive, as are cpu and idle - the system can never be in both modes at the same time. However, other combinations are possible: it can be in cpu and in tx at the same time, for example. The sum of cpu and idle is the total uptime of the system.
The duration of a timer tick is platform-dependent and defined through the RTIMER_ARCH_SECOND constant. 32768 ticks per second is a typical value of this constant - that's where the number in your equation comes from. For example:
ticks_in_tx_mode = energest_type_time(ENERGEST_TYPE_TRANSMIT);
seconds_in_tx_mode = ticks_in_tx_mode / RTIMER_ARCH_SECOND;
To compute the average current consumption (in milliamperes, mA), multiply each of tx, rx, cpu, cpu_idle with the respective current consumption in that mode in mA (obtain the values from the datasheet of the node), sum them up, and divide by the total number of ticks in the measurement (cpu + cpu_idle), so that the result stays in mA:
current = (tx * current_tx_mode + rx * current_rx_mode + \
cpu * current_cpu + cpu_idle * current_idle) / (cpu + cpu_idle)
To compute the charge (in millicoulombs, mC), multiply the average current consumption with the duration of the measurement (node's uptime) in seconds:
charge = current * (cpu + cpu_idle) / RTIMER_ARCH_SECOND
To compute the power (in milliwatts, mW), multiply the average current consumption with the voltage of the system, for example, 3 volts if powered from a pair of AA batteries:
power = current * voltage
Finally, to compute the energy consumption (in millijoules, mJ), multiply the power with the duration in seconds or multiply the charge with the voltage of the system:
energy = charge * voltage
The first formula above computes the energy consumption for communications; the second one: for computation.
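The chain from ticks to energy can be sketched as follows. This is an illustration only: the tick counts and the per-mode currents are made-up placeholders (real currents come from the node's datasheet), `energy_report` is a hypothetical helper, and 32768 is used as an assumed value of RTIMER_ARCH_SECOND. The sketch weights each mode's current by its share of the total ticks (cpu + cpu_idle), so the average current comes out in mA.

```python
RTIMER_ARCH_SECOND = 32768  # ticks per second; assumed, platform-dependent


def energy_report(tx, rx, cpu, cpu_idle, currents_ma, voltage):
    """Turn powertrace tick counts into current, charge, power and energy."""
    total_ticks = cpu + cpu_idle                      # node uptime in ticks
    weighted = (tx * currents_ma["tx"] + rx * currents_ma["rx"]
                + cpu * currents_ma["cpu"] + cpu_idle * currents_ma["idle"])
    avg_current_ma = weighted / total_ticks           # mA
    duration_s = total_ticks / RTIMER_ARCH_SECOND     # seconds
    charge_mc = avg_current_ma * duration_s           # mC = mA * s
    return {
        "avg_current_ma": avg_current_ma,
        "duration_s": duration_s,
        "charge_mc": charge_mc,
        "power_mw": avg_current_ma * voltage,         # mW = mA * V
        "energy_mj": charge_mc * voltage,             # mJ = mC * V
    }


# Placeholder currents in mA, NOT datasheet values.
currents = {"tx": 20.0, "rx": 20.0, "cpu": 2.0, "idle": 0.1}
# 10 s of uptime: 1 s CPU active, 9 s idle, 0.5 s transmitting, 1 s listening.
report = energy_report(tx=16384, rx=32768, cpu=32768, cpu_idle=294912,
                       currents_ma=currents, voltage=3.0)
print(report)
```

To estimate the remaining energy of a node, subtract `energy_mj` from the battery's initial energy budget expressed in the same unit.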
This site might be helpful to break down the numbers.
32768 Hz (32.768 kHz) is a clock frequency of the MSP430F247 microcontroller. Its specifications are: active mode, 32iuA @ 3 V / 1 MHz (1x10^6 Hz); low-power mode, 1 uA @ 3 V / 32768 Hz.
According to this article about throughput and latency:
"When you go to buy a water pipe, there are two completely independent parameters that you look at: the diameter of the pipe and its length"
But I think these two parameters are related. Throughput is measured per unit time, so a long latency will affect throughput. Say, if the droplets are fast, more of them will pass through the pipe in one second.
Can anyone help me understand this?
EDIT:
The confusion originated from counting queuing time as part of latency, which we should not do. Once a request is being handled, the latency is independent of throughput.
Let me give you another analogy. Think of a car travelling on a single-lane road from location A to location B. The time taken by that car to travel from A to B is your latency, and the number of cars completing the trip per interval, while maintaining that latency, is your throughput.
The factors that matter here are your medium of travel (i.e. the road) and the number of lanes on the road.
You're thinking about frequency. Say you have a window into the water pipe at some given point, and you send water droplets at some constant interval (say 1 droplet every second). You count how often you see a single droplet pass by, and take the inverse (1/seconds). So if you count 1 second of elapsed time between droplets being observed, then you have a frequency of 1Hz.
Now say that you keep this frequency constant (1Hz), but you elongate the pipe. You send one droplet down and count how much time elapses before it reaches the end of the pipe. So say it takes 2 seconds for a single drop to travel from the beginning to the end of the pipe, then you have a latency of 2 seconds.
Now say that you widen the diameter of the pipe, and now you are able to send 2 droplets with a frequency of 1Hz. At the end of the pipe you will count 2 droplets coming out every second. So your throughput will be 2 droplets per second.
Here is my bit in a language which I can understand
When you go to buy a water pipe, there are two completely independent parameters that you look at: the diameter of the pipe and its length. The diameter determines the throughput of the pipe and the length determines the latency, i.e., the time it will take for a water droplet to travel across the pipe. The key point to note is that the length and diameter are independent; thus, so are the latency and throughput of a communication channel.
More formally, throughput is defined as the amount of water entering or leaving the pipe every second, and latency is the average time required for a droplet to travel from one end of the pipe to the other.
Let’s do some math:
For simplicity, assume that our pipe is a 4inch x 4inch square and its length is 12inches. Now assume that each water droplet is a 0.1in x 0.1in x 0.1in cube. Thus, in one cross section of the pipe, I will be able to fit 1600 water droplets. Now assume that water droplets travel at a rate of 1 inch/second.
Throughput: Each set of droplets will move into the pipe in 0.1 seconds. Thus, 10 sets will move in 1 second, i.e., 16000 droplets will enter the pipe per second. Note that this is independent of the length of the pipe.
Latency: At one inch/second, it will take 12 seconds for droplet A to get from one end of the pipe to the other regardless of pipe’s diameter. Hence the latency will be 12 seconds.
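The arithmetic above can be checked with a short sketch. It uses the same idealised model as the example (square pipe, cubic droplets, constant speed); `pipe_metrics` is a made-up helper name.

```python
def pipe_metrics(side_in, length_in, droplet_in, speed_in_per_s):
    """Return (throughput in droplets/s, latency in s) for a square pipe."""
    per_side = round(side_in / droplet_in)        # droplets along one edge
    per_cross_section = per_side ** 2             # droplets in one layer
    layers_per_second = speed_in_per_s / droplet_in
    throughput = per_cross_section * layers_per_second
    latency = length_in / speed_in_per_s
    return throughput, latency


base = pipe_metrics(side_in=4, length_in=12, droplet_in=0.1, speed_in_per_s=1)
longer = pipe_metrics(side_in=4, length_in=24, droplet_in=0.1, speed_in_per_s=1)
wider = pipe_metrics(side_in=8, length_in=12, droplet_in=0.1, speed_in_per_s=1)
print(base, longer, wider)
```

Doubling the length changes only the latency, and doubling the width changes only the throughput, which is the independence the analogy is pointing at.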
I'm reading this CPU specification: http://ark.intel.com/products/67356/Intel-Core-i7-3612QM-Processor-6M-Cache-up-to-3_10-GHz-rPGA
It says the CPU has 2 memory channels, so I think it has 2 memory controllers inside. Then the max memory bandwidth should be 1.6GHz * 64bits * 2 * 2 = 51.2 GB/s if the supported DDR3 RAM is 1600MHz. But the specification says its max memory bandwidth is 25.6 GB/s.
I multiplied by 2 twice here: once for the Double Data Rate, and once for the memory channels.
Is this a problem with the specification, or do I have some misunderstanding?
Double data rate memory specs usually already take into account that the effective frequency is doubled. "1600 MHz memory" really runs at 800 MHz, so you can leave out one factor of 2 from your calculation.
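A small sketch of the corrected calculation, assuming a 64-bit bus per channel and decimal GB; `peak_bandwidth_gb_s` is a hypothetical helper written for this example.

```python
def peak_bandwidth_gb_s(io_clock_mhz, transfers_per_clock, bus_width_bits, channels):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_second = io_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer * channels / 1e9


# "1600 MHz" DDR3 runs an 800 MHz clock with 2 transfers per clock.
correct = peak_bandwidth_gb_s(800, 2, 64, 2)
# Applying the DDR factor to the already-doubled 1600 figure counts it twice.
double_counted = peak_bandwidth_gb_s(1600, 2, 64, 2)
print(correct, double_counted)
```

The first result matches the 25.6 GB/s in the specification; the second reproduces the 51.2 GB/s from the question.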