Strange chart of memory usage in Google Cloud Dataflow - google-cloud-dataflow

I executed a job in Google Cloud Dataflow and I'm now looking at the results in Stackdriver. I don't understand the memory chart. I used only 1 worker, and later 3 workers, yet the scale of this chart is on the order of TB·s. Is this normal? Or is the scale actually GB? Also, in the job metrics, at one instant I checked, the actual memory value was 45 GB, which doesn't appear on this chart and is much smaller. Can someone explain this chart to me?

Total memory usage time is one of the Dataflow metrics used to measure consumption of computing capacity (system memory in this case). It is defined as
the total GB-seconds of memory allocated to this Dataflow job.
Customers are billed for the consumed resources according to the established Pricing.
Memory consumption is measured in GB-seconds: 1 GB·s is 1 second of wall-clock time with 1 GB of memory provisioned. Compute time is measured in 100 ms increments, rounded up to the nearest increment.
Since memory usage on the chart is a time-aggregated value, GB·s values can be converted into GB·h by dividing by 3600; equivalently:
1 GB·h = 3600 GB·s = 3.6 TB·s
The curve shape and Y-coordinate depend on the aggregation and alignment settings you use: max or mean, a 1m or 1h alignment period, etc. For instance, in the case of a short peak load, a wide time window acts as a large denominator for the mean aligner and flattens the peak.

Memory usage (measured in GB or TB) and memory usage time (typically measured in GB·h or TB·s) are different measurements.
The Dataflow UI gives the following explanation for memory time: "The total running time for all memory used by all workers associated with your job. For example, if your job used 3 GB of memory for 4 hours, the total memory time is 12 GB-hours."
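To make the unit relationships concrete, here is a minimal Python sketch of the arithmetic described above; the job values are the hypothetical 3 GB / 4 hour example from the quote, not real Dataflow output:

```python
# Hypothetical job: 3 GB of memory held for 4 hours (the example quoted above).
memory_gb = 3
hours = 4

gb_hours = memory_gb * hours      # 12 GB·h
gb_seconds = gb_hours * 3600      # 43,200 GB·s
tb_seconds = gb_seconds / 1000    # 43.2 TB·s (using 1 TB = 1000 GB)

print(gb_hours, gb_seconds, tb_seconds)
```

So a chart scaled in TB·s can show seemingly huge numbers even though the instantaneous memory footprint is only a few GB.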

Related

JMeter performance metrics using plugin

I am using the JMeter plugin to show CPU- and memory-performance graphs. It shows me the metrics based on the thread group settings and REST client calls to the server using the server agent. But I am not able to understand the metrics on the X and Y axes for CPU and memory. Please explain how this works.
X axis: the X-axis value is the cumulative elapsed time of the current test run; in the graphs below, the test started 2 minutes and 5 seconds ago, so the value is 00:02:05.
Example 1: if the test started 30 minutes ago, the X-axis value is 00:30:00.
Example 2: if the test started 20 minutes and 20 seconds ago, the X-axis value is 00:20:20.
Y axis: a bit tricky, because in the graph below "localhost Memory (*100)" shows 9000, which means the actual utilization is only 90%; the value is simply multiplied by 100 (90*100 = 9000).
"localhost CPU (*1000)" means the plotted value is the actual CPU value (4) multiplied by 1000.
The reason memory is multiplied by 100 and CPU by 1000 is to keep them visible on the same scale as network utilization, which is around 30000.
In the graph above, memory utilization is ~9000 (~90*100), so the actual utilization is only ~90; check the next graph.
The same applies to CPU utilization; check the graph below.
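As a small illustration of the scaling just described, here is a Python sketch that converts the plotted Y-axis values back to real ones and formats the X-axis timestamp; the scale factors come from the chart labels quoted above, not from any plugin API:

```python
# Undo the display scaling described above.
# The factors (memory *100, CPU *1000) are read off the chart labels.
SCALE = {"memory": 100, "cpu": 1000, "network": 1}

def actual_value(plotted_value, metric):
    """Convert a scaled Y-axis value back to its real value."""
    return plotted_value / SCALE[metric]

def x_axis_label(elapsed_seconds):
    """Format cumulative test duration as HH:MM:SS, as the X axis does."""
    h, rem = divmod(elapsed_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(actual_value(9000, "memory"))  # 90.0  (% memory used)
print(actual_value(4000, "cpu"))     # 4.0
print(x_axis_label(125))             # 00:02:05
```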

Throughput vs. Latency Confusion

According to this article about throughput and latency:
"When you go to buy a water pipe, there are two completely independent parameters that you look at: the diameter of the pipe and its length."
But I think these two parameters are related. Throughput is measured per unit of time, so a long latency will affect throughput: if the droplets are fast, more of them will pass through the pipe in one second.
Can anyone help me understand this?
EDIT:
The confusion originated from counting queuing time as part of latency, which we should not. Once a request is being handled, its latency is independent of throughput.
Let me give you another analogy... Think of a car travelling on a single-lane road from location A to location B. The time taken by that car to travel from A to B is your latency, and the number of cars travelling in a given interval, while maintaining that latency, is your throughput.
The factors that affect this are your medium of travel, i.e. the road, and the number of lanes on the road.
You're thinking about frequency. Say you have a window into the water pipe at some given point, and you send water droplets at some constant interval (say 1 droplet every second). You count how often you see a single droplet pass by and take the inverse (1/seconds). So if you count 1 second of elapsed time between droplets being observed, you have a frequency of 1 Hz.
Now say that you keep this frequency constant (1Hz), but you elongate the pipe. You send one droplet down and count how much time elapses before it reaches the end of the pipe. So say it takes 2 seconds for a single drop to travel from the beginning to the end of the pipe, then you have a latency of 2 seconds.
Now say that you widen the diameter of the pipe, and now you are able to send 2 droplets with a frequency of 1Hz. At the end of the pipe you will count 2 droplets coming out every second. So your throughput will be 2 droplets per second.
Here is my bit, in language I can understand:
When you go to buy a water pipe, there are two completely independent parameters that you look at: the diameter of the pipe and its length. The diameter determines the throughput of the pipe and the length determines the latency, i.e., the time it will take for a water droplet to travel across the pipe. The key point to note is that the length and diameter are independent; thus, so are the latency and throughput of a communication channel.
More formally, throughput is defined as the amount of water entering or leaving the pipe every second, and latency is the average time required for a droplet to travel from one end of the pipe to the other.
Let’s do some math:
For simplicity, assume that our pipe is a 4 inch × 4 inch square and its length is 12 inches. Now assume that each water droplet is a 0.1 in × 0.1 in × 0.1 in cube. Thus, in one cross-section of the pipe, I will be able to fit 1600 water droplets. Now assume that water droplets travel at a rate of 1 inch/second.
Throughput: each set of droplets will move into the pipe in 0.1 seconds. Thus, 10 sets will move in 1 second, i.e., 16,000 droplets will enter the pipe per second. Note that this is independent of the length of the pipe.
Latency: at one inch/second, it will take 12 seconds for a droplet to get from one end of the pipe to the other, regardless of the pipe's diameter. Hence the latency will be 12 seconds.
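As a minimal sketch of the arithmetic above, in Python (all values are the hypothetical pipe dimensions from the example):

```python
# Pipe example from above: lengths in inches, speed in inches/second.
pipe_side = 4.0        # side of the square cross-section
pipe_length = 12.0
droplet_side = 0.1     # edge of a cubic droplet
speed = 1.0            # droplet travel speed

droplets_per_cross_section = (pipe_side / droplet_side) ** 2  # 1600 droplets
sets_per_second = speed / droplet_side                        # 10 cross-sections enter per second
throughput = droplets_per_cross_section * sets_per_second     # 16,000 droplets/second
latency = pipe_length / speed                                 # 12 seconds end to end

print(throughput, latency)  # 16000.0 12.0
```

Widening the pipe changes only the throughput line; lengthening it changes only the latency line, which is exactly the "independent parameters" point of the analogy.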

Compute max memory bandwidth of a processor

I'm reading this CPU specification: http://ark.intel.com/products/67356/Intel-Core-i7-3612QM-Processor-6M-Cache-up-to-3_10-GHz-rPGA
It says the CPU has 2 memory channels, so I think it has 2 memory controllers inside. Then the max memory bandwidth should be 1.6 GHz * 64 bits * 2 * 2 = 51.2 GB/s if the supported DDR3 RAM is 1600 MHz. But the specification says its max memory bandwidth is 25.6 GB/s.
I multiplied by two 2s here: one for the double data rate, the other for the memory channels.
Is this a problem with the specification, or is there something I'm misunderstanding?
Double-data-rate memory specs usually already take into account that the effective frequency is doubled: "1600 MHz memory" really runs at 800 MHz, so you can leave out one factor of 2 from your calculation.
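A quick sanity check of that correction in Python, treating "DDR3-1600" as 1600 MT/s (an 800 MHz clock with two transfers per cycle, so the DDR factor is already included):

```python
# "DDR3-1600" means 1600 mega-transfers per second; the DDR factor is already included.
transfers_per_second = 1600e6
bytes_per_transfer = 64 / 8      # 64-bit wide channel
channels = 2                     # dual-channel controller

bandwidth = transfers_per_second * bytes_per_transfer * channels
print(bandwidth / 1e9, "GB/s")   # 25.6 GB/s, matching the spec sheet
```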

IOPS versus Throughput

What is the key difference between IOPS and Throughput in large data storage?
Does file size have an effect on IOPS? Why?
IOPS measures the number of read and write operations per second, while throughput measures the number of bits read or written per second.
Although they measure different things, they generally follow each other as IO operations have about the same size.
If you have large files, you simply need more IO operations to read the entire file. The file size has no effect on the IOPS as it measures the number of clusters read or written, not the number of files.
If you have small files, there will be more overhead, so while the IOPS and throughput look good, you may experience a lower actual performance.
This is the analogy I came up with when talking about Throughput and IOPS.
Think of it as:
You have 4 buckets (Disk blocks) of the same size that you want to fill or empty with water.
You'll be using a jug to transfer the water into the buckets. Now your question will be:
At a given time (per second), how many jugs of water can you pour (write) or withdraw (read)? This is IOPS.
At a given time (per second) what's the amount (bit, kb, mb, etc) of water the jug can transfer into/out of the bucket continuously? This is throughput.
Additionally, there is a delay in the process of you pouring and/or withdrawing the water. This is Latency.
There are 3 things to consider when talking about IOPS and throughput:
Size (file size/block size)
Pattern (random/sequential)
Mix (read/write percentage)
Disk IOPS describes the count of input/output operations on the disk per second, regardless of block size.
Disk throughput describes how much data can be transferred per second, so the block size plays a huge role in calculating the throughput required by an app.
As a sample, let's consider 3000 IOPS and a SQL database engine. The block size, in DB engine terms, is called the page size, and for SQL Server it is equal to 8 KB. If you wish to calculate the actual throughput, given the IOPS, you end up with the formula below:
throughput = IOPS * block size = 3000 * 8 KB = 24,000 KB/s = 24 MB/s
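That relationship is simple enough to express as a one-line helper; here is a small Python sketch (the 8 KB page size is the SQL Server figure quoted above):

```python
def throughput_kb_per_s(iops, block_size_kb):
    """Throughput is simply IOPS multiplied by the block (page) size."""
    return iops * block_size_kb

# SQL Server uses an 8 KB page size, per the example above.
print(throughput_kb_per_s(3000, 8))          # 24000 KB/s
print(throughput_kb_per_s(3000, 8) / 1000)   # 24.0 MB/s
```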
IOPS - the number of read/write operations per second; mostly useful for OLTP transactions, e.g. in AWS for databases like Cassandra.
Throughput - the number of bits transferred per second, i.e. data transferred per second.
Mainly a unit for high-data-transfer applications like big data Hadoop and Kafka streaming.
IOPS - the number of input/output operations a storage system can perform per second, from start to finish.
Throughput - data transfer speed, often expressed in megabytes per second. Earlier it was measured in kilobytes, but now megabytes have become the standard.
For more about this, see: What is the difference between IOPS and throughput?

How much faster is the memory usually than the disk?

IDE, SCSI, SSD, SATA, or all of those.
I'm surprised: Figure 3 in the middle of this article, The Pathologies of Big Data, says that memory is only about 6 times faster than disk when you're doing sequential access (350 Mvalues/sec for memory compared with 58 Mvalues/sec for disk), but about 100,000 times faster when you're doing random access.
Random Access Memory (RAM) takes nanoseconds to read from or write to, while hard drive (IDE, SCSI, SATA that I'm aware of) access speed is measured in milliseconds.
2016 hardware update: actual sequential read/write throughput
The Samsung 940 PRO SSD:
reading at 3,500 MB/sec
writing at 2,100 MB/sec
RAM got faster too:
reading at 61,000 MB/sec
writing at 48,000 MB/sec
So by this metric, RAM now looks to be about 20x faster than the hardware around when #ChrisW wrote his answer, not 100,000x. And SSDs are 10 times faster than RAM was when this question was written.
An important consideration is that we're only measuring memory bandwidth here, not latency.
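A quick back-of-the-envelope check of those ratios in Python, using only the sequential-read figures quoted above (treating the article's "Mvalues/sec" figure as MB/s is an assumption made for comparison):

```python
# Sequential read throughput, MB/s, from the 2016 figures above.
ssd_read = 3_500        # fast NVMe SSD
ram_read = 61_000       # modern RAM
old_memory_read = 350   # the cited article's memory figure, assumed comparable in MB/s

print(ram_read / ssd_read)         # ~17x: RAM vs. a fast SSD, roughly the "20x" cited
print(ssd_read / old_memory_read)  # ~10x: today's SSD vs. the article's memory figure
```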
It's not precisely about SCSI drives, but I think that the Latency Numbers Every Programmer Should Know table could assist you in understanding the speed and the difference between different latency numbers, including storage options.
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns 14x L1 cache
Mutex lock/unlock 25 ns
Main memory reference 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy 3,000 ns 3 us
Send 1K bytes over 1 Gbps network 10,000 ns 10 us
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD
Read 1 MB sequentially from memory 250,000 ns 250 us
Round trip within same datacenter 500,000 ns 500 us
Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory
Disk seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip
Read 1 MB sequentially from disk 20,000,000 ns 20,000 us 20 ms 80x memory, 20X SSD
Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms
Here is a great visual representation that will help you to better understand the scale:
https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
RAM is 100 Thousand Times Faster than Disk for Database Access, from
http://www.directionsmag.com/articles/ram-is-100-thousand-times-faster-than-disk-for-database-access/123964
"Accessing RAM is in the order of nanoseconds (10^-9 seconds), while accessing data on the disk or the network is in the order of milliseconds (10^-3 seconds)."
from Node.js Design Patterns
