How could I estimate my memory device access speed?

My memory is DDR2 800 MHz. Here is the output of dmidecode -t memory:
Handle 0x1100, DMI type 17, 27 bytes
Memory Device
Array Handle: 0x1000
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 1024 MB
Form Factor: DIMM
Set: None
Locator: DIMM_1
Bank Locator: Not Specified
Type: DDR2
Type Detail: Synchronous
Speed: 800 MHz (1.2 ns)
Manufacturer: 7F98000000000000
Serial Number: 2DCCDD00
Asset Tag: 050916
Part Number:
I want to estimate the read or write speed from this info.
Does 800 MHz mean it can read/write 800 bits per second?
Should we multiply by the 64-bit data width, e.g. 800*64 bits/sec, so we can read/write 800*64/8 bytes/sec?
Thanks in advance!
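For what it's worth, the 800 MHz that dmidecode reports for DDR2 is the effective transfer rate (800 MT/s; the I/O clock is 400 MHz and data moves on both clock edges), so the multiplication in the question is essentially the right idea: theoretical peak bandwidth ≈ transfers per second × bus width in bytes. Real-world throughput will be noticeably lower because of latency, refresh, and controller overhead. A rough sketch of the arithmetic in shell, plus a very crude practical check (the dd run mostly measures how fast the kernel can fill and discard a large buffer, not pure DRAM speed, so treat its number as an order-of-magnitude sanity check only):
# Theoretical peak for DDR2-800 on a 64-bit (8-byte wide) bus:
# 800 million transfers/second * 8 bytes per transfer
echo "$(( 800 * 8 )) MB/s peak"    # prints: 6400 MB/s peak (hence the PC2-6400 label)
# Very rough practical check: push 4 GiB through a 256 MiB buffer
# (much larger than the CPU caches) and look at the rate dd reports.
dd if=/dev/zero of=/dev/null bs=256M count=16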

Related

Kilobytes or Kibibytes in GNU time?

We are doing some performance measurements including some memory footprint measurements. We've been doing this with GNU time.
But I cannot tell whether it is measuring in kilobytes (1000 bytes) or kibibytes (1024 bytes).
The man page for my system says of the %M format key (which we are using to measure peak memory usage): "Maximum resident set size of the process during its lifetime, in Kbytes."
I assume K here means the SI "Kilo" prefix, and thus kilobytes.
But having looked at a few other memory measurements of various things through various tools, I trust that assumption like I'd trust a starved lion to watch my dogs during a week-long vacation.
I need to know, because for our tests 1000 vs 1024 Kbytes adds up to a difference of nearly 8 gigabytes, and I'd like to think I can cut down the potential error in our measurements by a few billion.
Using the below testing setup, I have determined that GNU time on my system measures in Kibibytes.
The program below (allocator.c) allocates data and touches it in small strides (every 128 bytes) to ensure that it all gets paged in. Note: this test only works if the entire allocation can be paged in at once; otherwise time's measurement will only reflect the largest resident portion of the memory.
allocator.c:
#include <stdio.h>
#include <stdlib.h>

#define min(a,b) ( ( (a)>(b) )? (b) : (a) )

volatile char* data;
const int step = 128;

int main( int argc, char** argv ){
    if( argc < 2 ){
        fprintf( stderr, "Usage: %s <bytes>\n", argv[0] );
        return 1;
    }
    unsigned long k = strtoul( argv[1], NULL, 10 );
    if( k > 0 ){
        printf( "Allocating %lu (%s) bytes\n", k, argv[1] );
        data = (char*) malloc( k );
        if( data == NULL ){
            fprintf( stderr, "malloc of %lu bytes failed\n", k );
            return 1;
        }
        /* Touch the buffer in small strides so that every page gets faulted in. */
        for( unsigned long i = 0; i < k; i += step ){
            data[min(i,k-1)] = (char) i;
        }
        free( (void*) data );
    } else {
        printf( "Bad size: %s => %lu\n", argv[1], k );
    }
    return 0;
}
compile with: gcc -O3 allocator.c -o allocator
Runner Bash Script:
kibibyte=1024
kilobyte=1000
mebibyte=$(expr 1024 \* ${kibibyte})
megabyte=$(expr 1000 \* ${kilobyte})
gibibyte=$(expr 1024 \* ${mebibyte})
gigabyte=$(expr 1000 \* ${megabyte})
for mult in $(seq 1 3); do
    bytes=$(expr ${gibibyte} \* ${mult} )
    echo ${mult} GiB \(${bytes} bytes\)
    echo "... in kibibytes: $(expr ${bytes} / ${kibibyte})"
    echo "... in kilobytes: $(expr ${bytes} / ${kilobyte})"
    /usr/bin/time -v ./allocator ${bytes}
    echo "===================================================="
done
For me this produces the following output:
1 GiB (1073741824 bytes)
... in kibibytes: 1048576
... in kilobytes: 1073741
Allocating 1073741824 (1073741824) bytes
Command being timed: "./a.out 1073741824"
User time (seconds): 0.12
System time (seconds): 0.52
Percent of CPU this job got: 75%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.86
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1049068
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 262309
Voluntary context switches: 7
Involuntary context switches: 2
Swaps: 0
File system inputs: 16
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
====================================================
2 GiB (2147483648 bytes)
... in kibibytes: 2097152
... in kilobytes: 2147483
Allocating 2147483648 (2147483648) bytes
Command being timed: "./a.out 2147483648"
User time (seconds): 0.21
System time (seconds): 1.09
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.31
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2097644
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 524453
Voluntary context switches: 4
Involuntary context switches: 3
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
====================================================
3 GiB (3221225472 bytes)
... in kibibytes: 3145728
... in kilobytes: 3221225
Allocating 3221225472 (3221225472) bytes
Command being timed: "./a.out 3221225472"
User time (seconds): 0.38
System time (seconds): 1.60
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.98
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3146220
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 786597
Voluntary context switches: 4
Involuntary context switches: 3
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
====================================================
In the "Maximum resident set size" entry, I see values that are closest to the kibibytes value I expect from that raw byte count. There is some difference because its possible that some memory is being paged out (in cases where it is lower, which none of them are here) and because there is more memory being consumed than what the program allocates (namely, the stack and the actual binary image itself).
Versions on my system:
> gcc --version
gcc (GCC) 6.1.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> /usr/bin/time --version
GNU time 1.7
> lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description: CentOS release 6.10 (Final)
Release: 6.10
Codename: Final

Getting memory stored in address space

"If a computer handles data in 8-bit sizes and uses a 16-bit address to store and retrieve data in memory, its address space contains 2^16 (65536) bytes or 64k bytes"
My text book has this statement that I'm confused by. Where are they getting 2^16 from? If a computer uses a 16-bit address why isn't that just a 2 byte address space? The textbook hasn't explained how memory is stored in microcomputers and has this statement in the intro chapter. Am I missing something?
If an address is 16 bits, it means that you have 16 bits when referring to a location in memory. The address space is the range of valid addresses, not the physical size of an address.
These addresses start at address 0 (binary 0000 0000 0000 0000) and go up to address 2^16 − 1 (binary 1111 1111 1111 1111). That's a total of 2^16 addresses that can be referenced. And if each address refers to 8 bits (i.e. a byte), the total amount of memory that you can refer to with those addresses is 2^16 × 8 bits, or 2^16 bytes.
As a smaller example, consider a system with 3-bit addresses, each referring to 4 bits (a nibble).
Address (decimal) |    0    1    2    3    4    5    6    7
Address (binary)  |  000  001  010  011  100  101  110  111
Memory (4 bits)   | 0000 0000 0000 0000 0000 0000 0000 0000
The 3-bit addresses can have 2^3 values, from 0 to 7, and each refers to 4 bits of memory, so this system has a total of 2^3 = 8 nibbles of memory.
In this system, the only valid addresses are 0, 1, 2, 3, 4, 5, 6, and 7, so the address space is the set {0, 1, 2, 3, 4, 5, 6, 7}.
As an important point, don't forget that the address space is not necessarily the actual amount of memory available: computers use some handy tricks to work with address spaces much larger than the memory they actually have (for example, a 64-bit system can theoretically address 2^64 bytes of memory, but your computer holds only a tiny fraction of that).
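As a quick numeric check of the powers of two above (plain shell arithmetic, just mirroring the 3-bit and 16-bit examples):
echo $(( 1 << 3 ))          # 8 addresses with 3-bit addresses
echo $(( 1 << 16 ))         # 65536 addresses with 16-bit addresses
echo $(( (1 << 16) * 8 ))   # 524288 addressable bits when each address refers to a byte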
Analogies for address spaces
Here are two analogies that might help you understand the difference between an address space, an address, and a pointer:
The web address space is the set of all URLs, basically the set of strings of the form https://[domain]/[path]. So https://example.com/page is an address, and a hyperlink on a web page is a pointer to that address.
The United States street address space is (approximately) the set of strings of this form:
[First name] [Last name]
[number] [Street name]
[Town], [STATE] [zip code]
In the same analogy, this is an address:
John Doe
10 Main St.
Faketown, NY 20164
Finally, a pointer is analogous to the writing on the front of an envelope that the postal service uses to deliver letters.

Test Plan with ApacheBench(AB) testing tool

I am trying out load testing here. My backend is in Ruby (2.2) on Rails (3).
I have read many pages about how to work with ab.
Here is what I have tried:
ab -n 100 -c 30 url
Result:
This is ApacheBench, Version 2.3 <$Revision: 1554214 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 52.74.130.35 (be patient).....done
Server Software: nginx/1.6.2
Server Hostname: 52.74.130.35
Server Port: 80
Document Path: url
Document Length: 1372 bytes
Concurrency Level: 3
Time taken for tests: 10.032 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 181600 bytes
HTML transferred: 137200 bytes
Requests per second: 9.97 [#/sec] (mean)
Time per request: 300.963 [ms] (mean)
Time per request: 100.321 [ms] (mean, across all concurrent requests)
Transfer rate: 17.68 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 9 25.0 5 227
Processing: 176 289 136.5 257 1134
Waiting: 175 275 77.9 256 600
Total: 180 298 139.2 264 1143
Percentage of the requests served within a certain time (ms)
50% 264
66% 285
75% 293
80% 312
90% 361
95% 587
98% 1043
99% 1143
This seems to be working perfectly. But my problem is that I want to test many APIs, not just one, so I need to write a script that lists all the APIs with particular probabilities (weights) and load tests them.
I know this is possible with Locust, but Locust does not support passing nested JSON as parameters.
Can somebody help with this?
Also let me know if there is any problem/ambiguity in the question itself.
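For what the question is asking, one hedged sketch of a weighted driver around ab might look like the following; the endpoint URLs and weights here are placeholders, not anything from the original setup:
#!/bin/bash
# Hypothetical endpoints and integer weights -- replace with your own APIs.
urls=( "http://example.com/api/a" "http://example.com/api/b" "http://example.com/api/c" )
weights=( 5 3 2 )    # api/a gets ~50% of runs, api/b ~30%, api/c ~20%
# Build a pool in which each URL appears as many times as its weight,
# then pick from the pool at random for every benchmarking run.
pool=()
for i in "${!urls[@]}"; do
    for (( j = 0; j < weights[i]; j++ )); do
        pool+=( "${urls[i]}" )
    done
done
for run in $(seq 1 10); do
    pick=${pool[RANDOM % ${#pool[@]}]}
    echo "Run ${run}: ${pick}"
    ab -n 100 -c 30 "${pick}"
done
Each iteration hands a single URL to ab, so the weights control how often each API gets a full benchmarking pass rather than mixing request types within one ab run.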

What are the Module specific heaps reported by sos.eeheap?

When I run the !eeheap -loader SOS command in WinDbg against a memory dump of any .NET process, it outputs two strange groups of heaps after the domains and the JIT code heap.
Here is the output:
0:000> !eeheap -loader
Loader Heap:
--------------------------------------
...........
--------------------------------------
Module Thunk heaps:
Module 000007fecb601000: Size: 0x0 (0) bytes.
Module 000007fee8bc1000: Size: 0x0 (0) bytes.
...........
Total size: Size: 0x0 (0) bytes.
--------------------------------------
Module Lookup Table heaps:
Module 000007fecb601000: Size: 0x0 (0) bytes.
Module 000007fee8bc1000: Size: 0x0 (0) bytes.
...........
Total size: Size: 0x0 (0) bytes.
--------------------------------------
Total LoaderHeap size: Size: 0xd55000 (13979648) bytes total, 0xb0000 (720896) bytes wasted.
=======================================
What are the "Module Thunk heaps" and "Module Lookup Table heaps"? And why are they always zero size?
The only thing I know is that both of these heaps contain references to every loaded module.

Is this a good way to demonstrate Nodejs (expressjs) advantage over Rails/Django/etc?

UPDATE
This was not supposed to be a benchmark, or a Node vs Ruby thing (I should have made that clearer in the question, sorry). The point was to compare and demonstrate the difference between blocking and non-blocking I/O and how easy it is to write non-blocking code. I could compare using EventMachine, for example, but Node has this built in, so it was the obvious choice.
I'm trying to demonstrate to some friends the advantage of nodejs (and its frameworks) over other technologies, in a way that is very simple to understand, mainly the non-blocking I/O part.
So I tried creating a (very small) Expressjs app and a Rails one that would make an HTTP request to Google and count the length of the resulting HTML.
As expected (on my computer), Expressjs was 10 times faster than Rails under ab (see below). My question is whether that is a "valid" way to demonstrate the main advantage nodejs provides over other technologies (or is there some kind of caching going on in Expressjs/Connect?).
Here is the code I used.
Expressjs
exports.index = function(req, res) {
  var http = require('http')
  var options = { host: 'www.google.com', port: 80, method: 'GET' }
  var html = ''
  var googleReq = http.request(options, function(googleRes) {
    googleRes.on('data', function(chunk) {
      html += chunk
    })
    googleRes.on('end', function() {
      res.render('index', { title: 'Express', html: html })
    })
  });
  googleReq.end();
};
Rails
require 'net/http'

class WelcomeController < ApplicationController
  def index
    @html = Net::HTTP.get(URI("http://www.google.com"))
    render layout: false
  end
end
This is the AB benchmark results
Expressjs
Server Software:
Server Hostname: localhost
Server Port: 3000
Document Path: /
Document Length: 244 bytes
Concurrency Level: 20
Time taken for tests: 1.718 seconds
Complete requests: 50
Failed requests: 0
Write errors: 0
Total transferred: 25992 bytes
HTML transferred: 12200 bytes
Requests per second: 29.10 [#/sec] (mean)
Time per request: 687.315 [ms] (mean)
Time per request: 34.366 [ms] (mean, across all concurrent requests)
Transfer rate: 14.77 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 0
Processing: 319 581 110.6 598 799
Waiting: 319 581 110.6 598 799
Total: 319 581 110.6 598 799
Percentage of the requests served within a certain time (ms)
50% 598
66% 608
75% 622
80% 625
90% 762
95% 778
98% 799
99% 799
100% 799 (longest request)
Rails
Server Software: WEBrick/1.3.1
Server Hostname: localhost
Server Port: 3001
Document Path: /
Document Length: 65 bytes
Concurrency Level: 20
Time taken for tests: 17.615 seconds
Complete requests: 50
Failed requests: 0
Write errors: 0
Total transferred: 21850 bytes
HTML transferred: 3250 bytes
Requests per second: 2.84 [#/sec] (mean)
Time per request: 7046.166 [ms] (mean)
Time per request: 352.308 [ms] (mean, across all concurrent requests)
Transfer rate: 1.21 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 180 387.8 0 999
Processing: 344 5161 2055.9 6380 7983
Waiting: 344 5160 2056.0 6380 7982
Total: 345 5341 2069.2 6386 7983
Percentage of the requests served within a certain time (ms)
50% 6386
66% 6399
75% 6402
80% 6408
90% 7710
95% 7766
98% 7983
99% 7983
100% 7983 (longest request)
To complement Sean's answer:
Benchmarks are useless. They show what you want to see. They don't show the real picture. If all your app does is proxy requests to google, then an evented server is a good choice indeed (node.js or EventMachine-based server). But often you want to do something more than that. And this is where Rails is better. Gems for every possible need, familiar sequential code (as opposed to callback spaghetti), rich tooling, I can go on.
When choosing one technology over another, assess all aspects, not just how fast it can proxy requests (unless, again, you're building a proxy server).
You're using WEBrick to run the test. Off the bat the results are invalid, because WEBrick can only process one request at a time. You should use something like Thin, which is built on top of EventMachine and can process multiple requests at a time. Your time per request across all concurrent requests, transfer rate, and connection times will improve dramatically with that change.
You should also keep in mind that request time is going to be different between each run because of network latency to Google. You should look at the numbers several times to get an average that you can compare.
In the end, you're probably not going to see a huge difference between Node and Rails in the benchmarks.
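If you want to try the WEBrick point from the answer above, one possible way (assuming the thin gem installs cleanly and the app boots under it; the application directory name is just a placeholder) is to start the Rails app behind Thin and rerun the same ab command:
gem install thin
cd my_rails_app            # placeholder: your Rails application directory
thin start -p 3001 -e production
# in another terminal:
ab -n 50 -c 20 http://localhost:3001/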
