Memory Addressing

Memory Addressing - memory

I was reading http://duartes.org/gustavo/blog/post/motherboard-chipsets-memory-map and in specific, the following section:
In a motherboard the CPU’s gateway to
the world is the front-side bus
connecting it to the northbridge.
Whenever the CPU needs to read or
write memory it does so via this bus.
It uses some pins to transmit the
physical memory address it wants to
write or read, while other pins send
the value to be written or receive the
value being read. An Intel Core 2
QX6600 has 33 pins to transmit the
physical memory address (so there are
2^33 choices of memory locations) and
64 pins to send or receive data (so
data is transmitted in a 64-bit data
path, or 8-byte chunks). This allows
the CPU to physically address 64
gigabytes of memory (2^33 locations *
8 bytes) although most chipsets only
handle up to 8 gigs of RAM.
Now the math above states that since there are 33 pins for addressing, 2^33 * 8 bytes = 64 GB. All good, but now I get a bit confused. Let's say I install a 64 bit OS, I'll be able to address 64 GB total or 2^64Gb * 8 = 2^64GB (which is much more)? Also, assuming I'm using the same cpu above on a 32 bit cpu, I can address only 4 GB still (2^32 bits = 4Gb * 8 = 4GB)?
I think the physical vs "OS Allowable" is getting me confused.
Thanks!

You're confusing a bunch of things:
The size of a pointer limits the amount of virtual memory a user process can access. Not all of these will actually be usable by your process (it is traditional to reserve the "high" 1 or 2 GB for use by the kernel).
Not all virtual address bits are valid. The original AMD64 implementation effectively uses 48-bit sign-extended addresses (i.e. addresses in the range [0x0000800000000000,0xFFFF7FFFFFFFFFFF] are invalid). This exists largely to limit page tables to 4 levels, which decreases the cost of a page fault; you need 6-level page tables to address the full 2^64 bits, assuming 4K pages. For comparison, i386 has 2-level page tables.
Not all virtual addresses need to correspond to physical addresses at any given time. This is the whole point of virtual memory: you can address memory which doesn't "physically" exist, and the OS pages it in for you.
Not all physical addresses correspond to virtual addresses. They might not be mapped, for one, but it's also possible to have more physical memory than you can address. PAE supports up to 64 GB of physical addresses, and was common on servers before AMD64. While an indivial process can't address 64 GB, it means you can run a lot of multi-gigabyte processes without swapping all the time.
And finally: There's no point having more physical addresses than your RAM slots can handle. I have a D945GCLF2 board which supports AMD64, but only 2 GB of RAM. There's no point having extra physical address lines which can't be used anyway. (I'm handwaving over memory-mapped devices and the funky two-DIMMs-one-slot thing which I forget the name of.)
Also, note a few other things:
For memory-mapped I/O (in the hardware sense), the CPU needs to address individual bytes. It can't just do a 64-bit access. This seems to have been glossed over.
Modern processors include the memory controller on the CPU instead using the traditional northbridge and FSB (see HyperTransport and QuickPath).

Yes, number of bits in physical and virtual addresses can be different. Say, here is what 64-bit Linux says about the cores here (cat /proc/cpuinfo):
...
processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 280
stepping : 2
cpu MHz : 2392.623
cache size : 1024 KB
...
bogomips : 4784.41
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

There are a few things to consider about the physical address wires:
Each physical address wire ("pin") references a front-side-bus-word, not a byte address. If the CPU fetches 64-bit words, then the physical address wires will be aligned to that 8-byte boundary. Therefore, address lines A0-A2 are not wired because they would always be zero. Thus, the byte address range of the physical wires is increased by the width of the front-side bus.
The virtual memory system can maintain a map of 64-bit virtual addresses to n-bit physical addresses. In practice, the OS maintains a "physical max address" value which the VM mappings do not exceed.
Some memory architectures allow memory bank paging, where off-CPU hardware increases the effective physical memory address range by re-using some physical addresses for different "banks" of memory.

Imagine that in a 64-bit OS some of the wires to address memory don't go anywhere. The OS understands that this is pretty confusing, so it takes the standard 64-bit address and uses virtual memory mapping to make you believe that you're living in a flat 64-bit space.

The chipset limit is a big factor -- the hardware on the motherboard has to be able to pass the addresses from the CPU to the RAM. So the 8GB limit will apply unless you have a motherboard designed to handle more.
For reference, current 64-bit CPUs have the upper x-number-of bits (somewhere between 8 and 24 bits) of the address space wired together, as 64 bits is simply too much address space for now (you'd need 8 billion 2GB modules to take up that much address space). AMDs, for example, have a 48-bit limit (IIRC) on address space in a single segment. Which is more than enough, but nowhere near the theoretical max.

The main difference between a 64-bit and 32-bit OS is that one simply regards the primitive datatype (e.g. a word) as being wider. If the CPU can only physically address 2^33 locations, that won't change just because you're using a 64-bit OS. On the other hand, using a 32-bit OS will generally limit your addressable memory since 32-bit pointers can't represent all the possible values that your CPU could use to address memory (in your example, a 32-bit pointer is one bit short).
Long story short, your addressable memory is limited by both the pointer width (an OS restriction) and the data address bus width (a physical restriction). Some architectures have clever ways of getting around the OS pointer width by using two pointers, one to address a "bank" of memory and another to locally address within the bank. These schemes have sort of fallen out vogue lately, though.
Also, modern OSes generally use a virtual memory subsystem that translates logical addresses into their corresponding physical ones. With caching, the actual physical location of the memory could be in one (or several!) components along a memory heirarchy (e.g. processor cache, main memory, hard disk, etc.) Don't know how I completely forgot to mention VM, but it definitely would help your understanding to investigate it.

I believe that if you have a 64 bit operating system you can (theoretically) address 2^64 * 8 bytes = 16 EB (exabytes), but you will be limited by the hardware to 2^33 * 8 bytes = 64 GB. If you have a 32 bit OS you will not be able to utilize the full hardware capacity since the OS is the limiting factor, only being able to express 2^32 different addresses. I might be off but that's my current understanding.

I think you are getting confused by the fact the memory store 8 bytes at the same time , but an address (at the CPU level) refer to 1 byte (and not a bunch of 8). So with 32 bits you can "refer" to 2^32 bytes = 4GB. If you prefer +8 on pointer correspond to +1 on the number of the "physical" line.
You can then have access to more memory using pagination (not sure if it still used in modern computer).
To do an analogy with a library, you (or the CPU) can enumerate 32^2 books, but the librarian (the chipset) deals with shelves of book. So what is for you book #10, is book #2 or of the shelf #2 but you never see the shelves number. That's the job of the librarian to go to the good shelf and bring you the good book.
For me (another program on the same computer) book #10 could be a different one : book #2 of the shelf 100002 (because my page start at shelf 10000)
We can both refer to 32^2 different book, but they are not the same (and the library can have much more than that).
(Change have changed lots since I studied computer, so what I'm saying can be not 100 % accurate, but I think the idea is there)

Related

is the address bus indicate the size of the address in ram?

I'm sorry for such simple question but this question borders me. so as exp we have the address bus 20 bits.
Does this imply that the address in ram should have 20 bit as size and if the data bus has 16 bits the value in ram are code 16 bit?

The address bus indicates the size of addressable items, including ram, bios, video ram, any I/O mapped devices, ... . The data bus indicates how many bits can be transferred at a time.
20 bits would be 1 MB of address space. However, an external chip to support extended memory could be used to allow more ram. I recall an embedded device (tape drive from the 1980's) that used an 80186, with a custom extended memory interface to support more than 1MB of ram.

On any recent PC-like (micro) computers made the last 45 years or so, the smallest addressable unit in memory have always been the 8-bit byte.
And the the bus sizes doesn't indicate word-size or addressable range. Take for example the venerable Motorola 68000 CPU, which had an external 16-bit data bus and a 24-bit address bus. But internally it was all 32 bits, and instructions could fetch 8-, 16- or 32-bit values from memory, and all of the 32-bit address-space was addressable (even if not all was used).
Later version of the 68k CPU added full external 32-bit busses (with the 68020).

Why is the memory address printed with {:p} much bigger than my RAM specs?

I want to print the memory location (address) of a variable with:
let x = 1;
println!("{:p}", &x);
This prints the hex value 0x7fff51ef6380 which in decimal is 140734568031104.
My computer has 16GB of RAM, so why this huge number? Does the x64 architecture use a big interval sequence instead of just simple 1 increment for accessing memory location?
In x86, usually the first location starts at 0, then 1, 2, etc. so the highest number you can have is around 4 billion, so the address number was always equals or less than 4 billion.
Why is this not the case with x64?

What you see here is an effect of virtual memory. Memory management is hard and it becomes even harder when the operating system and tens of hundreds of processes have to share the memory. In order to handle this huge complexity, the concept of virtual memory was used. I'll just briefly explain the basics here; the topic is far more complex and you should read about it somewhere else, too.
On most modern computers, each process thinks that it owns (almost) the complete memory space. But processes never deal with physical addresses, but with virtual ones. These virtual addresses are mapped to physical ones each time the process actually reads from memory. This translation of addresses is done by the so called MMU (memory management unit). The rules for how to map the addresses are setup by the operating system.
When you boot your PC, the operating system creates an initial mapping. Every time you start a process, the operating system adds a few slices of physical memory to the process and modifies the mapping appropriately. That way, the process has memory to play with.
On x86_64, the address space is 64 bit wide, so each process thinks it owns all of those 2^64 addresses. This is not true, of course:
There isn't a single PC on the world with that much memory. (In fact, most CPUs today can merely use 280 TB of RAM, since they internally can only use 48bit for addressing physical memory. And even these 280TB are enough for now, apparently.)
Even if you had that much memory, there are other processes which use part of that memory, too.
So what happens when you try to read an address which isn't mapped (which in 64bit land, are the vast majority of the addresses)? The MMU triggers a page fault. This makes the CPU notify the operating system to handle this.
What I mean is that in x86, usually first location starts at 0, then 1, 2, etc. so the highest number you can have is around 4 billion.
That is true, but it is also true if your x86 system has less than 4GB of RAM. Virtual memory exists for quite some time already.
So that's a short summary of why you see such big addresses. Again, please note that I glossed over many details here.

The pointers your program works with are in virtual address space. x86-64 uses 64-bit pointers. This was one of the major goals of AMD64, along with adding more integer and XMM registers. You are correct that i386 only has 32-bit pointers which only cover 4GB of address space in each process.
0x7fff51ef6380 looks like a stack pointer, which I guess makes sense for that code.
Linux on x86-64 (for example) puts the stack near the top of the lower canonical address range: current x86-64 hardware only implements 48-bit virtual addresses and this is the mechanism to prevent software from depending on it. This allows the address space to be extended in the future without breaking software.
The amount of phyiscal RAM in your system has nothing to do with this. You'd see (approximately) the same number on an x86-64 system with 128MB of RAM, +/- stack address space layout randomization (ASLR).

Does Word length == number of bits transferred between memory and CPU per access?

I am really confused about the concept of the "word length".
I know that in 32-bit machine, the memory address has 32 bits. And each memory access transfers 32 bits (4 bytes) to the CPU.
In 64-bit machine, the address has 64 bits. But does it mean the memory access unit is also 64 bits?
In this answer, the author says "Word: The natural size with which a processor is handling data (the register size)". But it does not explicitly specifies how many bits are transferred between memory and CPU per memory access.

In a CPU with a cache, data usually only transfers between CPU and memory a whole cache-line at a time. e.g. on a modern x86, a 1B load that hits in cache would not produce any external memory access.
If it missed even in the last-level cache, the memory chips would see a request for the 64B aligned block containing that byte.
Modern x86 CPUs have 16B or even 32B (256b) data paths between cache and execution units.
See also other links in the x86 tag wiki to learn more.

It does not say that because it does not mean that. Addresses are often 64-bits on a 64-bit machine, but not always. Data paths are often 64-bits on a 64-bit machine, but not always.

32-bit PC, size of pointer

For a 4G ram, there is 4 * 1024 * 1024 * 1024 * 8 = 2^(32+3) bits. My question is how could a 32-bit PC can access a 4G memory. What I can think of this is "a byte is the storage unit, one can not store a data in a bit". Is this correct?
Another question is: in such PC, does a pointer always have size 32 bit? It seems reasonable for me, because we have 2^32 storage units to store the data. But in this answer and the next with their remarks, this is said to be wrong. If it is wrong, why?

Individual bits are accessed by reading the address of the byte containing it, modifying the byte and writing back if necessary.
In some architectures the smallest addressable unit is double word, in which case no single byte can be accessed "as is". Theoretically one could design an architecture that would address 16 GB of memory with 32-bits of unique addresses. And similar things happened years ago, when the addressable units of a Hard Drive were limited to bare 2^28 units of 512 byte sectors or so.
It's not completely wrong to say that PC's have 32-bit pointers. That's just a bit old information, as the newer models are internally 64-bit systems and can access depending on the OS up to 2^48 bytes of memory. Currently most existing PCs are 32-bit and nothing can be done about it.
Well, StuartLC remainded about paging. Even in the current 32-bit systems, one can use 48-bits of addressing using old age segment registers. (Can't remember if there was a restriction of segment registers low three bits being zero...) But anyway that would allow 2^45 bytes of individual addresses, out of which just a small fraction could ever be in the main memory simultaneously. If an OS supporting that addressing mode was developed, then probably full 64 bits would be allocated for the pointer. Just like it is today with 64-bit processors.

My question is how could a 32-bit PC can access a 4G memory
You may be confusing address bus (addressable memory) and the size of the processor registers. This superuser post details the differences.
Paging is a technique commonly used to allow memory to be addressed beyond the size of the OS's capabilities, e.g. see PAE
does a pointer always have size 32 bit
No, not necessarily - e.g. on 16 bit DOS and Windows, and also pointers could be relative to a segment.
Can one can not store a data in a bit?
Yes, you can, e.g. in C, bit packing in structs can be done, albeit at the cost of performance and portability.
Today performance is more important, and compilers will typically try and align data to its machine word size, for performance reasons.

How can a extend memory space at 8086 up to 1 GB?

How can a extend memory space at 8086 up to 1 GB ???

Obviously, you're not going to get a linear address space. 1GB of space requires 30 address lines, and there are only 20 physical address lines on the 8086. You implement bank switching, where the 8086 provides 20 lower address lines. The 10 additional lines are provided via a latch that you map to a 16-bit I/O port. Writing a value to that port stores the 10-bit bank number in the latch. The latch is then used to feed the upper 10 address lines to memory.
When I did this as hardware project at university 20 years ago, the largest memory we could get hold of then was 2MB - I've no idea how you would interface a modern 1GB memory module!

You'd have to implement some kind of bank switching in hardware.

You could upgrade to a more modern processor. For example, any processor that's not from the seventies!
If that's out of the question, this probably becomes more of a hardware problem than a software problem...

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart