32-bit PC, size of pointer - memory

For 4 GB of RAM there are 4 * 1024 * 1024 * 1024 * 8 = 2^(32+3) bits. My question is: how can a 32-bit PC access 4 GB of memory? The only explanation I can think of is that "a byte is the storage unit; one cannot address an individual bit". Is this correct?
Another question: on such a PC, does a pointer always have a size of 32 bits? That seems reasonable to me, because there are 2^32 storage units in which to store data. But in this answer and the next one, together with their comments, this is said to be wrong. If it is wrong, why?

Individual bits are accessed by reading the byte containing them, modifying that byte, and writing it back if necessary.
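As a minimal C sketch of that read-modify-write pattern (the helper names here are mine, not from the answer): the byte holding the bit is loaded, one bit is changed with a mask, and the whole byte is stored back.

    #include <stdint.h>
    #include <stdio.h>

    /* The byte at the target address is read, one bit is changed with a mask,
     * and the whole byte is written back; there is no narrower memory access. */
    static void set_bit(uint8_t *byte, unsigned n)       { *byte |=  (uint8_t)(1u << n); }
    static void clear_bit(uint8_t *byte, unsigned n)     { *byte &= (uint8_t)~(1u << n); }
    static int  get_bit(const uint8_t *byte, unsigned n) { return (*byte >> n) & 1; }

    int main(void) {
        uint8_t b = 0x00;
        set_bit(&b, 3);
        printf("after set_bit(3):   0x%02X, bit 3 = %d\n", b, get_bit(&b, 3));
        clear_bit(&b, 3);
        printf("after clear_bit(3): 0x%02X\n", b);
        return 0;
    }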
In some architectures the smallest addressable unit is a double word, in which case no single byte can be accessed "as is". Theoretically one could design an architecture that addresses 16 GB of memory with 32 bits of unique addresses, simply by making the addressable unit a 4-byte word (2^32 units * 4 bytes = 16 GB). Something similar happened years ago with hard drives, when the addressable units were limited to a bare 2^28 units of 512-byte sectors or so.
It's not completely wrong to say that PCs have 32-bit pointers; that is just slightly dated information, as the newer models are internally 64-bit systems and, depending on the OS, can access up to 2^48 bytes of memory. Currently most existing PCs are 32-bit and nothing can be done about it.
Well, StuartLC reminded me about paging. Even on the current 32-bit systems one can get 48 bits of addressing using the old segment registers. (I can't remember whether there was a restriction that the low three bits of a segment register had to be zero...) In any case that would allow 2^45 bytes of individually addressable memory, of which only a small fraction could ever be in main memory simultaneously. If an OS supporting that addressing mode were developed, then probably a full 64 bits would be allocated for the pointer, just as is done today with 64-bit processors.

My question is: how can a 32-bit PC access 4 GB of memory?
You may be confusing the address bus (addressable memory) with the size of the processor's registers. This Super User post details the differences.
Paging is a technique commonly used to allow more memory to be addressed than a flat 32-bit address can reach, e.g. see PAE.
Does a pointer always have a size of 32 bits?
No, not necessarily: on 16-bit DOS and Windows, for example, pointers could also be relative to a segment.
Can one not store data in a bit?
Yes, you can: in C, for example, bit packing in structs can be done, albeit at a cost in performance and portability.
Today performance matters more, and compilers will typically try to align data to the machine word size for performance reasons.
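For instance, here is a minimal C sketch of bit packing with struct bit-fields (the field names are invented for illustration). The compiler generates the masking and shifting for you, which is the performance cost mentioned above, and the exact layout is implementation-defined, which is the portability cost.

    #include <stdio.h>

    /* Three small fields packed into one storage unit instead of three
     * separate ints. The exact layout is implementation-defined. */
    struct flags {
        unsigned int ready   : 1;  /* 1 bit                 */
        unsigned int channel : 3;  /* 3 bits, values 0..7   */
        unsigned int retries : 4;  /* 4 bits, values 0..15  */
    };

    int main(void) {
        struct flags f = { .ready = 1, .channel = 5, .retries = 12 };
        printf("sizeof(struct flags) = %zu bytes\n", sizeof f);
        printf("ready=%u channel=%u retries=%u\n", f.ready, f.channel, f.retries);
        return 0;
    }

On a typical compiler sizeof(struct flags) comes out as 4 (one unsigned int), versus 12 bytes for three separate unsigned ints.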

Related

How do RAM and ROM size depend on the CPU? [closed]

I am very interested in how a CPU works. Say, for an 8-bit microcontroller (8051), how do the RAM and ROM depend on the CPU? On this topic I have some questions that are confusing me:
1 = How are the RAM and ROM sizes defined (in the 8051 microcontroller)?
2 = What does "8-bit controller" mean?
3 = Does the ROM size depend on the CPU size? If not, how much ROM can I interface with an 8-bit controller?
I have searched a lot regarding these questions but have not found any solutions, so please help me.
Also, if there are any documents or books on microcontrollers, please suggest them.
Thanks,
Not much different from the other answer here...
In all of this there are no definitive definitions; the terms are often slang, engineer-speak, or marketing-speak. "8-bit" is a little firmer, with exceptions: it implies that the processor's operations, or the bulk of them at their maximum size, are 8 bits wide, so an 8-bit-wide ALU if you will. Some folks let the register size define the bit size, some the instruction size, some the number of address bits on the CPU core, and so on. So is an x86 an 8-bit, 16-bit, 32-bit, 64-bit, 128-, 256-, 512- or 1024-bit machine by the notions above? It could be any of them, depending on who you ask...
The 8051 is considered 8-bit based on its time frame and the fact that most things in it are 8 bits in size.
The 8051 has been cloned very heavily, and as mentioned, banking is sometimes used to expand the memory space, so how much it can access in total depends on the specific CPU/part/core you are using. ROM/RAM sizes are likewise specific to the part you are using: you start with the datasheet from the part vendor and then, as needed, other documentation. The part/IP vendor is the definitive source for RAM/ROM information for the 8051 variant you are using at any particular time.
Microcontrollers in general, not just 8051s, tend to have more ROM/flash than RAM; it becomes obvious why once you start writing applications and see that you need more of one than the other.
As answered by Guna, the maximum addressing space is determined by the number of address bits on "the bus", but as mentioned above that can and will vary by implementation: there are parts that can address a megabyte and parts that can only address some number of kilobytes.
Some CPU architectures are more tightly controlled than others, either through documentation and versioning or through ownership and control of the IP (no clones that survive lawsuits, for example). Some therefore have a fixed address-space size with, currently, no exceptions. But then there are those like the 8051, which has been cloned so heavily that both its original clocking scheme and its address-space options vary from implementation to implementation. (8051s are still widely in use; there is a good chance your computer has at least one, and the servers along the internet and websites like this one definitely do.) So this is not a case where the CPU name/type/brand determines the maximum amount of RAM/ROM, and it will almost never determine the exact amount of each you have in a specific implementation, a specific chip or board.
It is very easy to find 8051 information: countless websites, more than there is space to link here. Start with the chip vendors still actively producing 8051 parts: Silicon Labs, Microchip, Cypress, and perhaps others.
For example, it took only a few seconds to find a datasheet for a specific part that states:
512 bytes RAM
8 kB (F990/1/6/7, F980/1/6/7), 4 kB (F982/3/8/9), or 2 kB (F985) Flash; in-system programmable
The price of a part is heavily influenced by the ROM/flash size and the RAM size, so a particular family of parts will essentially share one design with different memory sizes depending on your needs. If you can keep the program smaller you can buy a part that is, say, a dollar cheaper than another in the family but has the same footprint, so you can design for the larger one and switch to the smaller one, or vice versa: hope for the smaller one and, if your program turns out too big, switch to the bigger one and accept the loss of profit.
Please find below the answers to your questions, to the best of my knowledge.
1) The 8051 microcontroller's memory is divided into program memory and data memory. Program memory (ROM) is used for permanently storing the program being executed, while data memory (RAM) is used for temporarily storing intermediate results and variables.
2) An 8-bit microcontroller processes 8 bits of data at a time. The number of bits used by an MCU (sometimes called bit depth or data width) tells you the size of its registers (8 bits per register), the number of directly addressable memory locations (only 2^8 = 256 addresses), and the largest numbers it can handle natively (again 2^8 = 256 values, i.e. the integers 0 through 255). An 8-bit microcontroller therefore has limited addressing, but some 8-bit microcontrollers use paging, where the contents of a page register determine which on-board memory bank to use.
3) Yes, the maximum ROM size that can be addressed by a CPU depends on the width of its address bus. For example, in the 8085 microprocessor the address bus is 16 bits wide, so it can address up to 2^16 = 65536 locations of 8 bits each.
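To make the bus-width arithmetic in (3) concrete, here is a small C check (a sketch; the 16- and 32-line cases are just the examples already discussed in these answers):

    #include <stdio.h>
    #include <stdint.h>

    /* Bytes addressable with N address lines, assuming byte-addressable
     * memory and no banking or paging tricks. */
    static uint64_t addressable_bytes(unsigned address_lines) {
        return (uint64_t)1 << address_lines;
    }

    int main(void) {
        printf("16 address lines (8085): %llu bytes (64 KB)\n",
               (unsigned long long)addressable_bytes(16));
        printf("32 address lines:        %llu bytes (4 GB)\n",
               (unsigned long long)addressable_bytes(32));
        return 0;
    }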

Does Word length == number of bits transferred between memory and CPU per access?

I am really confused about the concept of the "word length".
I know that in a 32-bit machine a memory address has 32 bits, and each memory access transfers 32 bits (4 bytes) to the CPU.
In a 64-bit machine an address has 64 bits. But does that mean the memory access unit is also 64 bits?
In this answer, the author says "Word: The natural size with which a processor is handling data (the register size)". But it does not explicitly specify how many bits are transferred between memory and CPU per memory access.
In a CPU with a cache, data usually transfers between the CPU and memory only a whole cache line at a time, e.g. on a modern x86, a 1-byte load that hits in cache does not produce any external memory access.
If it misses even in the last-level cache, the memory chips see a request for the 64-byte aligned block containing that byte.
Modern x86 CPUs have 16-byte or even 32-byte (256-bit) data paths between cache and execution units.
See also other links in the x86 tag wiki to learn more.
It does not say that because it does not mean that. Addresses are often 64-bits on a 64-bit machine, but not always. Data paths are often 64-bits on a 64-bit machine, but not always.
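A quick way to see on your own machine that the pointer/register width and the memory-transfer unit are two separate numbers (a sketch; _SC_LEVEL1_DCACHE_LINESIZE is a Linux/glibc extension and may report 0 or -1 on other systems):

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Pointer width of the build target (8 bytes on a 64-bit build)... */
        printf("sizeof(void *)      = %zu bytes\n", sizeof(void *));

        /* ...versus the L1 data cache line size, the unit actually moved
         * between cache and memory (glibc-specific sysconf query). */
        long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
        printf("L1 dcache line size = %ld bytes\n", line);
        return 0;
    }

On a typical modern x86-64 Linux box this prints 8 and 64: the "word" is 8 bytes, but memory is filled 64 bytes at a time.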

Determine how many memory slots are available in a specific computer?

I'm studying assembly language on my own with a textbook, and I have a question about the memory of a computer. The book says the possible memory in a 32-bit PC is 4,294,967,296 bytes, which is 4 GB, because the last memory location is FFFFFFFF base 16 (eight F's). It also says that 2^10 is 1 KB, 2^30 is 1 GB, etc. It then discusses 64-bit machines, saying that 64-bit mode can internally store 64-bit addresses and that "at the time this book was written, processors use at most 48 bits of the possible 64". It goes on to say that this limitation hardly matters, because 2^48 bytes of physical memory (256 TB) is 65,536 times the maximum of 32-bit systems. Finally it talks about RAM and how it basically provides an extension of memory. Okay, so I just wanted to tell you what my book has been telling me, because it poses this question:
Suppose that you buy a 64-bit PC with 2 GB of RAM. What is the 16-hex-digit address of the "last" byte of installed memory?
I tried to tackle it as follows: we know from the book that 2^30 = 1 GB, so I set 2^x = 2 GB. Knowing that one physical address corresponds to one byte, I converted 2 GB to bytes and took the log base 2 of that number to solve for x. I got 2^31 in the end, but that was a lot of work. I then converted it to hex, giving me 80000000 base 16, and there I was stumped. I looked at the answer in the back of the book and it says this:
2 * 3^20 = 2^31 = 80000000 base 16, so the last address is 000000007FFFFFFF.
How did the book get 3^20? That doesn't even equal 2^31 when you multiply it out by 2. How do you solve this problem?
In addition, how does RAM correspond to memory? Is it an extension of the physical memory? The book doesn't actually say; it just says RAM is wiped every time the computer shuts off, and so on. Could you give me more insight into this?
Thanks,
-Dan
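For what it's worth, the book's "2 * 3^20" looks like a misprint for 2 * 2^30 (which is 2^31, exactly as the poster computed). A few lines of C confirm the arithmetic:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint64_t bytes = 2ULL * (1ULL << 30);  /* 2 GB = 2 * 2^30 = 2^31 bytes */
        uint64_t last  = bytes - 1;            /* addresses run 0 .. bytes-1   */
        printf("2 GB                = %llu bytes = 0x%llX\n",
               (unsigned long long)bytes, (unsigned long long)bytes);
        printf("last installed byte = 0x%016llX\n", (unsigned long long)last);
        /* prints 2147483648 bytes = 0x80000000 and 0x000000007FFFFFFF */
        return 0;
    }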

Is there merit to having less-than-8-byte pointers on 64-bit systems?

We know that on 64-bit computers pointers are 8 bytes, which allows us to address a huge amount of memory. On the other hand, the memory available to ordinary users today tops out at around 16 GB, which means that at the moment we do not need 8 bytes for addressing, but 5 or at most 6 bytes.
I am a Delphi user.
The question (probably for the developers of the 64-bit compiler) is: would it be possible to declare somewhere how many bytes you would like to use for pointers, with that setting valid for the whole application?
If you had an application with millions of pointers and you could declare that pointers are only 5 bytes, the amount of memory occupied would be much lower.
I can imagine that this could be difficult to implement, but I am curious about it anyway.
Thanks in advance.
A million 64-bit pointers will occupy less than eight megabytes. That's nothing. A typical modern computer has 6 GB of RAM. Hence, 8 MB is only slightly more than 1 permille of the total amount of RAM.
There are other uses for the excess precision of 8-byte pointers: you can, for example, encode the class of a reference (as an ordinal index) into the pointer itself, stealing 10 or 20 bits from the 64 available, leaving more than enough for currently available systems.
This can let the compiler writer do inline caching of virtual methods without the cost of an indirection when confirming that the instance is of the expected type.
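A rough C sketch of that tagging idea (the helper names are mine, and the 48-bit assumption is exactly that: an assumption that holds on common x86-64 user space today, not a guarantee). The class index lives in the otherwise-unused top bits and is masked off before the pointer is dereferenced.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Steal the top 16 bits of a 64-bit pointer for a class index,
     * assuming user-space addresses fit in the low 48 bits. */
    #define TAG_SHIFT 48
    #define PTR_MASK  ((((uint64_t)1) << TAG_SHIFT) - 1)

    static uint64_t tag_pointer(void *p, uint16_t class_index) {
        return ((uint64_t)class_index << TAG_SHIFT) | (uint64_t)(uintptr_t)p;
    }
    static void *untag_pointer(uint64_t tagged) {
        return (void *)(uintptr_t)(tagged & PTR_MASK);
    }
    static uint16_t pointer_class(uint64_t tagged) {
        return (uint16_t)(tagged >> TAG_SHIFT);
    }

    int main(void) {
        int *obj = malloc(sizeof *obj);
        *obj = 42;
        uint64_t ref = tag_pointer(obj, 7);   /* pretend class index 7 */
        printf("class=%u value=%d\n",
               pointer_class(ref), *(int *)untag_pointer(ref));
        free(untag_pointer(ref));
        return 0;
    }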
Actually, it wouldn't save memory. Memory allocations have to be aligned according to the size of what you're allocating: a 4-byte object, for example, has to be placed at an address that is a multiple of 4. So, due to the padding needed to keep everything around your 5-byte pointers aligned, they would end up consuming the same amount of memory.
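A small C experiment illustrating that point (the 5-byte "pointer" is simulated here with a byte array, and the struct layouts are invented for illustration): as soon as the record also contains anything with 8-byte alignment, the three "saved" bytes come back as padding.

    #include <stdio.h>
    #include <stdint.h>

    struct with_8byte_ptr {
        void    *ptr;          /* 8 bytes on a typical 64-bit target */
        uint64_t payload;
    };

    struct with_5byte_ptr {
        unsigned char ptr[5];  /* simulated 5-byte pointer                */
        uint64_t payload;      /* needs 8-byte alignment: 3 padding bytes */
    };

    int main(void) {
        printf("sizeof(struct with_8byte_ptr) = %zu\n", sizeof(struct with_8byte_ptr));
        printf("sizeof(struct with_5byte_ptr) = %zu\n", sizeof(struct with_5byte_ptr));
        /* on a typical 64-bit compiler both print 16 */
        return 0;
    }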
Remember that real OSes don't let you use physical addresses. User processes always work with virtual addresses (usually only the kernel can access physical addresses), and the processor transparently turns virtual addresses into physical ones. That means your program may well use pointers to virtual addresses that have no physical counterpart on a given system. This always happened on 32-bit Windows, where DLLs are mapped into the upper 2 GB of the virtual process address space (always 4 GB), even when the machine has far less than 2 GB of memory (in fact it started happening back when PCs had only a few megabytes; it doesn't matter).
Therefore using "small" pointers is nonsense (even ignoring all the other factors: memory access, register sizes, standard instruction operand sizes, etc.); it would only reduce the available virtual address space. Also, techniques like memory-mapped files need "large" pointers to access files that can be far larger than the available memory.
Another use for some of the excess pointer space would be storing certain value types without boxing. I'm not sure one would want a general-purpose mechanism for small value types, but it would certainly be reasonable to encode all 32-bit signed and unsigned integers, as well as all single-precision floats, and probably many values of type 'long' and 'unsigned long' (e.g. all those that could be precisely represented by an int, unsigned int, or float).

Memory Addressing

I was reading http://duartes.org/gustavo/blog/post/motherboard-chipsets-memory-map and, specifically, the following section:
In a motherboard the CPU’s gateway to the world is the front-side bus connecting it to the northbridge. Whenever the CPU needs to read or write memory it does so via this bus. It uses some pins to transmit the physical memory address it wants to write or read, while other pins send the value to be written or receive the value being read. An Intel Core 2 QX6600 has 33 pins to transmit the physical memory address (so there are 2^33 choices of memory locations) and 64 pins to send or receive data (so data is transmitted in a 64-bit data path, or 8-byte chunks). This allows the CPU to physically address 64 gigabytes of memory (2^33 locations * 8 bytes) although most chipsets only handle up to 8 gigs of RAM.
Now, the math above states that since there are 33 pins for addressing, the CPU can reach 2^33 * 8 bytes = 64 GB. All good, but now I get a bit confused. Say I install a 64-bit OS: will I be able to address 64 GB total, or 2^64 locations * 8 bytes (which is much more)? Also, if I ran a 32-bit OS on the same CPU, could I still address only 4 GB (2^32 bytes)?
I think the physical vs. "OS-allowed" distinction is what's confusing me.
Thanks!
You're confusing a bunch of things:
The size of a pointer limits the amount of virtual memory a user process can access. Not all of these will actually be usable by your process (it is traditional to reserve the "high" 1 or 2 GB for use by the kernel).
Not all virtual address bits are valid. The original AMD64 implementation effectively uses 48-bit sign-extended addresses (i.e. addresses in the range [0x0000800000000000, 0xFFFF7FFFFFFFFFFF] are invalid; see the small canonical-address check after this answer). This exists largely to limit page tables to 4 levels, which decreases the cost of walking the page tables; you would need 6-level page tables to cover the full 2^64 bytes, assuming 4K pages. For comparison, i386 has 2-level page tables.
Not all virtual addresses need to correspond to physical addresses at any given time. This is the whole point of virtual memory: you can address memory which doesn't "physically" exist, and the OS pages it in for you.
Not all physical addresses correspond to virtual addresses. They might simply not be mapped, for one, but it's also possible to have more physical memory than a 32-bit process can address. PAE supports up to 64 GB of physical addresses, and was common on servers before AMD64. While an individual process can't address all 64 GB, it means you can run a lot of multi-gigabyte processes without swapping all the time.
And finally: There's no point having more physical addresses than your RAM slots can handle. I have a D945GCLF2 board which supports AMD64, but only 2 GB of RAM. There's no point having extra physical address lines which can't be used anyway. (I'm handwaving over memory-mapped devices and the funky two-DIMMs-one-slot thing which I forget the name of.)
Also, note a few other things:
For memory-mapped I/O (in the hardware sense), the CPU needs to be able to address individual bytes; it can't just do 64-bit accesses. This seems to have been glossed over.
Modern processors include the memory controller on the CPU instead of using the traditional northbridge and FSB (see HyperTransport and QuickPath).
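A small C check of the 48-bit "sign-extended" (canonical) address rule described above: an address is valid only if bits 48-63 are all copies of bit 47. (A sketch; the 48 is specific to 4-level paging.)

    #include <stdint.h>
    #include <stdio.h>
    #include <stdbool.h>

    /* Canonical for 48-bit virtual addresses: bits 47..63 must be all
     * zeros (low half) or all ones (high half). */
    static bool is_canonical_48(uint64_t addr) {
        uint64_t top = addr >> 47;          /* the 17 bits 47..63 */
        return top == 0 || top == 0x1FFFF;
    }

    int main(void) {
        printf("%d\n", is_canonical_48(0x00007FFFFFFFFFFFULL)); /* 1: top of low half     */
        printf("%d\n", is_canonical_48(0xFFFF800000000000ULL)); /* 1: bottom of high half */
        printf("%d\n", is_canonical_48(0x0000800000000000ULL)); /* 0: inside the hole     */
        return 0;
    }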
Yes, the number of bits in physical and virtual addresses can be different. For example, here is what 64-bit Linux says about the cores here (cat /proc/cpuinfo):
...
processor : 3
vendor_id : AuthenticAMD
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 280
stepping : 2
cpu MHz : 2392.623
cache size : 1024 KB
...
bogomips : 4784.41
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
There are a few things to consider about the physical address wires:
Each physical address wire ("pin") references a front-side-bus word, not a byte address. If the CPU fetches 64-bit words, then the physical addresses on the bus are aligned to that 8-byte boundary, so address lines A0-A2 are not wired out because they would always be zero. The byte address range covered by the physical wires is therefore multiplied by the byte width of the front-side bus.
The virtual memory system can maintain a map of 64-bit virtual addresses to n-bit physical addresses. In practice, the OS maintains a "physical max address" value which the VM mappings do not exceed.
Some memory architectures allow memory bank paging, where off-CPU hardware increases the effective physical memory address range by reusing some physical addresses for different "banks" of memory; a sketch of the idea follows below.
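Here is a simulated sketch of that bank-paging idea in C (all the names and sizes are invented for illustration): a CPU-visible 16 KB window plus a bank latch reach a "physical" space sixteen times larger than the window itself.

    #include <stdio.h>
    #include <stdint.h>

    /* Simulated banked memory: the CPU sees only a 16 KB window, but by
     * writing a bank number to a latch it can reach 16 * 16 KB = 256 KB. */
    #define WINDOW_SIZE 0x4000u          /* 16 KB window       */
    #define NUM_BANKS   16u              /* 256 KB "physical"  */

    static uint8_t physical[NUM_BANKS * WINDOW_SIZE];  /* off-CPU memory */
    static unsigned bank_latch;                        /* bank register  */

    static void select_bank(unsigned bank) {
        bank_latch = bank % NUM_BANKS;
    }
    static uint8_t *window(uint16_t offset) {          /* CPU-visible view */
        return &physical[bank_latch * WINDOW_SIZE + (offset % WINDOW_SIZE)];
    }

    int main(void) {
        select_bank(3);
        *window(0x0010) = 0xAB;  /* CPU "writes to the window at offset 0x10" */
        printf("physical address actually written: 0x%05X\n",
               3u * WINDOW_SIZE + 0x0010u);            /* prints 0x0C010 */
        return 0;
    }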
Imagine that in a 64-bit OS some of the wires used to address memory simply don't go anywhere. The OS understands that this would be pretty confusing, so it takes the standard 64-bit address and uses virtual memory mapping to make you believe that you're living in a flat 64-bit address space.
The chipset limit is a big factor: the hardware on the motherboard has to be able to pass the addresses from the CPU to the RAM, so the 8 GB limit will apply unless you have a motherboard designed to handle more.
For reference, current 64-bit CPUs have the upper so-many bits (somewhere between 8 and 24 of them) of the address space wired together, because 64 bits is simply too much address space for now (you would need roughly 8 billion 2 GB modules to fill it). AMD's, for example, have a 48-bit limit (IIRC) on the address space in a single segment, which is more than enough, but nowhere near the theoretical maximum.
The main difference between a 64-bit and 32-bit OS is that one simply regards the primitive datatype (e.g. a word) as being wider. If the CPU can only physically address 2^33 locations, that won't change just because you're using a 64-bit OS. On the other hand, using a 32-bit OS will generally limit your addressable memory since 32-bit pointers can't represent all the possible values that your CPU could use to address memory (in your example, a 32-bit pointer is one bit short).
Long story short, your addressable memory is limited by both the pointer width (an OS restriction) and the address bus width (a physical restriction). Some architectures have clever ways of getting around the OS pointer width by using two pointers: one to select a "bank" of memory and another to address locations within that bank. These schemes have rather fallen out of vogue lately, though.
Also, modern OSes generally use a virtual memory subsystem that translates logical addresses into their corresponding physical ones. With caching, the actual physical location of the data could be in one (or several!) components along the memory hierarchy (e.g. processor cache, main memory, hard disk, etc.). I don't know how I completely forgot to mention VM, but it would definitely help your understanding to investigate it.
I believe that with a 64-bit operating system you can (theoretically) address 2^64 bytes = 16 EB (exabytes), but you will be limited by the hardware to 2^33 * 8 bytes = 64 GB. With a 32-bit OS you will not be able to utilize the full hardware capacity, since the OS is the limiting factor, being able to express only 2^32 different addresses. I might be off, but that's my current understanding.
I think you are being confused by the fact that the memory stores 8 bytes at a time, while an address (at the CPU level) refers to 1 byte (not a group of 8). So with 32 bits you can "refer to" 2^32 bytes = 4 GB. If you prefer, +8 on a pointer corresponds to +1 on the number of the "physical" line.
You can then get access to more memory using paging (I'm not sure whether it is still used in modern computers).
To make an analogy with a library: you (or the CPU) can enumerate 2^32 books, but the librarian (the chipset) deals in shelves of books. So what is book #10 for you is book #2 of shelf #2 for the librarian, but you never see the shelf numbers; it's the librarian's job to go to the right shelf and bring you the right book.
For me (another program on the same computer) book #10 could be a different one: book #2 of shelf #10002 (because my pages are offset by 10000 shelves).
We can both refer to 2^32 different books, but they are not the same books (and the library can hold many more than that).
(Things have changed a lot since I studied computing, so what I'm saying may not be 100% accurate, but I think the idea is there.)

Resources