PCI BAR memory addresses - memory

Quick question, I was reading the OSDev Wiki page regarding PCI and it says the following -
"Base address Registers (or BARs) can be used to hold memory addresses used by the device, or offsets for port addresses. Typically, memory address BARs need to be located in physical ram while I/O space BARs can reside at any memory address (even beyond physical memory)."
I don't get why it says memory address BARs need to be located in physical RAM. The whole point of MMIO is that the device gets assigned a memory address so that accesses to it are routed to the device and not to physical RAM. What does it mean that it needs to be located in physical RAM?
Wouldn't it just be an address in the 3GB - 4GB range of the address space, regardless of how much physical RAM is installed?
Is this an error on the OSDev site or have I misunderstood?
link - About halfway down, under the heading Base Address Registers
Thanks.

The OSDev site is OK. It describes memory and I/O BARs from the PCI device's perspective, not from the host's perspective. So what OSDev is saying is that memory BARs can be (but are not necessarily) mapped to physical RAM on the PCI device, while I/O BARs are usually something else (registers, FIFOs, whatever).
Please also note that the use of I/O BARs is discouraged. It is better to use only memory BARs. Usually, you will have a small memory BAR that groups all the registers, and the other BARs will expose pieces of the RAM of your PCI device.
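For what it's worth, whether a given BAR is a memory BAR or an I/O BAR can be read off its low bits. Here is a minimal sketch in Python, assuming the standard PCI layout (bit 0 set means I/O space; for memory BARs, bits 2:1 give the type and bit 3 the prefetchable flag). The function name `decode_bar` and the sample values are made up for illustration, not taken from the OSDev page:

```python
def decode_bar(value):
    """Decode a raw 32-bit PCI BAR value into its kind and base address."""
    if value & 0x1:
        # I/O space BAR: the low 2 bits are flags, the rest is the port base.
        return {"kind": "io", "base": value & 0xFFFFFFFC}
    # Memory space BAR: the low 4 bits are flags.
    bar_type = (value >> 1) & 0x3  # 0 = 32-bit, 2 = 64-bit
    return {
        "kind": "memory",
        "is_64bit": bar_type == 0x2,
        "prefetchable": bool(value & 0x8),
        "base": value & 0xFFFFFFF0,  # low 32 bits of the base address
    }


print(decode_bar(0xFEBC0000))  # hypothetical memory BAR
print(decode_bar(0x0000C001))  # hypothetical I/O BAR
```

For a 64-bit memory BAR, the upper half of the base address lives in the next BAR slot, which this sketch does not read.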

Related

Confusion on Memory Layout vs Memory Management Schemes

I was studying some operating system concepts, got a bit confused, and now have the following question...
Does the memory layout of a program in execution (ie. text, data, stack, heap) only make sense in the context of its virtual address space? If a program is organized ("laid" out) into these logical sections in its virtual address space, don't these sections just get messed up as soon as addresses start getting converted from virtual to physical addresses using a memory management scheme like paging or segmentation?
As far as I'm aware, these two schemes allow for non-contiguous partitioning in the physical address space. So if my "text" section was from address 0 to 100 (random size I picked) in the virtual address space, and I choose to use paging, and my page sizes were 20 addresses in length each (ie there would be 5 pages for the text section), once these pages get placed in the physical address space non-contiguously (based on wherever free space is available), wouldn't the notion of a TEXT "section" kinda not make sense anymore (as it's been chunked and scattered)?
Lastly, are the variable-sized segments in segmentation that end up in the physical address space the exact same size as the logical categories (text, data, stack, heap) of the memory layout present in the virtual space? Is the only caveat here that in the physical space the segments are scattered non-contiguously (are not adjacent to one another) but still exist wholesomely within their specific category (ie all the "data" remains together/contiguous in the physical space)?
Any help and clarification is greatly appreciated, thank you!
Does the memory layout of a program in execution (ie. text, data, stack, heap) only make sense in the context of its virtual address space? If a program is organized ("laid" out) into these logical sections in its virtual address space, don't these sections just get messed up as soon as addresses start getting converted from virtual to physical addresses using a memory management scheme like paging or segmentation?
That's correct. The sections are contiguous in virtual memory, but not contiguous in physical memory. This isn't an issue since the operating system maintains page tables; the processor's MMU uses those to translate virtual to physical addresses transparently on each access, and the operating system itself can use them to figure out which (scattered) physical pages to interact with e.g. when the process ends and its memory is to be reclaimed.
As far as I'm aware, these two schemes allow for non-contiguous partitioning in the physical address space. So if my "text" section was from address 0 to 100 (random size I picked) in the virtual address space, and I choose to use paging, and my page sizes were 20 addresses in length each (ie there would be 5 pages for the text section), once these pages get placed in the physical address space non-contiguously (based on wherever free space is available), wouldn't the notion of a TEXT "section" kinda not make sense anymore (as it's been chunked and scattered)?
The idea of a section is still applicable in contexts where virtual addresses are applicable. Your user-mode program deals with virtual addresses (i.e. pointers essentially are virtual addresses), and a lot of the operating system still deals with virtual addresses as well. The translation to scattered physical addresses is done on demand by the MMU, and only a subset of kernel code needs to deal with physical addresses.
An aside: Those aren't realistic sizes due to the overhead of bookkeeping for pages; a typical page size is 4096 bytes, and there are ways of creating larger pages on some platforms to reduce this overhead further.
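To make the "contiguous in virtual memory, scattered in physical memory" point concrete, here is a toy sketch of what the translation does. It assumes a single-level table and 4096-byte pages; the `page_table` contents and the `translate` helper are invented for illustration, and a real MMU walks a multi-level structure in hardware:

```python
PAGE_SIZE = 4096  # typical page size mentioned above

# Hypothetical page table: virtual page number -> physical frame number.
# The virtual pages are contiguous, the physical frames are scattered.
page_table = {0: 7, 1: 2, 2: 9}


def translate(vaddr):
    """Translate a virtual address the way an MMU would, in miniature."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError(f"page fault at {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


for vaddr in (0x0000, 0x0FFF, 0x1000, 0x2004):
    print(f"{vaddr:#06x} -> {translate(vaddr):#06x}")
```

The program only ever sees the left-hand column, which is why the notion of a "text section" survives even though the frames on the right are out of order.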
Lastly, are the variable-sized segments in segmentation that end up in the physical address space the exact same size as the logical categories (text, data, stack, heap) of the memory layout present in the virtual space? Is the only caveat here that in the physical space the segments are scattered non-contiguously (are not adjacent to one another) but still exist wholesomely within their specific category (ie all the "data" remains together/contiguous in the physical space)?
Nope, they are scattered on a page-by-page basis, and not every virtual page will be backed by a physical page of memory. One example is demand paging, where a page only gets a physical backing lazily, when one is actually needed. Pages of .text that haven't been used yet might not be loaded from disk until a page fault actually induces the kernel to load them.
Likewise, if physical memory is scarce, unused pages might be evicted from physical memory and written to disk; when they're next accessed, a page fault will induce the kernel to load them back in from disk.
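As a rough illustration of the lazy backing described above, here is a toy model in Python. The class `DemandPagedRegion` and its `_load_from_disk` stand-in are hypothetical names; a real kernel does this in its page fault handler, not in a `read` method:

```python
class DemandPagedRegion:
    """Toy model: pages are loaded on first access, not up front."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.resident = {}      # virtual page number -> loaded contents
        self.page_faults = 0

    def _load_from_disk(self, vpn):
        # Stand-in for the kernel reading the page from the executable or swap.
        return f"contents of page {vpn}"

    def read(self, vpn):
        if not 0 <= vpn < self.num_pages:
            raise IndexError("address outside the mapping")
        if vpn not in self.resident:
            self.page_faults += 1  # first touch: page fault
            self.resident[vpn] = self._load_from_disk(vpn)
        return self.resident[vpn]


text = DemandPagedRegion(num_pages=5)
text.read(0)
text.read(0)
text.read(3)
print(text.page_faults, sorted(text.resident))
```

Pages 1, 2 and 4 are part of the mapping but never get a physical backing, because nothing touched them.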
A virtual address might also map to a physical address that doesn't represent a physical page of DRAM memory on a DIMM somewhere. It's possible to map virtual addresses to physical addresses that represent memory-mapped IO, or a page of virtual memory might be shared between two processes as a form of cooperative communication.
There are further tricks done for the sake of optimization. For example, Linux's fork syscall doesn't copy pages; rather it sets up the page tables to enable a feature called copy on write, where pages are only copied when either the parent or child writes to them, and pages which are only read are shared between the two.

Why is the memory address printed with {:p} much bigger than my RAM specs?

I want to print the memory location (address) of a variable with:
let x = 1;
println!("{:p}", &x);
This prints the hex value 0x7fff51ef6380 which in decimal is 140734568031104.
My computer has 16GB of RAM, so why this huge number? Does the x64 architecture use some large stride between memory locations instead of a simple increment of 1?
In x86, usually the first location starts at 0, then 1, 2, etc., so the highest number you can have is around 4 billion, and the address was always less than or equal to 4 billion.
Why is this not the case with x64?
What you see here is an effect of virtual memory. Memory management is hard, and it becomes even harder when the operating system and tens or hundreds of processes have to share the memory. To handle this complexity, the concept of virtual memory was introduced. I'll just briefly explain the basics here; the topic is far more complex and you should read about it somewhere else, too.
On most modern computers, each process thinks that it owns (almost) the complete memory space. Processes never deal with physical addresses, only with virtual ones. These virtual addresses are mapped to physical ones each time the process actually accesses memory. This translation of addresses is done by the so-called MMU (memory management unit). The rules for how to map the addresses are set up by the operating system.
When you boot your PC, the operating system creates an initial mapping. Every time you start a process, the operating system adds a few slices of physical memory to the process and modifies the mapping appropriately. That way, the process has memory to play with.
On x86_64, pointers are 64 bits wide, so each process thinks it owns all of those 2^64 addresses. This is not true, of course:
There isn't a single PC in the world with that much memory. (In fact, most x86-64 CPUs today implement only 48 bits of virtual address, which covers 2^48 bytes, about 281 TB or 256 TiB; the number of physical address bits varies by CPU model. Even that is enough for now, apparently.)
Even if you had that much memory, there are other processes which use part of that memory, too.
So what happens when you try to read an address which isn't mapped (which, in 64-bit land, is the vast majority of addresses)? The MMU triggers a page fault, and the CPU transfers control to the operating system to handle it.
What I mean is that in x86, usually first location starts at 0, then 1, 2, etc. so the highest number you can have is around 4 billion.
That is true, but it is also true if your x86 system has less than 4GB of RAM. Virtual memory has existed for quite some time already.
So that's a short summary of why you see such big addresses. Again, please note that I glossed over many details here.
The pointers your program works with are in virtual address space. x86-64 uses 64-bit pointers. This was one of the major goals of AMD64, along with adding more integer and XMM registers. You are correct that i386 only has 32-bit pointers which only cover 4GB of address space in each process.
0x7fff51ef6380 looks like a stack pointer, which I guess makes sense for that code.
Linux on x86-64 (for example) puts the stack near the top of the lower canonical address range. Current x86-64 hardware only implements 48-bit virtual addresses, and requiring addresses to be canonical (the upper 16 bits must be copies of bit 47) is the mechanism that prevents software from depending on that. This allows the address space to be extended in the future without breaking software.
The amount of physical RAM in your system has nothing to do with this. You'd see (approximately) the same number on an x86-64 system with 128MB of RAM, +/- stack address space layout randomization (ASLR).
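If it helps, the canonical-address rule can be checked on the address from the question. This is only a sketch, assuming 48 implemented virtual address bits as discussed above; `is_canonical` and `VA_BITS` are names chosen here, not anything from Rust or Linux:

```python
VA_BITS = 48  # virtual address width assumed here (4-level paging)


def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all equal."""
    top = addr >> (VA_BITS - 1)              # bits 63..47, 17 bits in total
    all_ones = (1 << (64 - VA_BITS + 1)) - 1
    return top == 0 or top == all_ones


addr = 0x7fff51ef6380                        # the value printed by {:p}
lower_half_top = 0x00007fffffffffff

print(is_canonical(addr))                    # lies in the lower canonical half
print(hex(lower_half_top - addr))            # distance below the top of that half
print(is_canonical(0x0000800000000000))      # first non-canonical address
```

The printed pointer sits well within 4GB of the top of the lower half, which fits the guess that it is a stack address.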

OPERATING SYSTEMS: what is the size of the virtual memory?

LINK 1: If size of the physical memory is 2^32-1, then what is the size of virtual memory?
The above link gives me an answer, but I still have some doubts.
Please answer each question in the order posted here so that I will not be confused.
1. Virtual memory is also called demand paging. Whenever a page fault occurs, the operating system swaps in the required page from virtual memory; virtual memory here means the hard disk or secondary storage. So how much space can be allocated for a process in virtual memory? Can this size (the space allocated for each process in virtual memory) exceed the size of our RAM? I mean, if our RAM is 4GB, then what is the maximum size of virtual memory you can have for a process? Can we have 4GB of virtual memory for every process, or can we have more than 4GB for every process (if it needs it)?
2. Is the virtual memory size fixed or dynamic? How much space is allocated for this memory? The above link says that 2^48 is the size of virtual memory on a 64-bit machine; why is it only 2^48, and how can one arrive at a number like that?
Thank you.
If size of the physical memory is 2^32-1, then what is the size of virtual memory?
The size of the virtual address space is independent of the size of the physical address space. There is no answer.
So how much space can be allocated for a process in virtual memory?
That depends upon hardware limits, system parameters, and process quotas.
Can this size (the space allocated for each process in virtual memory) exceed the size of our RAM?
Yes, and it frequently does.
I mean, if our RAM is 4GB, then what is the maximum size of virtual memory you can have for a process?
It can be anything. The size of RAM does not control it.
Can we have 4GB of virtual memory for every process, or can we have more than 4GB for every process?
Both.
Is the virtual memory size fixed or dynamic?
Dynamic.
How much space is allocated for this memory? The above link says that 2^48 is the size of virtual memory on a 64-bit machine; why is it only 2^48, and how can one arrive at a number like that?
It could be a hardware limit for a specific processor.
Paging is the way that virtual addresses are converted into physical addresses. This is done via page tables.
On x86 in long mode (64-bit mode), the page tables allow for 48-bit virtual address spaces (as in, 2^48 max size). This limitation is due to the design of x86's long mode page tables. Paging uses a few bits at a time from pointers to determine where to go next in the page tables. Basically, page tables are a relatively shallow radix tree (a trie keyed by slices of the address) that lets you look up the physical address corresponding to a virtual address.
To convert a virtual address to a physical address with long mode page tables (for small pages), the hardware first extracts 9 bits from the virtual address, then 9 more, then 9 more, then 9 more to find the right page, and uses the low 12 bits to find the precise byte being accessed, for 48 bits total.
(For large and huge pages, x86 skips the last 1 and 2 steps of paging respectively to find the address of the large or huge page, and the unused low 21 or 30 bits are used to find the precise byte in that page.)
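The 9 + 9 + 9 + 9 + 12 split described above can be written out directly. A small sketch for the 4 KiB page case; the key names `pml4`, `pdpt`, `pd` and `pt` are the labels commonly used for the four levels and are my addition, and the sample address is just an arbitrary user-space pointer:

```python
def split_virtual_address(vaddr):
    """Split a 48-bit virtual address into 4-level paging indices (4 KiB pages)."""
    return {
        "pml4": (vaddr >> 39) & 0x1FF,   # bits 47..39
        "pdpt": (vaddr >> 30) & 0x1FF,   # bits 38..30
        "pd": (vaddr >> 21) & 0x1FF,     # bits 29..21
        "pt": (vaddr >> 12) & 0x1FF,     # bits 20..12
        "offset": vaddr & 0xFFF,         # bits 11..0
    }


parts = split_virtual_address(0x7fff51ef6380)
print(parts)
```

For a 2 MiB page the walk would stop after `pd` and the low 21 bits would form the offset; for a 1 GiB page it would stop after `pdpt` with a 30-bit offset.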
Virtual address spaces aren't necessarily dynamic, depending on what is meant by dynamic. The address space is always 48 bits (so long as you aren't switching between modes, like from long mode to protected mode with paging enabled, i.e. 32-bit mode). Virtual address spaces are almost always sparse, as in most canonical (valid) addresses don't point to anything useful. The page tables don't have mappings for most addresses (accesses to those addresses generate page faults, which on Linux are often bounced back to userspace as the SIGSEGV you know and love).
That said, virtual memory can be dynamic in that when a page fault occurs the kernel could map in that page. To implement swap, OSes will use extra space on disk to give the illusion of more RAM by writing infrequently used pages back to disk, and lazily pulling pages back into RAM.
Fun fact: page tables have no restriction preventing the same physical page from being mapped in multiple times. You could build a monstrous page table with every virtual address pointing into exactly the same page (which is crazy, but doable). This means address spaces aren't necessarily sparse, just very likely to be. (Note that this page table would be huge. I'm sure someone has done the calculation, but my first guess would be on the order of terabytes.)
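For the curious, the back-of-the-envelope calculation is short. This sketch assumes 4 KiB pages, 8-byte entries, 512 entries per table, four levels, and that no table is shared or reused; the function name is made up:

```python
ENTRY_SIZE = 8           # bytes per page table entry on x86-64
ENTRIES_PER_TABLE = 512  # 9 index bits per level


def full_page_table_bytes(levels=4):
    """Bytes of page tables needed to map every 4 KiB page of a 48-bit space,
    assuming no table is shared between entries."""
    total = 0
    tables = 1  # one top-level table
    for _ in range(levels):
        total += tables * ENTRIES_PER_TABLE * ENTRY_SIZE
        tables *= ENTRIES_PER_TABLE  # each entry points to a table one level down
    return total


size = full_page_table_bytes()
print(size, "bytes, about", round(size / 2**30, 1), "GiB")
```

Under those assumptions it comes to roughly 513 GiB, so about half a terabyte. If every address really maps to the same page, the tables themselves could also be reused (each level pointing 512 times at one shared table below it), which would shrink the whole structure to four 4 KiB tables.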

Why can't a process access contiguous memory addresses in physical memory?

According to Microsoft documentation in the following link :
https://msdn.microsoft.com/en-us/library/windows/hardware/hh439648%28v=vs.85%29.aspx
A program can use a contiguous range of virtual addresses to access a
large memory buffer that is not contiguous in physical memory.
So my question is: why can't a process have contiguous memory in physical memory?
I also have another question about the documentation, regarding the following picture, which shows virtual memory for user and system space:
Is the system virtual address space unique across the whole of memory, while there is a separate virtual address space for each process?
Thanks.
When a process is first loaded into memory, the OS may be able to place its pages contiguously in physical memory. The pages can't always stay contiguous, though, because of swapping: other processes and data also occupy memory, so when some of the process's pages become less used they are swapped out to the hard drive, and when they are needed again they are not guaranteed to be loaded back into the same spot, because a page of another process may be sitting there by then. You should read about virtual memory to gain a good understanding of all of this.
Your question is simple! You have asked why we can have a large memory buffer in virtual memory but not in physical memory. That's because we are limited by the hardware. If every buffer had to fit in physical memory at whatever size we wanted, manufacturers would have to build something like 1024GB memories to satisfy us, yet we are using 8GB of memory and we are satisfied. Virtual memory exists to meet our needs and to make the hardware much more efficient!
Hope it helps <3

I/O-mapped I/O - are port addresses a part of the RAM

In I/O-mapped I/O (as opposed to memory-mapped I/O), a certain set of addresses is fixed for I/O devices. Are these addresses a part of the RAM, so that this much of the physical address space is unusable? Does it correspond to the 'Hardware Reserved' memory in the attached picture?
If yes, how is it decided which bits of an address are to be used for addressing I/O devices (since the I/O address space would be much smaller than the actual memory; I have read this helps to reduce the number of pins/bits used by the decoding circuit)?
What would happen if one tries to access, in assembly, any address that belongs to this address space ?
I/O-mapped I/O doesn't use the same address space as memory-mapped I/O. The latter does use part of the address space normally used by RAM and therefore "steals" addresses that no longer belong to RAM.
The set of address ranges that are used by different memory mapped I/O is what you see as "Hardware reserved".
As for how it is decided how to address memory-mapped devices, this is largely handled by the PnP subsystem, either in the BIOS or in the OS. Memory-mapped devices, with few exceptions, are PnP devices, which means that the base address of each of them can be changed (for PCI devices, the base address of the memory-mapped registers, if any, is held in a BAR, or Base Address Register, which is part of the PCI configuration space).
Saving pins for decoding devices (lazy decoding) is (was) done on early 8-bit systems, to save decoders and reduce costs. It has nothing to do with memory-mapped versus I/O-mapped devices; lazy decoding may be used in both situations. For example, a designer could decide that the 16-bit address range C000-FFFF is going to be reserved for memory-mapped devices. To decide whether to enable some memory chip or some device, it's enough to look at the value of A15 and A14. If both address lines are high, then the block addressed is C000-FFFF, and that means that memory chip enables will be deasserted. On the other hand, a designer could decide that the 8-bit I/O port 254 is going to be assigned to a device, and to decode this address, it only looks at the state of A0, needing no decoders to find out the port address (this is, for example, what the ZX Spectrum does for addressing the ULA).
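The two lazy-decoding examples above boil down to looking at one or two address lines. A small sketch of that logic in Python (the function names are invented, and real hardware does this with a gate or two rather than code):

```python
def selects_device_block(addr):
    """Memory-mapped example: C000-FFFF is selected when A15 and A14 are both high."""
    return ((addr >> 14) & 0b11) == 0b11


def selects_port_254(port):
    """I/O example: the device responds whenever A0 is low, ignoring all other lines."""
    return (port & 1) == 0


print(selects_device_block(0xC000), selects_device_block(0xBFFF))
print(selects_port_254(254), selects_port_254(0x7FFE), selects_port_254(255))
```

The price of the second scheme is aliasing: every even port number selects the same device, not just 254.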
If a program (written in whatever language that allows you to access and write to arbitrary memory locations) tries to access a memory address reserved for a device, and assuming that the paging and protection mechanism allows such access, what happens will depend solely on what the device does when that address is accessed. A well-known memory-mapped device in PCs is the frame buffer. If the graphics card is configured to display color text mode with its default base address, any 8-bit write to an even physical address between B8000 and B8F9F will cause the character whose ASCII code is the value written to show on screen, at a location that depends on the address chosen.
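To show how the address picks the screen location, here is a sketch of the arithmetic. It assumes the usual 80x25 color text mode, where each cell takes two bytes (character code at the even address, attribute at the odd one), which is consistent with the B8000 to B8F9F range above; the helper name is made up:

```python
TEXT_BASE = 0xB8000     # default base of color text mode memory
COLUMNS, ROWS = 80, 25  # assumed 80x25 mode


def char_cell_address(row, col):
    """Physical address of the character byte for a screen cell.

    Each cell is 2 bytes: character code at the even address,
    attribute byte at the following odd address.
    """
    if not (0 <= row < ROWS and 0 <= col < COLUMNS):
        raise ValueError("cell outside the 80x25 screen")
    return TEXT_BASE + 2 * (row * COLUMNS + col)


print(hex(char_cell_address(0, 0)))    # top-left cell
print(hex(char_cell_address(24, 79)))  # bottom-right cell
```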
I/O-mapped devices don't collide with memory, as they use a different address space, with different instructions to read and write values to addresses (ports). These devices cannot be addressed using machine code instructions that target memory.
Memory-mapped devices share the address space with RAM. Depending on the system configuration, memory-mapped registers can be present all the time at some addresses, thus preventing the system from using those addresses for RAM, or memory-mapped devices may "shadow" memory at times, allowing the program to change the I/O configuration to choose whether a certain memory region is decoded as in use by a device or as regular RAM (this is, for example, what the Commodore 64 does to let the user have 64KB of RAM while still allowing access to device registers, by temporarily disabling access to the RAM that is "behind" the device currently being accessed at that very same address).
At the hardware level, what is happening is that there are two different signals: MREQ and IOREQ. The first one is asserted on every memory instruction, the second one on every I/O instruction. So this code...
MOV BX,1234h
MOV AL,[BX] ;reads memory address 1234h (memory address space)
MOV DX,1234h
IN AL,DX ;reads I/O port 1234h (I/O address space)
Both put the value 1234h on the CPU address bus, and both assert the RD pin to indicate a read, but the first one will assert MREQ to indicate that the address belongs to the memory address space, and the second one will assert IOREQ to indicate that it belongs to the I/O address space. The I/O device at port 1234h is connected to the system bus so that it is enabled only if the address is 1234h, RD is asserted and IOREQ is asserted. This way, it cannot collide with a RAM chip addressed at 1234h, because the latter will be enabled only if MREQ is asserted (the CPU ensures that IOREQ and MREQ cannot be asserted at the same time).
These two address spaces don't exist in all CPUs. In fact, the majority of them don't have this, and therefore they have to memory-map all their devices.
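The same idea can be mimicked in a few lines of Python: one address, two request lines, two separate sets of contents. This is purely a toy model; the class `ToyBus`, the stored values and the 0xFF default are all invented for illustration:

```python
class ToyBus:
    """Two address spaces sharing one address bus, told apart by MREQ/IOREQ."""

    def __init__(self):
        self.memory = {0x1234: 0xAA}  # hypothetical RAM contents
        self.ports = {0x1234: 0x55}   # hypothetical device register

    def read(self, address, mreq=False, ioreq=False):
        # The CPU never asserts both request lines in the same cycle.
        if mreq == ioreq:
            raise ValueError("exactly one of MREQ/IOREQ must be asserted")
        space = self.memory if mreq else self.ports
        return space.get(address, 0xFF)  # 0xFF stands in for an undriven bus


bus = ToyBus()
print(hex(bus.read(0x1234, mreq=True)))   # memory read, like the load above
print(hex(bus.read(0x1234, ioreq=True)))  # port read, like IN AL,DX
```

Same number on the address bus, different answer, because a different chip is enabled in each case.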
