I am doing some bare-metal programming on x86. I am in 16-bit mode and I emulate using QEMU. I use int 0x15/ax = 0xE820 to obtain the memory map. Using the QEMU monitor (to examine the region where I wrote the map) I get the following information about the memory map:
Start=0          Length=0x9fc00   (type 1)
Start=0x9fc00    Length=0x400     (type 2)
Start=0xf0000    Length=0x10000   (type 2)
Start=0x100000   Length=0x7ee0000 (type 1)
Start=0x7fe0000  Length=0x10000   (type 2)
Start=0xfffc0000 Length=0x40000   (type 2)
So, do the regions that are not in this map (e.g. 0xA0000) belong to something like MMIO? I know that type 2 memory regions are used by the BIOS, etc. (Start=0x9fc00 Length=0x400 (type 2) is the EBDA.) What happens when I write to these regions?
Note: I don't have any type 3 (ACPI) regions, probably because I'm using QEMU. On real hardware I would most probably get type 3 regions too. Is that RAM as well?
This BIOS call is fully described in
INT 15h, AX=E820h - Query System Address Map,
where it is noted in the section "Assumptions and Limitations":
The BIOS will return address ranges describing base board memory and ISA or PCI memory that is contiguous with that baseboard memory.
The BIOS WILL NOT return a range description for the memory mapping of PCI devices, ISA Option ROM's, and ISA plug & play cards. This is
because the OS has mechanisms available to detect them.
The BIOS will return chipset defined address holes that are not being used by devices as reserved.
Address ranges defined for base board memory mapped I/O devices (for example APICs) will be returned as reserved.
All occurrences of the system BIOS will be mapped as reserved. This includes the area below 1 MB, at 16 MB (if present) and at end of the
address space (4 gig).
Standard PC address ranges will not be reported. For example, video memory at A0000 to BFFFF physical will not be described by this function. The range from E0000 to EFFFF is base board specific and will be reported as suits the base board.
All of lower memory is reported as normal memory. It is the OS's responsibility to handle standard RAM locations reserved for specific uses, for example: the interrupt vector table (0:0) and the BIOS data area (40:0).
The OSDev wiki article
Detecting Memory (x86)
adds, in its section
"BIOS Function: INT 0x15, EAX = 0xE820",
the following rules:
After getting the list, it may be desirable to: sort the list, combine adjacent ranges of the same type, change any overlapping areas
to the most restrictive type, and change any unrecognised "type"
values to type 2.
Type 3 "ACPI reclaimable" memory regions may be used like (and combined with) normal "available RAM" areas as long as you're finished
using the ACPI tables that are stored there (i.e. it can be
"reclaimed").
Types 2, 4, 5 (reserved, ACPI non-volatile, bad) mark areas that should be avoided when you are allocating physical memory.
Treat unlisted regions as Type 2 -- reserved.
Your code must be able to handle areas that don't start or end on any sort of "page boundary".
Conclusion: the ranges that are not found in the map are not free RAM; they serve some function, perhaps occupied by device memory such as video.
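To make these processing rules concrete, here is a rough C sketch (my own illustration, not part of the original answer) that walks a stored copy of the map above and treats every range that is not type 1 (and, implicitly, every unlisted range such as 0xA0000) as reserved; the struct layout assumes the usual 20/24-byte E820 entry format:

```c
#include <stdint.h>
#include <stdio.h>

/* Layout of one INT 15h/EAX=E820h entry as typically stored by a loader.
   The 'acpi' field only exists for 24-byte entries; it is ignored here. */
struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;   /* 1 = usable RAM, 2 = reserved, 3 = ACPI reclaimable, ... */
    uint32_t acpi;
} __attribute__((packed));

/* Example data mirroring the map shown in the question. */
static const struct e820_entry map[] = {
    { 0x00000000, 0x0009fc00, 1, 0 },
    { 0x0009fc00, 0x00000400, 2, 0 },
    { 0x000f0000, 0x00010000, 2, 0 },
    { 0x00100000, 0x07ee0000, 1, 0 },
    { 0x07fe0000, 0x00010000, 2, 0 },
    { 0xfffc0000, 0x00040000, 2, 0 },
};

int main(void) {
    uint64_t usable = 0;
    for (unsigned i = 0; i < sizeof map / sizeof map[0]; i++) {
        /* Only type 1 is free RAM; everything else, and anything not
           listed at all (e.g. 0xA0000), is treated as reserved. */
        int ok = (map[i].type == 1);
        printf("%#18llx - %#18llx %s\n",
               (unsigned long long)map[i].base,
               (unsigned long long)(map[i].base + map[i].length - 1),
               ok ? "usable RAM" : "reserved");
        if (ok)
            usable += map[i].length;
    }
    printf("total usable: %llu bytes\n", (unsigned long long)usable);
    return 0;
}
```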
I think I have a fairly basic MIPS question, but I am still getting my head wrapped around how addressing works in MIPS.
My question is: What is the address of the register $t0?
I am looking at the following memory allocation diagram from the MIPS "green sheet"
I had two ideas:
The register $t0 has a register number of 8 so I'm wondering if it would have an address of 0x0000 0008 and be in the reserved portion of the memory block.
Or would it fall in the Static Data Section and have an address of 0x1000 0008?
I know that MARS and different assemblers might start the addressing differently as described in this related question:
How is the la instruction translated in MIPS?
I'm trying to understand what the "base" address is for register $t0 so I have a better understanding of how offset(base) addressing works.
For example, what the address of 8($t0) would be.
Thanks for the help!
feature      | Registers | Memory
count        | very few  | vast
speed        | fast      | slow
Named        | yes       | no
Addressable  | no        | yes
There are typically 32 or fewer registers for programmers to use. On a 32-bit machine we can address 2^32 different bytes of memory.
Registers are fast, while memory is slow, potentially taking dozens of cycles depending on cache features & conditions.
On a load-store machine, registers can be used directly in most instructions (by naming them in the machine code instruction), whereas memory access requires a separate load or store instruction. Computational instructions on such a machine typically allow naming up to 3 registers (e.g. target, source1, source2). Memory operands have to be brought into registers for computation (and sometimes moved back to memory).
Registers can be named in instructions, but they do not have addresses and cannot be indexed. On MIPS no register appears as an alias at some address in memory. It is hard to put even a smallish array (e.g. an array of 10 elements) in registers because registers have no addresses and cannot be indexed. Memory has numerical addresses, so we can rely on storing arrays and objects in a predictable pattern of addresses. (Memory generally doesn't have names, just addresses; however, there are usually special memory locations for working with various I/O devices, and, as you note, memory is partitioned into sections that have starting (and ending) addresses.)
To be clear, memory-based aliases have been designed into some processors of the past. The HP/1000 (circa 1970s-80s), for example, had 2 registers (A & B), and they had aliases at memory locations 0 and 1, respectively. However, this aliasing of CPU registers to memory is generally no longer done on modern processors.
For example what the address of 8($t0) would be
8($t0) refers to memory at the address (the contents of register $t0) + 8. With proper usage, the program fragment would be using $t0 as a pointer, i.e. a variable that holds a memory address.
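As a rough C analogy (my own sketch; the array name and values are made up) of what a base+offset access like lw $t1, 8($t0) computes:

```c
#include <stdint.h>
#include <stdio.h>

/* A word-aligned buffer standing in for memory. */
int32_t memory_words[4] = {10, 20, 30, 40};

int main(void) {
    /* t0 plays the role of register $t0: it holds an address, not data. */
    int32_t *t0 = memory_words;                     /* e.g. la $t0, memory_words */

    /* 8($t0): take the address in $t0, add the byte offset 8, and load the
       32-bit word found there. 8 bytes past the start is element index 2. */
    int32_t t1 = *(int32_t *)((uint8_t *)t0 + 8);   /* lw $t1, 8($t0) */

    printf("loaded %d\n", t1);                      /* prints 30 */
    return 0;
}
```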
How to exhaustively detect or probe or scan usable/accessible physical memory?
I'm currently making a custom bootloader in NASM for a x86_64 custom operating system.
In order to assign which physical address to contain which data, I want to make sure that the memory is guaranteed free for use. I've already tried BIOS interrupt int 0x15 eax 0xE820 and checked out device manager memory resources.
The problem is that neither of them covers everything.
For example,
it says 0x0000000000000000 ~ 0x000000000009FC00 is usable.
But strictly speaking, 0x0000000000000000 ~ 0x0000000000000500 is not usable because it stores the IVT and the BDA.
Also, there are PCI holes here and there.
My objective here is to detect, probe, or scan all the memory available in my hardware and build a memory map so that I can distinguish which address is which. (Example map below.)
0x0000000000000000 ~ 0x00000000000003FF : Real Mode IVT
0x0000000000000400 ~ 0x00000000000004FF : BDA
...
0x0000000000007C00 ~ 0x0000000000007DFF : MBR Load Address
...
0x00000000000B8000 ~ 0x00000000000B8FA0 : VGA Color Text Video Memory
...
0x00000000C0000000 ~ 0x00000000FFFFFFFF : PCI Space
My processor is an Intel Core i7-8700K (8th gen).
How much information do you want?
If you only want to know which areas are usable RAM, then "int 0x15, eax=0xE820" (with the restriction that the BDA will be considered usable), or UEFI's "get memory map" function, is all you need. Note that in both cases one or more areas may be reported as "ACPI reclaimable", which means that after you've finished parsing the ACPI tables (or if you don't care about the ACPI tables) that RAM becomes usable.
If you want more information you need to do more work, because the information is scattered everywhere. Specifically:
ACPI's SRAT table describes which things (e.g. which areas of memory) are in which NUMA domain; and ACPI's SLIT table describes the performance implications of that.
ACPI's SRAT table also describes which things (e.g. which areas of memory) are "hot plug removable" and reserved for "hot insert".
the CPU's CPUID instruction will tell you "physical address size in bits". This is useful to know if/when you're trying to find a suitable area of the physical address space to use for a memory mapped PCI device's BARs, because the memory maps you get are too silly to tell you the difference between "not usable by memory mapped PCI devices", "usable by memory mapped PCI devices" and "used by memory mapped PCI devices".
parsing (or configuring) PCI configuration space (in conjunction with IOMMUs if necessary) tells you which areas of the physical address space are currently used by which PCI devices
parsing "System Management BIOS" tables can (with a lot of work and "heuristical fumbling") tell you which areas of the physical address space correspond to which RAM chips on the motherboard and what the details of those RAM chips are (type, speed, etc).
various ACPI tables (e.g. MADT/APIC and HPET) can be used to determine the location of various special devices (local APICs, IO APICs, HPET).
you can assume that (part of) the area ending at physical address 0xFFFFFFFF will be the firmware's ROM; and (with some more "heuristical fumbling" to subtract any special devices from the area reported as "reserved" by the firmware's memory map) you can determine the size of this area.
If you do all of this you'll have a reasonably complete map describing everything in the physical address space.
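To illustrate the CPUID point above, here is a small hosted-C sketch using GCC/Clang's <cpuid.h> helper; leaf 0x80000008 reports the physical address width in EAX bits 7:0 (in a real bootloader you would issue the CPUID instruction directly, but the decoding is the same):

```c
#include <stdio.h>
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */

int main(void) {
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0x80000000 returns the highest supported extended leaf in EAX. */
    if (__get_cpuid(0x80000000, &eax, &ebx, &ecx, &edx) && eax >= 0x80000008) {
        /* Leaf 0x80000008: EAX[7:0]  = physical address bits,
                            EAX[15:8] = linear (virtual) address bits. */
        __get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
        printf("physical address bits: %u\n", eax & 0xFF);
        printf("linear   address bits: %u\n", (eax >> 8) & 0xFF);
    } else {
        printf("CPUID leaf 0x80000008 not supported\n");
    }
    return 0;
}
```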
I'm designing a 32-bit CPU in an FPGA; I have an Altera Cyclone IV EP4CE6E22C8.
I worked with megafunctions to generate a 32-bit data memory in the FPGA, but the output is addressed 32 bits (4 bytes) at a time. How do I do 1-byte addressing?
Nowadays every CPU address bus works in bytes. Thus, to access your 32-bit wide memory you should NOT connect the two least-significant address bits. You can use the A[1:0] address bits to select a byte (or a half-word, using A[1] only) from the memory when you read.
You still need four byte write-enable signals. This allows you to write words, half-words or bytes.
Have a look at existing CPU buses or existing connection standards like AHB or AXI.
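To make the byte-lane idea concrete, here is a rough C model (my own sketch, not HDL) of a 32-bit-wide memory with four byte write enables, where A[1:0] only selects a byte within the word and the RAM itself is indexed by addr >> 2:

```c
#include <stdint.h>
#include <stdio.h>

#define WORDS 16
static uint32_t ram[WORDS];              /* the 32-bit-wide memory array */

/* Write: the word index comes from addr >> 2 (A[1:0] are not connected to
   the RAM); byte_en selects which byte lanes of the 32-bit word to update. */
void mem_write(uint32_t addr, uint32_t data, unsigned byte_en /* 4 bits */) {
    uint32_t word = ram[(addr >> 2) % WORDS];
    for (int lane = 0; lane < 4; lane++) {
        if (byte_en & (1u << lane)) {
            uint32_t mask = 0xFFu << (8 * lane);
            word = (word & ~mask) | (data & mask);
        }
    }
    ram[(addr >> 2) % WORDS] = word;
}

/* Byte read: fetch the whole word, then use A[1:0] to pick the byte lane. */
uint8_t mem_read_byte(uint32_t addr) {
    uint32_t word = ram[(addr >> 2) % WORDS];
    return (uint8_t)(word >> (8 * (addr & 3)));
}

int main(void) {
    mem_write(4, 0x000000AAu, 0x1);      /* store byte 0xAA at address 4 */
    mem_write(4, 0x00BB0000u, 0x4);      /* store byte 0xBB at address 6 */
    printf("0x%02X 0x%02X\n", mem_read_byte(4), mem_read_byte(6));  /* AA BB */
    return 0;
}
```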
Post edit:
But reading address 0001, I get 0x05060708, while the desired value is 0x02030405.
What you are trying to do is read a word from a non-aligned address. There is no existing 32-bit wide memory that supports that. I suggest you have a look at how a 32-bit wide memory works.
The old Motorola 68020 architecture supported that. It requires a special memory controller which first reads the data from address 0 and then from address 4 and re-combines the data into a new 32-bit word.
With memory getting cheaper and saving CPU cycles becoming more important, many CPUs do not support that; they throw an exception: unaligned memory access.
You have several choices:
Build a special memory controller which supports unaligned accesses.
Adjust your expectations.
I would go for the latter. In general the request is based on a wrong idea of how memory works. As consolation: you are not the first person on this website who thinks that this is how you read words from memory.
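For illustration only, here is a C sketch of the recombination such a special memory controller would perform for the example in the question: read the two aligned words that straddle the unaligned address and splice the bytes together (byte ordering chosen to match the 0x01020304 / 0x05060708 layout above):

```c
#include <stdint.h>
#include <stdio.h>

/* Word-wide memory, matching the question: word 0 = 0x01020304,
   word 1 = 0x05060708 (the byte at the lowest address is shown as the
   most-significant byte of the word). */
static const uint32_t ram[2] = { 0x01020304u, 0x05060708u };

/* Emulate an unaligned 32-bit read at byte address `addr`: fetch the two
   aligned words around it and shift/combine them into one result. */
uint32_t read_unaligned(uint32_t addr) {
    uint32_t lo = ram[(addr >> 2) + 0];   /* aligned word containing addr */
    uint32_t hi = ram[(addr >> 2) + 1];   /* next aligned word            */
    unsigned shift = 8 * (addr & 3);      /* byte offset, in bits         */
    if (shift == 0)
        return lo;                        /* already aligned              */
    return (lo << shift) | (hi >> (32 - shift));
}

int main(void) {
    printf("0x%08X\n", read_unaligned(1));   /* prints 0x02030405 */
    return 0;
}
```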
I was reading about memory architecture and I got a bit confused about paging and segmentation. I read that modern OSes use only paging to manage memory access, but looking at disassembled code I can see segment registers like "ds" and "fs". Does it mean that the OS (I saw this on Windows and Linux) is using both segmentation and paging, or is it just mapping all the segments onto the same pages (making segments irrelevant)?
OK, based on the book Modern Operating Systems, 3rd Edition, by Andrew S. Tanenbaum and the materials from Open Security Training (opensecuritytraining.info), I managed to understand segmentation and paging, and the answer to my question is:
Concepts:
1.1. Segmentation:
Segmentation is the division of memory into pieces (sections) called segments. These segments are independent from each other, have variable sizes and may grow as needed.
1.2. Virtual Memory:
Virtual memory is an abstraction of real memory. This means that it maps a virtual address (used by programs) into a physical address (used by the hardware). If a program wants to access memory at address 2000 (mov eax, [2000]), the virtual address (2000) will be converted into the real physical address (for example, 1422) and then the program will access memory at 1422 while thinking that it's accessing address 2000.
So, if virtual memory is being used by the system, programs no longer access real memory directly; instead, they use the corresponding virtual addresses.
1.3. Paging:
Paging is a virtual memory management scheme. As explained before, it maps a virtual address into a physical address. Paging divides virtual memory into pieces called "pages" and also divides physical memory into pieces called "page frames". A page can be mapped to different page frames over time, but only to one at a time.
Advantages and Disadvantages
2.1. Segmentation:
The main advantage of using segmentation is to divide memory into different linear address spaces. With this, programs can have one address space for their code, another for the stack, another for dynamic variables (heap), etc. The disadvantage of segmentation is fragmentation of memory.
Consider this example: one portion of memory is divided into 4 segments sequentially allocated -> seg1(in use) --- seg2(in use) --- seg3(in use) --- seg4(in use). Now if we free the memory from seg2, we have -> seg1(in use) --- seg2(FREE) --- seg3(in use) --- seg4(in use). If we want to allocate some data, we can do it by using seg2, but if the size of the data is bigger than the size of seg2, we won't be able to do so; the space will be wasted and the memory fragmented. Another problem is that some segments can be large, and since a segment can't be "broken" into smaller pieces, it must be fully allocated in memory.
2.2. Paging:
The main advantage of using paging is that it abstracts physical memory, so programs (and programmers) don't need to bother with physical memory addresses. You can have two programs that both access (virtual) address 2000, because it will be mapped into two different physical addresses. Also, it's possible to use the hard disk to ensure that only the necessary pages are kept in memory. The main disadvantage is that paging uses only one linear address space. Paging maps virtual memory from 0 to the maximum addressable value for all programs. Now consider this example: two chunks of memory, "chk1" and "chk2", are next to each other in virtual memory (chk1 is allocated from address 1000 to 2000 and chk2 from 2001 to 3000). If chk1 needs to grow, the OS needs to make room in memory for the new size of chk1. If that happens again, the OS will have to do the same thing or, if no space can be found, an exception will be raised. The routines for managing this kind of operation are very bad for performance, because this growing and shrinking of memory happens a lot.
Segmentation and Paging
3.1. Why combine both?
An operating system can combine both segmentation and paging. Why? Because by combining them it's possible to have the best of both without their disadvantages. So, memory is divided into many segments and each one of those has one or more pages. When a program tries to access a memory address, it first goes to the corresponding segment, there it finds the corresponding page, and in that page it finds the page frame that the program wanted to access. Using this approach, the OS divides memory into various segments, including segments for the kernel and for user programs. These segments have variable sizes and access protection features. To solve the fragmentation and the "big segment" problems, paging is used. With paging a larger segment is broken into several pages and only the necessary pages remain in memory (and more pages can be allocated in memory if needed). So, basically, modern OSes have two memory abstractions, where segmentation is used more for "handling the programs" and paging for "managing the physical memory".
3.2. How does it really work?
An OS using segmentation and paging has the following structures:
3.2.1. Segment Selector:
This represents an index into the Global/Local Descriptor Table. It contains 3 fields: the index into the descriptor table, a bit that identifies whether that segment is in the Global or the Local Descriptor Table, and a requested privilege level.
3.2.2. Segment Register:
A CPU register used to store a segment selector. Usually (on an x86 machine) there are at least the registers CS (Code Segment) and DS (Data Segment).
3.2.3. Segment descriptors:
This structure contains the data describing a segment, such as its base address, size (in pages or in bytes), access privileges, whether the segment is present in memory or not, etc. (For all the fields, search for "segment descriptor" on Google.)
3.2.4. Global/Local Descriptor Table:
The table that contains the segment descriptors, i.e. this structure holds all the segment descriptors for the system. If the table is global, it can also hold other things like Local Descriptor Table descriptors, Task State Segment descriptors and call gate descriptors (I won't explain these here; please google them). If the table is local, it only (as far as I know) holds user-related segment descriptors.
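As a small illustration (my own sketch, based on the standard x86 selector layout) of the fields described in 3.2.1: bits 1:0 are the requested privilege level, bit 2 is the table indicator (0 = GDT, 1 = LDT), and the remaining 13 bits are the descriptor index:

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a 16-bit x86 segment selector into its three fields. */
int main(void) {
    uint16_t selector = 0x001B;              /* e.g. the CS value 001B used below */

    unsigned rpl   = selector & 0x3;         /* bits 1:0  requested privilege level */
    unsigned ti    = (selector >> 2) & 0x1;  /* bit  2    0 = GDT, 1 = LDT          */
    unsigned index = selector >> 3;          /* bits 15:3 descriptor index          */

    printf("index=%u, table=%s, rpl=%u\n", index, ti ? "LDT" : "GDT", rpl);
    /* 0x001B -> index=3, table=GDT, rpl=3 (a typical user code segment). */
    return 0;
}
```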
3.3. How does it work?
So, to access memory, the program first needs to access a segment. To do this a segment selector is loaded into a segment register, which then refers to the Global or Local Descriptor Table (depending on a field of the segment selector). This is the reason a full memory address is written SEGMENT REGISTER:ADDRESS, like CS:ADDRESS -> 001B:0044BF7A. Now the CPU goes to the GDT/LDT and (using the index field of the segment selector) finds the segment descriptor for the address being accessed. Then it checks whether the segment is present and checks the protection, and if everything is OK it goes to the address stated in the "base" field (of the descriptor) plus the address offset. If paging is not enabled, this goes directly into real memory, but with paging on, the address is treated as a virtual address and goes through the page directory. The base address + the offset is called the linear address and is interpreted as 3 fields: directory + page + offset. In the page directory, the hardware looks up the directory entry given by the "directory" field of the linear address; this entry points to a page table, and the "page" field of the linear address is used to find the page-table entry; that entry points to a page frame, and the offset is used to find the exact address the program wants to access.
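For the classic 32-bit, non-PAE case, the directory + page + offset split mentioned above is 10 + 10 + 12 bits. A small sketch of that decomposition (illustration only; field names are mine):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t linear = 0x0044BF7A;             /* the example address above */

    unsigned dir    = linear >> 22;           /* bits 31:22 page-directory index */
    unsigned table  = (linear >> 12) & 0x3FF; /* bits 21:12 page-table index     */
    unsigned offset = linear & 0xFFF;         /* bits 11:0  offset in the page   */

    printf("directory=%u, table=%u, offset=0x%03X\n", dir, table, offset);
    /* 0x0044BF7A -> directory=1, table=75, offset=0xF7A */
    return 0;
}
```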
3.4. Modern Operating Systems
Modern OSes "do not use" segmentation. It's in quotes because they do use 4 segments: Kernel Code Segment, Kernel Data Segment, User Code Segment and User Data Segment. What this means is that all user processes have the same code and data segments (and so the same segment selectors). The segments only change when going from user mode to kernel mode.
So, the whole path explained in section 3.3 still occurs, but every process uses the same segments and, since the page tables are individual per process, processes still cannot reach each other's memory.
Hope this helps. If there are any mistakes or you'd like more details (I was a bit generic in some parts), please feel free to comment and reply.
Thank you guys
Danilo PC
They only use paging for memory protection, while they use segmentation for other purposes (like storing thread-local data).
I was reading up on virtual memory, and from what I understand, each process has its own VM table that maps VM addresses to physical addresses in real memory. So if a process allocates objects continuously, they can potentially be stored in completely different places in physical memory. My question is: if I allocate an array, which is supposed to be stored in a contiguous block of memory, and the size of the array requires more space than one page can provide, then as I understand it that array will be stored contiguously in VM but possibly in completely different locations in PM. Is this correct? Please correct me if I misunderstood how VM works. And if it is correct, does that mean we only need to care whether an allocation is contiguous in VM?
Whether or not something that overlaps a page boundary is actually contiguous in physical memory is never really knowable with modern memory handlers. Memory glue logic essentially treats all addressable memory pages as an unordered set, and the ordering is essentially associated with a process; there's no guarantee that, for different processes that end up getting assigned the same two physical memory pages (at different points in time), the expressed relationship between those physical pages will be the same. Effectively, there's a translation layer between the CPU and the memory that handles this stuff.
That's right. Arrays only have to look contiguous to your application, but they may be physically scattered in memory.
I just wanted to add/make it clear that from a user space program's point of view, a chunk of allocated memory always appears contiguous. The operating system in conjunction with the CPU's Memory Management Unit (MMU) handles all virtual to physical memory mappings and the programmer never needs to worry about how this mapping is handled (unless, of course, said programmer is writing an operating system).
A compiler (or one who writes code in assembly) can treat a program's addresses as starting from 0 and going up until the largest address needed for that particular program. The operating system then creates a page table for each process and uses this table to partially decode a physical address for each virtual memory location. The OS treats an address in a program as two separate parts, the page address and the offset into that page. Then, the MMU translates a page address into a physical frame address. Note that a physical memory "frame" is analogous to the conceptual "page" from the standpoint of the OS; these two are of the same size (e.g. 4096 bytes).
Since physical memory is divided into equally sized frames, and page size is the same as frame size you can know how much of your virtual address is used as a page location and how much is an offset into that page. For instance, if your OS "allocates" 4 gigabytes to each process (as is the case in Linux), and your page/frame size is 4096 bytes, you can know that 20 bits (4,294,967,296 bytes / 4096 bytes = 2 ^ 20 = 1,048,576 pages/page addresses) of a 32 bit address are used as a page address, which will then be converted to a physical frame address by the MMU, and the remaining 12 bits are used as an offset to determine the location of the address starting from the beginning of the page/frame.
VM (user space) address --> page + offset (OS) --> frame + offset (MMU) = physical address
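A tiny sketch of that split, assuming 32-bit addresses and 4096-byte pages; the page-to-frame lookup is faked with a made-up frame number, just to show how the offset is carried through:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE  4096u
#define PAGE_SHIFT 12        /* log2(4096): 12 offset bits, 20 page-number bits */

int main(void) {
    uint32_t vaddr = 0x12345678;

    uint32_t page   = vaddr >> PAGE_SHIFT;       /* upper 20 bits */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);   /* lower 12 bits */

    /* Hypothetical lookup: pretend the OS page table maps this page to frame 7. */
    uint32_t frame = 7;
    uint32_t paddr = (frame << PAGE_SHIFT) | offset;

    printf("page=0x%05X offset=0x%03X -> physical=0x%08X\n", page, offset, paddr);
    /* vaddr 0x12345678 -> page 0x12345, offset 0x678, physical 0x00007678 */
    return 0;
}
```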