Inquiry on printf - memory

When we print the address of a variable, which address gets printed?
If it is the virtual address, why is that so?
Can anyone explain some more?

On modern desktop/server OSs, all memory is virtual memory. I'm not aware of any way to access the underlying physical addresses from outside of the kernel. Even if it is possible, it's not going to be useful in the vast majority of situations.
So, if you do printf("%p", (void*)&variable); it will print the virtual address of variable for the current process.

The virtual memory address gets printed, and it is so because you don't have any need for the physical address, and the whole point of the OS is to prevent you from having to deal with physical addresses (well it's not only that, but it's also that :D).

On a normal PC, it is the value you get if you convert the pointer to an integer of the same size.
void *p = something;
uintptr_t i = (uintptr_t)p;  /* convert the pointer itself, do not dereference it */
printf("%" PRIxPTR, i);  /* uintptr_t is in <stdint.h>, PRIxPTR in <inttypes.h> */
The memory address is virtual, yes of course, because that's how the process executing your code addresses the memory in your computer. The process cannot see physical memory.
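As a minimal sketch of the two ways of printing an address mentioned above (the helper names address_as_integer and print_address are made up for this illustration), the following prints the same virtual address once with %p and once as an integer of pointer size. The exact text produced by %p is implementation-defined, so no particular output is assumed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Convert a pointer to an integer of the same size (hypothetical helper). */
static uintptr_t address_as_integer(const void *p)
{
    return (uintptr_t)p;
}

/* Print one virtual address in both forms.
   Returns the number of characters written, or a negative value on error. */
static int print_address(const void *p)
{
    assert(p != NULL);
    return printf("%p 0x%" PRIxPTR "\n", (void *)p, address_as_integer(p));
}
```

Calling print_address(&variable) from any function prints the address of variable as seen by the current process; running the program twice may print different values if the OS randomizes the address space layout.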

Related

pagination - virtual addresses, physical addresses, mapping - considerations

Pagination on some processor makes it possible to map virtual address
(A2345678) to physical address (823C5678). However, it is not possible
to map virtual address (345678) to (2ABC678). What can we conclude
about the size of a frame, the size of a page, the size of virtual
memory, and the size of physical memory?
What I think about it:
(A2345678) -> (823C5678)
So the offset is at most 19 bits, which means the page (and frame) size is at most 2^19, as in my previous question.
When it comes to the size of virtual memory and of physical memory, I can conclude nothing.
Similarly, I don't know what the information about the impossible mapping tells me.
Can you try to explain it to me?
I do see something we can conclude after all:
If a mapping to physical address 0x823C5678 is possible, physical memory is at least that large. (Assuming there aren't any holes in physical address space; not a good assumption on real hardware, but whatever. We can tell that physical address space is at least that big, even if it doesn't all map to DRAM or MMIO).
Similarly, the valid virtual address 0xA2345678 gives us a lower bound on the virtual address size. Presumably all the virtual address bits can be 1, so the highest possible virtual address is at least 0xFFFFFFFF. i.e. virtual addresses are at least 32 bits, but could be any larger size.
This reasoning applies to physical address space, but not the size of physical memory. (e.g. in a computer with 19GiB of RAM, the highest valid physical address isn't 2^n - 1.)
The fact that you can't map 0x345678 to 0x2ABC678 does tell us that the page size is greater than 2^12. The physical address is below the address that was mappable, so we can rule out that possible reason for the mapping being impossible. I think too high and misaligned are the only possible reasons for a mapping not being possible.
(0xc = 0b1100, while 0x5 is 0b0101, so the common bits are only 0x678.)
We can assume that physical memory is a whole number of pages, so we can round up the lowest possible end of physical memory to the next multiple of 2^13.
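The bit comparison used in this answer can be checked mechanically. The sketch below (helper names are hypothetical) counts how many low-order bits a virtual and a physical address share, since a page offset must pass through translation unchanged. It only models the alignment argument, not the "address too high" case.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low-order bits that two addresses share (hypothetical helper). */
static unsigned common_low_bits(uint64_t virt, uint64_t phys)
{
    uint64_t diff = virt ^ phys;   /* set bits mark positions that differ */
    unsigned n = 0;
    while (n < 64 && ((diff >> n) & 1u) == 0)
        n++;
    return n;
}

/* A mapping is possible only if the low offset_bits bits agree,
   because the page offset is copied unchanged. */
static int mapping_possible(uint64_t virt, uint64_t phys, unsigned offset_bits)
{
    assert(offset_bits < 64);
    return common_low_bits(virt, phys) >= offset_bits;
}
```

For the addresses in the question, common_low_bits(0xA2345678, 0x823C5678) gives 19 and common_low_bits(0x345678, 0x2ABC678) gives 12, which is where the bounds of at most 2^19 and more than 2^12 for the page size come from.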

Accessing other program's memory

I have been experimenting with accessing memory used by other programs and I've encountered some results that are a little strange (to me).
First I created a variable in my first program and gave it the value 10. Then I looked at its address and assigned it manually to a pointer in my second program. After that I tried to dereference the pointer and (to my surprise) the program didn't crash. Instead it printed the dereferenced pointer's value as 0.
Next I created a few other programs to experiment with this. In my first program I created a pointer and assigned 'new int' to it. Then I checked the address of the int and manually assigned it to another pointer in my second program. Now when I tried to dereference the pointer in my second program, it did crash.
Could someone explain why the difference happened? And why was the dereferenced pointer 0?
Sorry for a possibly stupid question :/
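This is because the addresses that your program prints for you to see are virtual addresses. Virtual addresses are relative to the memory space of each individual program. They are translated to physical memory addresses at run time by the hardware, using mappings that the operating system sets up for each process.
So you didn't really access the memory of one of your programs from another one. The same numeric address in the second program refers to that program's own memory, which is presumably why you read 0 where that address happened to be mapped, and got a crash for the heap address, which was not mapped in the second program at all.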
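A small sketch can make the "same address, different memory" point concrete. It assumes a POSIX system and uses fork only because that is an easy way to get two processes that share a virtual address; the function name is made up for this illustration. The child overwrites a variable, yet the parent still sees the old value at the same virtual address.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value = 10;   /* same virtual address in parent and child */

/* Fork, let the child overwrite `value`, and return what the parent
   still sees afterwards. Returns -1 if fork or waitpid fails. */
static int parent_view_after_child_write(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 99;      /* modifies only the child's copy of the page */
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status));
    return value;        /* unchanged in the parent */
}
```

The parent gets 10 back even though the child stored 99 through the identical address, because each process has its own mapping from virtual to physical memory.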

How can I know when a memory request is forwarded to another numa node, which node is it?

Suppose a memory access is issued on node A but is a remote access, so it is forwarded to node B through the QuickPath Interconnect controller.
Each node has its own range of memory addresses, so of course I can use the memory address to identify the target node.
If I don't know the memory address, can I use some hardware register or performance counter to do this?
If you don't have the address, use the perf framework to collect overall statistics (the events you are looking for are named node-*). The tool will show the number of misses and overall events for loads, stores and prefetches.
perf may be used from userspace in per-thread collection mode, so you may use the rdpmc assembly instruction to read an individual performance counter, i.e. read the counter before and after the memory access and calculate the difference.
I created a small sample using my old code, but I can't test it right now :(
Here it is: https://gist.github.com/myaut/cd67ea5143615264b2e6
If you have the address, you may use the page_zone() and virt_to_page() kernel functions to get the node id for that address (where ptr is a virtual address):
struct zone* z = page_zone(virt_to_page((void*) ptr));
return z->node;
I used this to track memory accesses in kernel using SystemTap

Contiguous blocks of memory and VM

I was reading up on Virtual Memory, and from what I understand each process has its own VM table that maps VM addresses to physical addresses in real memory. So if a process allocates objects one after another, they can potentially be stored in completely different places in physical memory. My question is this: if I allocate an array, which is supposed to be stored in a contiguous block of memory, and the array needs more space than one page can provide, then as I understand it the array will be stored contiguously in VM but possibly in completely different locations in PM. Is this correct? Please correct me if I misunderstood how VM works. And if it is correct, does that mean we are only concerned with whether an allocation is contiguous in VM?
Whether or not something that overlaps a page boundary is actually contiguous in Physical Memory is never really knowable with modern memory handlers. Memory glue logic essentially treats all addressable memory pages as an unordered set, and the ordering is essentially associated with a process; there's no guarantee that for different processes that end up getting assigned the same two physical memory pages (at different points in time) that the expressed relationship between those physical pages will be the same. Effectively, there's a translation layer between the CPU and the memory that handles this stuff.
That's right. Arrays only need to look contiguous to your application, but may be physically scattered in memory.
I just wanted to add/make it clear that from a user space program's point of view, a chunk of allocated memory always appears contiguous. The operating system in conjunction with the CPU's Memory Management Unit (MMU) handles all virtual to physical memory mappings and the programmer never needs to worry about how this mapping is handled (unless, of course, said programmer is writing an operating system).
A compiler (or one who writes code in assembly) can treat a program's addresses as starting from 0 and going up until the largest address needed for that particular program. The operating system then creates a page table for each process and uses this table to partially decode a physical address for each virtual memory location. The OS treats an address in a program as two separate parts, the page address and the offset into that page. Then, the MMU translates a page address into a physical frame address. Note that a physical memory "frame" is analogous to the conceptual "page" from the standpoint of the OS; these two are of the same size (eg 4096 bytes).
Since physical memory is divided into equally sized frames, and page size is the same as frame size, you can know how much of your virtual address is used as a page location and how much is an offset into that page. For instance, if your OS "allocates" 4 gigabytes to each process (as with a 32-bit address space on Linux), and your page/frame size is 4096 bytes, you can know that 20 bits (4,294,967,296 bytes / 4096 bytes = 2 ^ 20 = 1,048,576 pages/page addresses) of a 32 bit address are used as a page address, which will then be converted to a physical frame address by the MMU, and the remaining 12 bits are used as an offset to determine the location of the address starting from the beginning of the page/frame.
VM (user space) address --> page + offset (OS) --> frame + offset (MMU) = physical address
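The 20-bit / 12-bit split described above can be sketched in a few lines of C. The constant and function names are made up for this illustration, and the frame number is passed in directly rather than looked up in a real page table.

```c
#include <assert.h>
#include <stdint.h>

enum { OFFSET_BITS = 12 };                     /* 4096-byte pages, as in the text */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u) /* low 12 bits */

/* Upper 20 bits of a 32-bit virtual address: the page number. */
static uint32_t page_number(uint32_t vaddr) { return vaddr >> OFFSET_BITS; }

/* Lower 12 bits: the offset inside the page (and inside the frame). */
static uint32_t page_offset(uint32_t vaddr) { return vaddr & OFFSET_MASK; }

/* Combine a frame number (looked up elsewhere) with the unchanged offset. */
static uint32_t physical_address(uint32_t frame, uint32_t vaddr)
{
    assert(frame < (1u << (32 - OFFSET_BITS)));   /* frame number fits in 20 bits */
    return (frame << OFFSET_BITS) | page_offset(vaddr);
}
```

For example, virtual address 0x12345678 splits into page 0x12345 and offset 0x678; if that page were mapped to the hypothetical frame 0xABCDE, the physical address would be 0xABCDE678.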

4 questions about processor architecture. (Computer engineering)

Our teacher has asked us around 50 true or false questions in preparation for our final exam. I could find an answer for most of them online or by asking relatives. However, these 4 questions are driving me crazy. Most of these questions aren't that hard, I just can't get a satisfying answer anywhere. Sorry, the original questions are not written in English, I had to translate them myself. If you don't understand something, please tell me.
Thanks!
True or false
The size of the manipulated address by the processor determines the size of the virtual memory. However, the size of the memory cache is independent.
For a long time, DRAM technology stayed incompatible with the CMOS technology used for the standard logic in processors. This is the reason DRAM memory is (most of the time) used outside of the processor (on a different chip).
Pagination lets multiple virtual address spaces correspond to the same physical address space.
An associative cache memory with sets of 1 line is an entirely associative cache memory, because one memory block can go in any set since each set is the same size as the block.
"Manipulated address" is not a term of the art. You have an m-bit virtual address mapping to an n-bit physical address. Yes, a cache may be of any size up to the physical address size, but typically is much smaller. Note that cache lines are tagged with virtual or more typically physical address bits corresponding to the maximum virtual or physical address range of the machine.
Yes, DRAM processes and logic processes are each tuned for different objectives, and involve different process steps (different materials and thicknesses to lay down DRAM capacitor stacks/trenches, for example) and historically you haven't built processors in DRAM processes (except the Mitsubishi M32RD) nor DRAM in logic processes. Exception is so-called eDRAM that IBM likes to use for their SOI processes, and which is used as last level cache in IBM microprocessors such as the Power 7.
"Pagination" is what we call issuing a form feed so that text output begins at the top of the next page. "Paging" on the other hand is sometimes a synonym for virtual memory management, by which a virtual address is mapped (on a page by page basis) to a physical address. If you set up your page tables just so, it allows multiple virtual addresses (indeed, virtual addresses from different processes' virtual address spaces) to map to the same physical address and hence the same location in real RAM.
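A user-space way to observe several virtual addresses backed by the same memory, sketched under the assumption of a POSIX system with a writable temporary directory: map the same page of a temporary file twice with MAP_SHARED, store through one mapping, and load through the other. The function name is hypothetical, and the page tables themselves are set up by the kernel, not by this code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the same file page at two virtual addresses, write through one
   mapping, and return what is read through the other (or -1 on error). */
static int read_through_second_mapping(int value_to_write)
{
    FILE *f = tmpfile();                      /* temporary backing file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || ftruncate(fd, page) != 0) {
        fclose(f);
        return -1;
    }
    int *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int result = -1;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        assert(a != b);                       /* two distinct virtual addresses */
        *a = value_to_write;                  /* store via the first mapping */
        result = *b;                          /* load via the second mapping */
    }
    if (a != MAP_FAILED) munmap(a, (size_t)page);
    if (b != MAP_FAILED) munmap(b, (size_t)page);
    fclose(f);
    return result;
}
```

The value written through a is visible through b because both virtual pages refer to the same underlying page of the file.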
"An associative cache memory with sets of 1 line is an entirely associative cache memory, because one memory block can go in any set since each set is the same size as the block."
Hmm, that's a strange question. Let's break it down. 1) You can have a direct mapped cache, in which an address maps to only one cache line. 2) You can have a fully associative cache, in which an address can map to any cache line; there is something like a CAM (content addressable memory) tag structure to find which if any line matches the address. Or 3) you can have an n-way set associative cache, in which you have, essentially, n sets of direct mapped caches, and a given address can map to one of n lines. There are other more esoteric cache organizations, but I doubt you're being taught them.
So let's parse the statement. "An associative cache memory". Well that rules out direct mapped caches. So we're left with "fully associative" and "n-way set associative". It has sets of 1 line. OK, so if it is set associative, then instead of something traditional like 4-ways x 64 lines/way, it is n-ways x 1 lines/way. In other words, it is fully associative. I would say this is a true statement, except the term of the art is "fully associative" not "entirely associative."
Makes sense?
Happy hacking!
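The three organizations listed in this answer can be expressed with one formula, sketched here with made-up names. It assumes the common textbook convention that a set is the group of lines an address may occupy and that the number of ways is the number of lines per set; the "n-ways x 1 lines/way" case above then corresponds to num_sets == 1.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of a cache: total lines = num_sets * ways. */
struct cache_geometry {
    uint32_t line_size;   /* bytes per line */
    uint32_t num_sets;    /* number of sets */
    uint32_t ways;        /* lines per set */
};

/* Which set an address maps to; the block may then occupy
   any of the `ways` lines inside that set. */
static uint32_t set_index(const struct cache_geometry *g, uint64_t addr)
{
    assert(g->line_size > 0 && g->num_sets > 0);
    return (uint32_t)((addr / g->line_size) % g->num_sets);
}
```

With ways == 1 every address has exactly one candidate line (direct mapped); with num_sets == 1 the index is always 0 and a block may go in any line (fully associative); anything in between is n-way set associative.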
True, more or less (it depends on the accuracy of your translation I guess :) ) The number of bits in addresses sets an upper limit on the virtual memory space; you could, of course, choose not to use all the bits. The size of the memory cache depends on how much actual memory is installed, which is independent; but of course if you had more memory than you can address, then it still can't be used.
Almost certainly false. We have RAM on separate chips so that we can install more without building a whole new computer or replacing the CPU.
There is no a-priori upper or lower limit to the cache size, though in a real application certain sizes make more sense than others, of course.
I don't know of any incompatibility. The reason why we use SRAM as on-die cache is because it's faster.
Maybe you can force an MMU to map different virtual addresses to the same physical location, but usually it's used the other way around.
I don't understand the question.

Resources