what type of memory do registers have? - memory

Accumulator(AC), Data Register(DR), Address Register(AR), Program Counter(PC), Memory Data Register (MDR), Index Register(IR), Memory Buffer Register(MBR)
these are different registers in a compute.
what type of memory are they? random access, read only, or any other?

what type of memory are they?
Registers do not have memory. Registers are storage locations within the CPU.
these are different registers in a computer?
The register set varies among processors.
random access, read-only, or any other?
General purpose registers are read/write. However, there are registers that are generally read only. The processor status register is usually read-only in the sense that you cannot write to it. However, you can change its value through instructions. Some processors might have write-only registers.

Related

The location of EIP and other registers in x86

I am working with x86 instructions and now I'm confused about where the x86 registers (Like EIP, ESP, etc.) are stored
For example when I use ollydbg I could see what is the actual EIP register alue and how it changes.
If they're stored in memory then where is the actual location? (For example in .data .text or .bss)
And can I change the EIP of another process manually? How?
You have a severe misconception about what a register is.
A register is actually a register, ie. a really small piece of memory in the processor that can contain the operands or can be the target of a CPU instruction. It doesn't have an address in memory - it's really adressable as the register it is.
RAM is something totally different – an x86 program can work completely without RAM, but there's no operation that doesn't work on registers. For example, to add two numbers that are somewhere in RAM, you use LOAD instructions to load these two numbers into two register, and then some ADD instruction to add one number to the other, targeting a register, and then there's some STORE instruction that takes the register value and writes it to some address in RAM.
So, there's no "process-specific" registers. Every CPU core has exactly one set of registers (some specialities like virtualization nonwithstanding), and there's mechanisms to store registers in RAM, and restore them from RAM, for example when calling a function or switching contextes.
Registers are stored in registers, not in the process's own memory.
Debuggers use a special interface provided by the OS to change registers of a running process, including EIP. In Linux, it's the ptrace(2) API.
Being able to change a process's registers from outside the process is related to how the OS saves a process's architectural state to memory for context switches.

Kernel memory management: where do I begin?

I'm a bit of a noob when it comes to kernel programming, and was wondering if anyone could point me in the right direction for beginning the implementation of memory management in a kernel setting. I am currently working on a toy kernel and am doing a lot of research on the subject but I'm a bit confused on the topic of memory management. There are so many different aspects to it like paging and virtual memory mapping. Is there a specific order that I should implement things or any do's and dont's? I'm not looking for any code or anything, I just need to be pointed in the right direction. Any help would be appreciated.
There are multiple aspects that you should consider separately:
Managing the available physical memory.
Managing the memory required by the kernel and it's data structures.
Managing the virtual memory (space) of every process.
Managing the memory required by any process, i.e. malloc and free.
To be able to manage any of the other memory demands you need to know actually how much physical memory you have available and what parts of it are available to your use.
Assuming your kernel is loaded by a multiboot compatible boot loader you'll find this information in the multiboot header that you get passed (in eax on x86 if I remember correctly) from the boot loader.
The header contains a structure describing which memory areas are used and which are free to use.
You also need to store this information somehow, and keep track of what memory is allocated and freed. An easy method to do so is to maintain a bitmap, where bit N indicates whether the (fixed size S) memory area from N * S to (N + 1) * S - 1 is used or free. Of course you probably want to use more sophisticated methods like multilevel bitmaps or free lists as your kernel advances, but a simple bitmap as above can get you started.
This memory manager usually only provides "large" sized memory chunks, usually multiples of 4KB. This is of course of no use for dynamic memory allocation in style of malloc and free that you're used to from applications programming.
Since dynamic memory allocation will greatly ease implementing advanced features of your kernel (multitasking, inter process communication, ...) you usually write a memory manager especially for the kernel. It provides means for allocation (kalloc) and deallocation (kfree) of arbitrary sized memory chunks. This memory is from pool(s) that are allocated using the physical memory manager from above.
All of the above is happening inside the kernel. You probably also want to provide applications means to do dynamic memory allocation. Implementing this is very similar in concept to the management of physical memory as done above:
A process only sees its own virtual address space. Some parts of it are unusable for the process (for example the area where the kernel memory is mapped into), but most of it will be "free to use" (that is, no actually physical memory is associated with it). As a minimum the kernel needs to provide applications means to allocate and free single pages of its memory address space. Allocating a page results (under the hood, invisible to the application) in a call to the physical memory manager, and in a mapping from the requested page to this newly allocated memory.
Note though that many kernels provide its processes either more sophisticated access to their own address space or directly implement some of the following tasks in the kernel.
Being able to allocate and free pages (4KB mostly) as before doesn't help with dynamic memory management, but as before this is usually handled by some other memory manager which is using these large memory chunks as pool to provide smaller chunks to the application. A prominent example is Doug Lea's allocator. Memory managers like these are usually implemented as library (part of the standard library most likely) that is linked to every application.

memory management and segmentation faults in modern day systems (Linux)

In modern-day operating systems, memory is available as an abstracted resource. A process is exposed to a virtual address space (which is independent from address space of all other processes) and a whole mechanism exists for mapping any virtual address to some actual physical address.
My doubt is:
If each process has its own address space, then it should be free to access any address in the same. So apart from permission restricted sections like that of .data, .bss, .text etc, one should be free to change value at any address. But this usually gives segmentation fault, why?
For acquiring the dynamic memory, we need to do a malloc. If the whole virtual space is made available to a process, then why can't it directly access it?
Different runs of a program results in different addresses for variables (both on stack and heap). Why is it so, when the environments for each run is same? Does it not affect the amount of addressable memory available for usage? (Does it have something to do with address space randomization?)
Some links on memory allocation (e.g. in heap).
The data available at different places is very confusing, as they talk about old and modern times, often not distinguishing between them. It would be helpful if someone could clarify the doubts while keeping modern systems in mind, say Linux.
Thanks.
Technically, the operating system is able to allocate any memory page on access, but there are important reasons why it shouldn't or can't:
different memory regions serve different purposes.
code. It can be read and executed, but shouldn't be written to.
literals (strings, const arrays). This memory is read-only and should be.
the heap. It can be read and written, but not executed.
the thread stack. There is no reason for two threads to access each other's stack, so the OS might as well forbid that. Moreover, the tread stack can be de-allocated when the tread ends.
memory-mapped files. Any changes to this region should affect a specific file. If the file is open for reading, the same memory page may be shared between processes because it's read-only.
the kernel space. Normally the application should not (or can not) access that region - only kernel code can. It's basically a scratch space for the kernel and it's shared between processes. The network buffer may reside there, so that it's always available for writes, no matter when the packet arrives.
...
The OS might assume that all unrecognised memory access is an attempt to allocate more heap space, but:
if an application touches the kernel memory from user code, it must be killed. On 32-bit Windows, all memory above 1<<31 (top bit set) or above 3<<30 (top two bits set) is kernel memory. You should not assume any unallocated memory region is in the user space.
if an application thinks about using a memory region but doesn't tell the OS, the OS may allocate something else to that memory (OS: sure, your file is at 0x12341234; App: but I wanted to store my data there). You could tell the OS by touching the end of your array (which is unreliable anyways), but it's easier to just call an OS function. It's just a good idea that the function call is "give me 10MB of heap", not "give me 10MB of heap starting at 0x12345678"
If the application allocates memory by using it then it typically does not de-allocate at all. This can be problematic as the OS still has to hold the unused pages (but the Java Virtual Machine does not de-allocate either, so hey).
Different runs of a program results in different addresses for variables
This is called memory layout randomisation and is used, alongside of proper permissions (stack space is not executable), to make buffer overflow attacks much more difficult. You can still kill the app, but not execute arbitrary code.
Some links on memory allocation (e.g. in heap).
Do you mean, what algorithm the allocator uses? The easiest algorithm is to always allocate at the soonest available position and link from each memory block to the next and store the flag if it's a free block or used block. More advanced algorithms always allocate blocks at the size of a power of two or a multiple of some fixed size to prevent memory fragmentation (lots of small free blocks) or link the blocks in a different structures to find a free block of sufficient size faster.
An even simpler approach is to never de-allocate and just point to the first (and only) free block and holds its size. If the remaining space is too small, throw it away and ask the OS for a new one.
There's nothing magical about memory allocators. All they do is to:
ask the OS for a large region and
partition it to smaller chunks
without
wasting too much space or
taking too long.
Anyways, the Wikipedia article about memory allocation is http://en.wikipedia.org/wiki/Memory_management .
One interesting algorithm is called "(binary) buddy blocks". It holds several pools of a power-of-two size and splits them recursively into smaller regions. Each region is then either fully allocated, fully free or split in two regions (buddies) that are not both fully free. If it's split, then one byte suffices to hold the size of the largest free block within this block.

How is external memory, internal memory, and cache organized?

Consider a system as follows: a hardware board having say ARM Cortex-A8 and Neon Vector coprocessor, and Embedded Linux OS running on Cortex-A8. On this environment, if some application - say, a video decoder - is executing, then:
How is it decided which buffers would be in external memory, which ones would be allocated in internal SRAM, etc.
When one calls calloc/malloc on such a system/code, the pointer returned is from which memory: internal or external?
Can a user make buffers to be allocated in the memories of his choice (internal/external)?
In ARM architectures, there is another memory called "tightly coupled memory" (TCM). What is that and how can user enable and use it? Can I declare buffers in this memory?
Do I need to see the memory map (if any) of the hardware board to understand about all these different physical memories present in a typical hardware board?
How much of a role does the OS play in distinguishing these different memories?
Sorry for multiple questions, but i think they all are interlinked.
Please note that I'm not familiar with the ARM nor embedded Linux's specifically, so all of my comments will be from a general point of view.
First, about cache: Very early during boot, the operating system will do some amount of cache initialization. Exactly what this entails will vary from processor to processor, but the net effect is to ensure cache is initialized properly, and then enable its use by the processor. After this, the cache is operated exclusively by the processor with no further interaction by the operating system or your programs.
Now, on to external (off-chip) and internal (on-chip) memories:
The operating system owns all hardware on the system, including the internal and external memories and so is ultimately responsible for discovering, configuring, and allocating these resources within the kernel and to user processes. In a typical system (eg, your desktop or a 1u server) there won't usually be any special internal (on-chip) ram, and so the operating system can treat all dram equally. It will go into a general pool of pages (usually 4k) for allocation to processes, file system buffers, etc. On a system with special memory of various sorts (nvram, high-speed on-chip memory, and a few others), the operating system's general policies aren't usually correct.
How this is presented to the user will depend on choices made while porting the OS to this system.
One could modify the OS to be explicitly aware of this special memory, and provide special system calls to allocate it to to user land processes. However, this could be quite a bit of work unless the embedded linux being used has at least some support for this sort of thing.
The approach I'd probably take would be to avoid modifying the kernel itself, and instead write a device driver for the internal memory. A driver of this sort would typically provide some sort of mmap interface to allow user processes to get simple address-based access to the internal memory.
Here are answers to some of your concrete questions.
How much of a role does the OS play in distinguishing these different memories?
If your system has taken the device driver approach described above, then the OS probably knows only about external memory, or perhaps just enough about the internal memories to initialize them properly although that would likely be in the device driver too, if at all possible. If the OS knows more explicitly about the on-chip memory, then it will definitely contain any needed initialization code, as well as some sort of scheme to provide access to the user processes.
How is it decided which buffers would be in external memory, which ones would be allocated in internal SRAM, etc.
It seems unlikely to me that the operating system would try to automate such choices. Instead, I suspect that either the OS or a device driver would provide a generic interface to provide access to the on-chip memory, and leave it up to your user code to decide what to do with it.
When one calls calloc/malloc on such a system/code, the pointer returned is from which memory: internal or external?
Almost certainly, malloc and friends will return pointers into the general off-chip memory. In the driver-based approach suggested above, you'd use mmap to gain access to the on-chip memory. If you needed to do finer-grained allocation than that, you'd need to write your own allocator, or find one that can be given an explicit region of memory to work in.
Can a user make buffers to be allocated in the memories of his choice (internal/external)?
If by buffers you mean the regions returned from the standard malloc calls, probably not. But, if you mean "can a user program somehow get a pointer to the on-chip memory", then the answer is almost certainly yes, but the mechanism will depend on choices made when porting linux to this system.
In ARM architectures, there is another memory called "tightly coupled memory" (TCM). What is that and how can user enable and use it? Can I declare buffers in this memory?
I don't know what this is. If I had to guess, I'd assume it's just another form of on-chip ram, but since it has a different name, perhaps I'm wrong.
Do I need to see the memory map (if any) of the hardware board to understand about all these different physical memories present in a typical hardware board?
If the OS and/or device drivers have provided some sort of abstract access to these memory regions, then you won't need to know explicitly about the address map. This knowledge is, however, needed to implement this access in either the kernel or a device driver.
I hope this helps somewhat.

Why would a region of memory be marked non-cached?

In an embedded application, we have a table describing the various address ranges that are valid on out target board. This table is used to setup the MMU.
The RAM address range is marked as cacheable, but other regions are marked at not cacheable. Why is that?
This is done so that the processor does not use stale values due to caching.
When you access (regular) cached RAM, the processor can "remember" the value that you accessed. The next time you look at that same memory location, the processor will return the value it remembers without looking in RAM. This is caching.
If the content of the location can change without the processor knowing as could be the case if you have a memory mapped device (an FPGA returning some data packets for example), the processor could return the value is "remembered" from last time, which would be wrong.
To avoid this problem, you mark that address space as non-cacheable. This insures the processor does not try to remember the value.
Any memory region used for DMA or other hardware interactions should not be cached.
If a memory region is accessed by both hardware and software simultaneously (EX: hardware configuration register or scatter-gather list for DMA), this region must be defined as non-cached. For actual DMA, the memory buffer can be defined as cached, and in most cases, it is advisable for the buffer to be cached to allow the application level speedy access to that buffer. It's the driver's responsibility to flush/invalidate cache before passing the buffer to DMA or the application.
Small update, above must is not correct in case we have a specialized hardware, i.e. Cache Coherency Interconnect (CCI) which will synchronize access of various hardware blocks to memory.
In consideration of cache coherence. Assume there is a location X in memory that can be accessed by devices using DMA, so cpu would not be awared when the devices write a new value into X. If cpu have cached the content of X before, when cpu need the value it will get it from cache. In this case, cpu gets stale value. Similarly, devices using DMA might read a stale value either.
From wiki Direct_memory_access
Perhaps it's used for memory-mapped I/O?
Modern controllers can use L2 cache for DMA, meaning they preserve the coherency of the cached memory region used for DMA accesses. This is also termed as "snoop-able memory transactions" performed by the controller (via DMA).
Some areas like Flash can be read in one cycle, so do not need to be cached.

Resources