Say I have two processes p1 and p2 running as part of my application.
Say p1 starts out executing function f1(), and then f1() calls f2(). With the invocation of f2(), process p2 starts executing.
What I want to confirm is:
1) Does each process have a separate stack?
2) Does each process have a separate heap, or do different processes share the same heap?
3) We know that on a 32-bit OS every process gets 4GB of virtual memory. Is that 4GB partitioned into heap, stack, text, and data for every process?
Thanks.
1) Yes, each process gets its own stack.
2) Yes, each process gets its own heap.
3) I don't think you get the whole 4GB. Some of it is reserved for kernel stuff.
The virtual memory for a process is separate from that of other processes.
Every process gets 4GB of virtual address space (on a 32-bit Windows machine), of which 2GB is user space (the remainder is for the kernel). That user space holds the stack, heap, static data, and even the loaded DLLs. (This becomes 3GB if you use the large-address-aware option.)
Every process gets a separate heap and stack, independent of other processes.
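You can see that separation directly with a minimal sketch (assuming Linux/POSIX): after fork() the child gets a copy-on-write snapshot of the parent's heap, so the very same virtual address holds independent values in the two processes.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int *value = malloc(sizeof *value);   /* lives on this process's heap */
        *value = 1;

        if (fork() == 0) {    /* child: same virtual address as the parent... */
            *value = 2;       /* ...but the write only touches the child's copy */
            printf("child:  %p -> %d\n", (void *)value, *value);
            return 0;
        }
        wait(NULL);
        printf("parent: %p -> %d\n", (void *)value, *value);  /* still 1 */
        free(value);
        return 0;
    }

Both lines print the same pointer value but different contents: one virtual address, two physical pages, one per process.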
There are other limitations to consider in Java, too, such as arrays being indexable only by int, which caps them at Integer.MAX_VALUE elements. That limits you to about 2GB in many memory-related areas.
What do I get by running multiple nodes on a single host? I am not getting availability, because if the host is down, the whole cluster goes down with it. Does it make sense in terms of performance? Doesn't one instance of ES take as many resources from the host as it needs?
Generally no, but if you have machines with ridiculous amounts of CPU and memory, you might want that to properly utilize the available resources. Avoiding big heaps with Elasticsearch is a good thing generally since garbage collection on bigger heaps can become a problem and in any case above 32 GB you lose the benefit of pointer compression. Mostly you should not need big heaps with ES. Most of the memory that ES uses is through memory mapped files, which relies on the OS cache. So just because you aren't assigning memory to the heap doesn't mean it is not being used: more memory available for caching means you'll be able to handle bigger shards or more shards.
So if you run more nodes, that advantage goes away and you waste memory on redundant heaps, and you'll have nodes competing for resources. Mostly, you should base these decisions on actual memory, cache, and cpu usage of course.
It depends on your host and how you configure your nodes.
For example, Elastic recommends allocating up to 32GB of RAM to Elasticsearch (because of how Java compresses pointers) and leaving another 32GB for the operating system (mostly for disk caching).
Assuming you have more than 64GB of RAM on your host, say 128GB, it makes sense to run two nodes on the same machine, each configured with 32GB, leaving the other 64GB for the operating system.
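A rough sketch of what that could look like, using the stock Elasticsearch config files (the node names, paths, and ports here are made up for illustration): pin each node's heap in its jvm.options, and give each node its own data path and port in elasticsearch.yml.

    # node-1/config/jvm.options (node-2 gets the same two lines)
    # keep the heap at or just below 32GB so compressed pointers stay enabled
    -Xms31g
    -Xmx31g

    # node-1/config/elasticsearch.yml -- hypothetical values
    node.name: node-1
    path.data: /var/lib/elasticsearch/node-1
    http.port: 9200          # give node-2 its own port, e.g. 9201

The remaining ~64GB is deliberately left unallocated so the OS can use it for the file-system cache that Elasticsearch's memory-mapped files rely on.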
I have been reading about demand paging and there are a few terminologies I don't understand.
What is a frame? I read that it is a block of physical memory that can fit at least one page (so a frame can hold one or more pages?). But does this physical memory refer to RAM or to disk storage?
Which one of these is true:
The virtual address space (which is 4 GiB on 32-bit systems) is allocated to one application at a time, so that every application has 4 GiB of virtual addresses to access, and each time we switch applications the OS reconfigures the virtual address space to map to the other application. Or is the virtual address space allocated to several processes? If so, how much virtual memory does each application get, and what happens when it wants more virtual memory?
Do we have a page table for each application running, or a common page table for all applications?
Where does virtual memory fragmentation come from?
I hope someone can clarify these points for me.
A frame is a block of physical memory, i.e. RAM. I've not heard of frames being larger than pages; I've always understood the two terms as synonymous. However, a CPU may allow frames/pages of different sizes to coexist simultaneously (e.g. large pages of 4MB/2MB/1GB and regular 4KB pages on x86).
Whether there's a single address space shared by multiple applications or each has its own address space depends on the OS. Windows 3.xx and 9x/Me had a shared address space. Windows NT/2000/XP/etc had individual, per-app address spaces. Not all of the address space is available to an application / applications. A portion is reserved for the OS (kernel, drivers, their data).
Should be obvious now. One note, though... Even with individual address spaces, a portion of memory can still be made available in several different address spaces, and that can be done by sharing a common page table between the respective processes. Also, it's very typical for the kernel portion of the address space to be managed by several page tables common to all processes.
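As a concrete illustration of that sharing (a sketch assuming Linux, where MAP_ANONYMOUS is available): a MAP_SHARED mapping created before fork() ends up in both processes' address spaces, backed by the same physical pages, so a write in one process is visible in the other.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* one page of physical memory, mapped into this address space... */
        int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED)
            return 1;
        *shared = 0;

        if (fork() == 0) {   /* ...and inherited by the child's address space */
            *shared = 42;    /* same physical page, so the parent sees this */
            return 0;
        }
        wait(NULL);
        printf("parent sees %d\n", *shared);   /* prints 42 */
        return 0;
    }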
Whether the address space is virtual or not, it can become fragmented. You may want to allocate a contiguous (in terms of the virtual addresses) buffer of, say, 8KB, but you may only have two non-adjacent 4KB regions available.
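A toy example of how such a hole appears (what the allocator actually does varies; this only sketches the idea):

    #include <stdlib.h>

    int main(void) {
        char *a = malloc(4096);
        char *b = malloc(4096);
        char *c = malloc(4096);

        free(b);             /* leaves a ~4KB hole between a and c */

        /* an 8KB request cannot reuse that hole as one contiguous
           range of virtual addresses; space must come from elsewhere */
        char *d = malloc(8192);

        free(a); free(c); free(d);
        return 0;
    }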
The following is the structure of RAM for the entire Hack Computer in Nand2Tetris:
Putting aside virtual memory, is this a good simplified model for how the entire RAM is set up on x86 computers? Is RAM really just made of clusters of memory regions each with their own stack, heap and instruction memory, stacked on top of each other in RAM?
Basically, is RAM just a collection of independent and separate memory regions of each process/program running? Or, does RAM consist of variables scattered randomly from different programs?
Hugely over-simplified, processes on a machine with Virtual Memory could all think they have a memory map similar to that of the Hack Virtual Machine (note: Virtual Memory != Virtual Machine).
However, individual chunks of each process' memory map might be mapped to some arbitrary physical memory, shuffled off to a swap file, not allocated until actually needed, shared with other processes, and more. And those chunks that are in RAM might be anywhere (and might move).
You may find this article to be a good starting point to understanding Virtual Memory: https://en.wikipedia.org/wiki/Virtual_memory
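On Linux you can also watch the per-process layout directly: /proc/self/maps lists every region of the calling process's virtual address space (code, heap, stack, shared libraries) along with the virtual addresses it happens to occupy. A small sketch that dumps its own map:

    #include <stdio.h>

    int main(void) {
        FILE *f = fopen("/proc/self/maps", "r");  /* Linux-specific */
        if (!f)
            return 1;
        int ch;
        while ((ch = fgetc(f)) != EOF)   /* copy the map to stdout */
            putchar(ch);
        fclose(f);
        return 0;
    }

Run it twice and many regions will usually sit at different virtual addresses each run; where the backing physical pages live, or whether they are swapped out, is invisible here entirely.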
I am learning the concept of virtual memory, but this question has been confusing me for a while. Since most modern computers use virtual memory, when a program is executing, the OS is supposed to page data in and out between RAM and disk. So why do we still encounter "out of memory" issues? Could you please correct me if I have misunderstood the concept? I really appreciate your explanation.
PS: For example, I was analyzing a large amount of data (>100GB) output from a simulation on a computing cluster, reading the data into a C array. Very often the system crashed and reported a memory error.
First: Modern computers do indeed use virtual memory; however, there is no magic here. Memory is not created out of nothing. Virtual memory schemes typically allow a portion of the mass-storage subsystem (a.k.a. the hard disk) to be used to hold portions of the process that are (hopefully) less frequently used.
This technique allows processes to use more memory than is available as RAM. However, nothing is infinite. Eventually all RAM and hard-drive resources will be used up, and the process will get an out-of-memory error.
Second: It is not unheard of for operating systems to place a cap on the memory that a process may use. Hit that cap and, again, the process gets an out-of-memory error.
Even with virtual memory the memory available is not unlimited.
Limit 1) Architectural limits. The processor and operating system will place some maximum virtual memory limit.
Limit 2) System Parameters. Many operating systems configure the maximum virtual memory size.
Limit 3) Process quotas. Many operating systems have process quotas that limit the maximum virtual memory size (a sketch of checking one such quota follows this list).
Limit 4) System resources. Notably page file space.
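On POSIX systems one of those quotas is visible through getrlimit(): RLIMIT_AS caps the total virtual address space of the process, and exceeding it makes mmap/brk (and therefore malloc) fail. A small sketch, assuming Linux:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void) {
        struct rlimit rl;
        /* RLIMIT_AS: quota on this process's total virtual address space;
           RLIM_INFINITY (a huge value) means no quota is set */
        if (getrlimit(RLIMIT_AS, &rl) == 0)
            printf("virtual memory quota: soft=%llu hard=%llu bytes\n",
                   (unsigned long long)rl.rlim_cur,
                   (unsigned long long)rl.rlim_max);
        return 0;
    }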
Consider the following CPU instruction which takes the memory at address 16777386 (decimal) and stores it in Register 1:
Move &0x010000AA, R1
Traditionally, programs are translated to machine code at compile time. (Let's ignore more complex modern systems like JIT compilation.)
However, if this address allocation is completed statically at compile time, how does the OS ensure that two processes do not use the same memory (e.g. if you ran the same compiled program twice, concurrently)?
Question:
How, and when, does a program get its memory addresses assigned?
Virtual Memory:
I understand most (if not all) modern systems use memory management units in hardware to support virtual memory, with the upper bits of an address selecting which page is referenced. This would allow for memory protection if each process used different pages. However, if this is how memory protection is enforced, the original question still stands, only this time: how are page numbers assigned?
EDIT:
CPU:
One possibility is that the CPU handles memory protection by requiring the OS to assign a process ID before memory-based instructions execute. However, this is only speculation, and it would require hardware support from the CPU architecture, something I'm not sure RISC ISAs would be designed to do.
With virtual memory, each process has a separate address space, so 0x010000AA in one process will refer to a different value than in another process.
Address spaces are implemented with kernel-controlled page tables that the processor uses to translate virtual page addresses to physical ones. Two processes using the same page number is not an issue, since the processes have separate page tables, and the physical memory mapped behind them can differ.
Usually executable code and global variables will be mapped statically, the stack will be mapped at a random address (some exploits are more difficult that way), and dynamic allocation routines will use syscalls to map more pages.
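You can observe that layout from a process itself. The sketch below (assuming a typical Linux toolchain) prints one address from each region; run the binary twice and the stack and heap addresses normally change between runs thanks to the randomization mentioned above, while a non-PIE build keeps code and globals at fixed addresses.

    #include <stdio.h>
    #include <stdlib.h>

    int global;                 /* static data, placed by the linker/loader */

    int main(void) {
        int local;                                /* lives on the stack */
        int *dynamic = malloc(sizeof *dynamic);   /* heap, mapped on demand */

        printf("code:   %p\n", (void *)main);     /* executable code */
        printf("global: %p\n", (void *)&global);
        printf("stack:  %p\n", (void *)&local);
        printf("heap:   %p\n", (void *)dynamic);

        free(dynamic);
        return 0;
    }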
(Ignoring the Unix fork) The initial state of a process's memory is set up by the executable loader. The linker defines the initial memory state and the loader creates it. That state usually includes memory for static data, executable code, writeable data, and the stack.
In most systems a process can modify the address space by adding pages (possibly removing them as well).
[Ignoring system addresses] In virtual (logical) memory systems each process has an address space starting at zero (usually the first page is not mapped). The address space is divided into pages. The operating system maps (and remaps) logical pages to physical pages.
Address 0x010000AA in one process is then a different physical memory address in each process.