I use recon_alloc:memory(allocated_types) and get info like below.
34> recon_alloc:memory(allocated_types).
[{binary_alloc,1546650440},
{driver_alloc,21504840},
{eheap_alloc,28704768840},
{ets_alloc,526938952},
{fix_alloc,145359688},
{ll_alloc,403701800},
{sl_alloc,688968},
{std_alloc,67633992},
{temp_alloc,21504840}]
eheap_alloc is using 28 GB. But when I sum the heap_size of all processes:
>lists:sum([begin {_, X}=process_info(P, heap_size), X end || P <- processes()]).
683197586
I get only 683M! Any idea where the 28 GB went?
You are not comparing the right values. From the erlang:process_info documentation:
{heap_size, Size}
Size is the size in words of youngest heap generation of the
process. This generation currently include the stack of the process.
This information is highly implementation dependent, and may change if
the implementation change.
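recon_alloc:memory(allocated_types) reports bytes by default (you can change the unit with set_unit), whereas heap_size is in words. On a 64-bit VM a word is 8 bytes, so 683197586 words is roughly 5.5 GB, and that covers only the youngest heap generation of each process. In addition, the allocated figure is not the memory currently in use; it is the memory reserved by the VM, grouped by allocator. You can use recon_alloc:memory(used) instead. See the allocator() section of the Recon library documentation for more details.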
Searching through the Erlang source code for the eheap_alloc keyword I didn't come up with much. The most relevant piece of code was this XML code from erts_alloc.xml (https://github.com/erlang/otp/blob/172e812c491680fbb175f56f7604d4098cdc9de4/erts/doc/src/erts_alloc.xml#L46):
<tag><c>eheap_alloc</c></tag>
<item>Allocator used for Erlang heap data, such as Erlang process heaps.</item>
This says that process heaps are stored in eheap_alloc, but it doesn't say what else is stored there. eheap_alloc holds the Erlang heap data your application needs to run, plus some additional space so the VM doesn't have to request more memory from the OS every time something needs to be added. There are also things the VM must keep in memory that aren't associated with a specific process. For example, large binaries, even though they may be used within a process, are not stored inside that process's heap. They are stored in a shared binary heap managed by binary_alloc, which recon_alloc reports as a separate entry next to eheap_alloc.
In your case it looks like you also have a lot of memory in binary_alloc (about 1.5 GB). That is worth investigating, but since it is listed separately it does not by itself account for the 28 GB in eheap_alloc.
For more details on binary handling, check out these pages:
http://blog.bugsense.com/post/74179424069/erlang-binary-garbage-collection-a-love-hate
http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id65224
I am working on a course homework assignment about the sysfs virtual file system in the Linux kernel. As part of setting up sysfs, the Linux kernel organizes physical memory into blocks, and further into sections, under the directory /sys/devices/system/memory. In that directory, memory chunks are represented as memory0, memory1, memory2, etc.
After digging through the Linux kernel, I found that memory is split into 128MB blocks and then further into sections, and I found the code that does this in the C file here: Memory.c. In that file, the function memory_dev_init() has the logic for splitting memory into blocks and dividing those into sections (or that's what I understood :) ). According to my professor, memory in Linux is split up into ranks, and ranks contain interleaved memory addresses as shown below:
rank0: [0-512KB] [2048KB-2560KB] [4096KB-4608KB] ...
rank1: [512KB-1024KB] [2560KB-3072KB] [4608KB-5120KB] ...
rank2: [1024KB-1536KB] [3072KB-3584KB] [5120KB-...
rank3: [1536KB-2048KB] [3584KB-4096KB] ...
As part of my homework, I want to change the rank layout to the following, so that I get contiguous memory blocks:
rank0: [0-512KB] [512KB-1024KB] [1024KB-1536KB]...
rank1: [1536KB-2048KB] [2048KB-2560KB] [2560KB-3072KB]...
rank2: [3072KB-3584KB] [3584KB-4096KB] [4096KB-4608KB]...
rank3: [4608KB-5120KB] ...
So I just want to know where exactly this memory interleaving and rank assignment happens in the current Linux kernel. Could anyone please point me in the right direction?
I'm not quite sure, as I don't see a practical use for this; it is indeed a sort of academic exercise. That said, what you are trying to achieve should be achievable by disabling memory interleaving entirely. I guess that after you disable interleaving you will see the proper "picture" in sysfs as well.
In other words: no coding required, just a configuration change.
Have a look at the memory interleave settings in the BIOS. Here's a post which describes how to do this on a couple of platforms.
There was a question here about stack growth direction, to which Michael Burr replied that on ARM processors the stack growth direction can be configured: either descending (the normal behaviour), where the stack grows towards lower addresses in memory, or ascending, where the stack grows towards higher addresses.
What is the direction of stack growth in most modern systems?
My question is: in ARM processors, how can I make the stack grow in ascending direction?
How do I configure the stack as ascending, given that it is descending by default? Is there a register bit to set or reset, or something similar?
Well, the ARM processors don't maintain a stack directly, but they do have instructions that are designed with that in mind: LDM and STM. So if you use STMDB at the start of a function and LDMIA at the end, you effectively have a full descending stack; the assemblers I remember using allowed you to write "STMFD" and "LDMFD" as aliases. (A "full" stack is one where the stack pointer points to the latest word on the stack, as opposed to the next location to use.)
So it's not something you can simply reconfigure at runtime, although if you were writing your own operating system with its own calling convention, you could choose to use an ascending stack. Similarly, you could also choose not to use R13 as the stack pointer; that's just part of the calling convention too. This choice effectively gets embedded into the implementation of every function that uses the stack.
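Because the direction is baked into the compiled code, one way to see which convention a given toolchain uses is to compare the address of a local variable in a caller with one in its callee. The following is only a rough sketch: the function names are made up for illustration, the noinline attribute assumes GCC or Clang, and the result relies on the compiler placing both locals in ordinary stack frames, which the C++ standard does not guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical probe: returns true if a local in this (callee) frame sits at a
// lower address than the local whose address the caller passed in.
// The noinline attribute is GCC/Clang-specific (an assumption of this sketch).
__attribute__((noinline)) bool callee_is_lower(std::uintptr_t caller_addr) {
    assert(caller_addr != 0);  // sanity check on the value handed in
    volatile char marker = 0;
    return reinterpret_cast<std::uintptr_t>(&marker) < caller_addr;
}

// Returns true when nested calls appear to use lower addresses, i.e. the
// generated code behaves like a descending stack.
bool stack_grows_down() {
    volatile char marker = 0;
    bool down = callee_is_lower(reinterpret_cast<std::uintptr_t>(&marker));
    marker = 1;  // touch the local after the call so this frame stays live
    return down;
}
```

Calling stack_grows_down() from main and printing the result would be expected to report a descending stack on toolchains that follow the usual ARM or x86 calling conventions, but treat that as an observation about one build rather than a guarantee.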
The __user_initial_stackheap() function lets you set the SP through the stack start and stack end values, and relocate the heap through the heap start and heap end values. It is used during initialization, since the ARM runtime uses it to place the stack and heap.
You also have the option of using a single-region or a two-region memory model, depending on your requirements. I used this API when I was writing use cases for the ARM926EJ-S.
This document was of help during my development and might be helpful to you as well.
Hope this helps.
-hjsblogger
Hmmm, Thumb/Thumb-2 might limit you to push/pop, and with Thumb-2-only ARMs out there I don't know that we can generically say you can go both ways. With traditional ARM instructions, yes, you can use ldmia or ldmdb (increment after or decrement before) and stmdb and stmia. How do you make a C compiler, for example, climb up in addresses instead of down automatically? I don't know.
It is like big endian on ARM: just because you can, you probably don't want to, because of the headaches it brings along with it.
How is a program (e.g. C or C++) arranged in computer memory? I kind of know a little about segments, variables etc, but basically I have no solid understanding of the entire structure.
Since the in-memory structure may differ, let's assume a C++ console application on Windows.
Some pointers to what I'm after specifically:
Outline of a function, and how is it called?
Each function has a stack frame, what does that contain and how is it arranged in memory?
Function arguments and return values
Global and local variables?
const static variables?
Thread local storage..
Links to tutorial-like material and such is welcome, but please no reference-style material assuming knowledge of assembler etc.
Might this be what you are looking for:
http://en.wikipedia.org/wiki/Portable_Executable
The PE file format is the binary file structure of Windows binaries (.exe, .dll etc). Basically, they are mapped into memory like that. More details are described here, with an explanation of how you can take a look at the binary representation of loaded DLLs in memory yourself:
http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
Edit:
Now I understand that you want to learn how source code relates to the binary code in the PE file. That's a huge field.
First, you have to understand the basics about computer architecture which will involve learning the general basics of assembly code. Any "Introduction to Computer Architecture" college course will do. Literature includes e.g. "John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach" or "Andrew Tanenbaum, Structured Computer Organization".
After reading this, you should understand what a stack is and its difference to the heap. What the stack-pointer and the base pointer are and what the return address is, how many registers there are etc.
Once you've understood this, it is relatively easy to put the pieces together:
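A C++ object contains data, i.e., member variables; the code of its member functions is stored separately and shared by all instances. A class
class SimpleClass {
public:
    int m_nInteger;
    double m_fDouble;
    double SomeFunction() { return m_nInteger + m_fDouble; }
};
holds 4 + 8 bytes of data. Depending on the compiler and platform, padding after m_nInteger may round the object up to 16 bytes; the walk-through below ignores padding. What happens when you do: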
SimpleClass c1;
c1.m_nInteger = 1;
c1.m_fDouble = 5.0;
c1.SomeFunction();
First, object c1 is created on the stack, i.e., the stack pointer esp is decreased by the size of the object (12 bytes in this simplified picture) to make room. Then the constant "1" is written to the address 12 bytes below the old esp and the constant "5.0" to the address 8 bytes below the old esp.
Then we call a function, which means two things:
The computer has to load the part of the binary PE file into memory that contains function SomeFunction(). SomeFunction will only be in memory once, no matter how many instances of SimpleClass you create.
The computer has to execute function SomeFunction(). That means several things:
Calling the function also implies passing all parameters, often this is done on the stack. SomeFunction has one (!) parameter, the this pointer, i.e., the pointer to the memory address on the stack where we have just written the values "1" and "5.0"
Save the current program state, i.e., the current instruction address which is the code address that will be executed if SomeFunction returns. Calling a function means pushing the return address on the stack and setting the instruction pointer (register eip) to the address of the function SomeFunction.
Inside function SomeFunction, the old stack is saved by storing the old base pointer (ebp) on the stack (push ebp) and making the stack pointer the new base pointer (mov ebp, esp).
The actual binary code of SomeFunction is executed, which includes the machine instructions that convert m_nInteger to a double and add it to m_fDouble. m_nInteger and m_fDouble are reached through the this pointer, which points to c1 in the caller's stack frame.
The result of the addition is stored in a register and the function returns. That means the stack is discarded which means the stack pointer is set back to the base pointer. The base pointer is set back (next value on the stack) and then the instruction pointer is set to the return address (again next value on the stack). Now we're back in the original state but in some register lurks the result of the SomeFunction().
I suggest you build yourself such a simple example and step through the disassembly. In a debug build the code will be easy to understand, and Visual Studio displays variable names in the disassembly view. See what the registers esp, ebp and eip do, where in memory your object is allocated, where the code is, etc.
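Before stepping through the disassembly, it can help to check how your compiler actually lays out the object. The sketch below uses a struct with the same members as SimpleClass (public so that offsetof is well defined) and a made-up helper, describe_layout, that prints the size and member offsets. The numbers depend on the compiler and target, so none are claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Same members as SimpleClass above, declared public for inspection.
struct SimpleClass {
    int m_nInteger;
    double m_fDouble;
    double SomeFunction() { return m_nInteger + m_fDouble; }
};

// Hypothetical helper: prints size and member offsets, returns the size.
std::size_t describe_layout() {
    // The first member of a standard-layout type always starts at offset 0.
    assert(offsetof(SimpleClass, m_nInteger) == 0);
    std::printf("sizeof(SimpleClass)  = %zu\n", sizeof(SimpleClass));
    std::printf("offset of m_nInteger = %zu\n", offsetof(SimpleClass, m_nInteger));
    std::printf("offset of m_fDouble  = %zu\n", offsetof(SimpleClass, m_fDouble));
    return sizeof(SimpleClass);
}
```

If the printed size is larger than 12, the difference is padding inserted so that m_fDouble is suitably aligned. Note that SomeFunction does not contribute to the size at all, since its code is not stored in the object.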
What a huge question!
First you want to learn about virtual memory. Without that, nothing else will make sense. In short, C/C++ pointers are not physical memory addresses. Pointers are virtual addresses. There's a special CPU feature (the MMU, memory management unit) that transparently maps them to physical memory. Only the operating system is allowed to configure the MMU.
This provides safety (there is no C/C++ pointer value you can possibly make that points into another process's virtual address space, unless that process is intentionally sharing memory with you) and lets the OS do some really magical things that we now take for granted (like transparently swap some of a process's memory to disk, then transparently load it back when the process tries to use it).
A process's address space (a.k.a. virtual address space, a.k.a. addressable memory) contains:
a huge region of memory that's reserved for the Windows kernel, which the process isn't allowed to touch;
regions of virtual memory that are "unmapped", i.e. nothing is loaded there, there's no physical memory assigned to those addresses, and the process will crash if it tries to access them;
the various modules (EXE and DLL files) that have been loaded (each of these contains machine code, string constants, and other data); and
whatever other memory the process has allocated from the system.
Now typically a process lets the C Runtime Library or the Win32 libraries do most of the super-low-level memory management, which includes setting up:
a stack (for each thread), where local variables and function arguments and return values are stored; and
a heap, where memory is allocated if the process calls malloc or does new X.
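To get a feel for these regions, a small program can collect the addresses of a function, a global, a string constant, a local variable and a heap allocation. This is only an illustrative sketch: the names are invented, converting a function pointer to an integer is only conditionally supported by the standard (it works on mainstream compilers), and the actual values differ from run to run.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

int g_counter = 42;               // global: part of the module's data
const char* g_text = "constant";  // string constant: stored inside the module

// Addresses of one item from each kind of region, as plain integers.
struct Addresses {
    std::uintptr_t code, global, literal, stack, heap;
};

// Hypothetical helper that samples one address per region.
Addresses sample_addresses() {
    int local = 0;  // lives in this thread's stack
    int* on_heap = static_cast<int*>(std::malloc(sizeof(int)));  // from the heap
    assert(on_heap != nullptr);
    Addresses a{
        reinterpret_cast<std::uintptr_t>(&sample_addresses),
        reinterpret_cast<std::uintptr_t>(&g_counter),
        reinterpret_cast<std::uintptr_t>(g_text),
        reinterpret_cast<std::uintptr_t>(&local),
        reinterpret_cast<std::uintptr_t>(on_heap)};
    std::free(on_heap);
    return a;
}
```

Printing the fields in hexadecimal usually shows the code, global and literal addresses clustered together (they all come from the loaded module), with the stack and heap addresses in other parts of the address space.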
For more about how the stack is structured, read about calling conventions. For more about how the heap is structured, read about malloc implementations. In general the stack really is a stack, a last-in-first-out data structure, containing arguments, local variables, and the occasional temporary result, and not much more. Since it is easy for a program to write straight past the end of the stack (the common C/C++ bug after which this site is named), the system libraries typically make sure that there is an unmapped page adjacent to the stack. This makes the process crash instantly when such a bug happens, so it's much easier to debug (and the process is killed before it can do any more damage).
The heap is not really a heap in the data structure sense. It's a data structure maintained by the CRT or Win32 library that takes pages of memory from the operating system and parcels them out whenever the process requests small pieces of memory via malloc and friends. (Note that the OS does not micromanage this; a process can to a large extent manage its address space however it wants, if it doesn't like the way the CRT does it.)
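The "takes a large block and parcels it out" idea can be sketched with a toy allocator. This is not how the CRT heap is implemented; it is a deliberately simplified bump allocator with an invented name (Arena), it uses malloc as a stand-in for requesting pages from the OS, and it never reuses individual pieces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical toy allocator: obtains one big block up front and hands out
// consecutive slices of it. Real heaps also track and reuse freed pieces.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<char*>(std::malloc(capacity))),
          capacity_(capacity), used_(0) {
        assert(capacity > 0);
    }
    ~Arena() { std::free(base_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Returns `size` bytes aligned for any basic type, or nullptr when the
    // block cannot satisfy the request.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (base_ == nullptr || start > capacity_ || size > capacity_ - start) {
            return nullptr;
        }
        used_ = start + size;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

For example, an Arena created with a capacity of 64 bytes can serve two 10-byte requests from the same block and returns nullptr for a 100-byte request, at which point a real heap would ask the operating system for more pages.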
A process can also request pages directly from the operating system, using an API like VirtualAlloc or MapViewOfFile.
There's more, but I'd better stop!
For understanding stack frame structure you can refer to
http://en.wikipedia.org/wiki/Call_stack
It gives you information about the structure of the call stack and how locals and return addresses are stored on it.
Another good illustration
http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.pdf
It might not be the most accurate information, but MS Press provides some sample chapters of the book Inside Microsoft® Windows® 2000, Third Edition, containing information about processes and their creation along with images of some important data structures.
I also stumbled upon this PDF that summarizes some of the above information in a nice chart.
But all the provided information is more from the OS point of view and not very detailed about the application aspects.
Actually, you won't get far in this matter without at least a little bit of knowledge of assembler. I'd recommend a reversing (tutorial) site, e.g. OpenRCE.org.
I have just started learning Erlang and am trying out some Project Euler problems to get started. However, I seem to be unable to do any operations on large sequences without crashing the Erlang shell.
I.e., even this:
lists:seq(1,64000000).
crashes Erlang, with the error:
eheap_alloc: Cannot allocate 467078560 bytes of memory (of type "heap").
Actually # of bytes varies of course.
Now half a gig is a lot of memory, but a system with 4 gigs of RAM and plenty of space for virtual memory should be able to handle it.
Is there a way to let erlang use more memory?
Your OS may have a default limit on the size of a user process. On Linux you can change this with ulimit.
You probably want to iterate over these 64000000 numbers without needing them all in memory at once. Lazy lists let you write code similar in style to the list-all-at-once code:
-module(lazy).
-export([seq/2]).

seq(M, N) when M =< N ->
    fun() -> [M | seq(M+1, N)] end;
seq(_, _) ->
    fun () -> [] end.
1> Ns = lazy:seq(1, 64000000).
#Fun<lazy.0.26378159>
2> hd(Ns()).
1
3> Ns2 = tl(Ns()).
#Fun<lazy.0.26378159>
4> hd(Ns2()).
2
Possibly a noob answer (I'm a Java dev), but the JVM artificially limits the amount of memory to help detect memory leaks more easily. Perhaps erlang has similar restrictions in place?
This is a feature. We do not want one process to consume all memory. It is like the fuse box in your house: it is there for the safety of us all.
You have to know Erlang's recovery model to understand why they let the process just die.
Also, both Windows and Linux have limits on the maximum amount of memory an image can occupy.
As I recall, on Linux it is half a gigabyte.
The real question is why these operations aren't being done lazily ;)