According to the WWDC talk iOS Memory Deep Dive (https://developer.apple.com/videos/play/wwdc2018/416), memory footprint equals dirty size plus swapped size. However, when I use vmmap -summary [memgraph] to inspect my memgraphs, the numbers often don't add up. In this particular memgraph, the memory footprint is 118.5M while the dirty size is 123.3M. How can the footprint be smaller than the dirty size?
In the same WWDC talk, it's mentioned that heap --sortBySize [memgraph] can be used to inspect heap allocations. From my memgraph, the heap size is about 55M (All zones: 110206 nodes (55357344 bytes)), which is much smaller than the MALLOC regions in the vmmap output. Doesn't malloc allocate space in the heap?
Related
Can I say 'maximum resident set size of a process' equals 'the minimum RAM size required by the process'?
Am I right? If not, why?
Of sorts, yes. However, the OS may over-allocate memory to the process, or under-allocate (and therefore use swap space). At any rate, it is a good approximation.
See Peak memory usage of a linux/unix process
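For reference, a process can read its own peak RSS from inside the process; here is a minimal Python sketch using the standard-library resource module (Unix-only; note that ru_maxrss is reported in kilobytes on Linux but in bytes on macOS):

```python
import resource

# Peak resident set size so far (kilobytes on Linux, bytes on macOS).
before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

# Touch ~50 MB of memory; the peak can only stay the same or grow.
blob = bytearray(50 * 1024 * 1024)
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

assert after >= before > 0
print(before, after)
```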
They seem quite similar to me. So what's the relation between the slab and the buddy system?
A slab is a collection of objects of the same size. It avoids fragmentation by allocating a fairly large block of memory and dividing it into equal-sized pieces. The number of pieces is typically much larger than two, say 128 or so.
There are two ways you can use slabs. First, you could have a slab just for one size that you allocate very frequently. For example, a kernel might have an inode slab. But you could also have a number of slabs in progressive sizes, like a 128-byte slab, a 192-byte slab, a 256-byte slab, and so on. You can then allocate an object of any size from the next slab size up.
Note that in neither case does a slab re-use memory for an object of a different size unless the entire slab is freed back to a global "large block" allocator.
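The single-size case above can be sketched as a toy in Python (the class and its bookkeeping are illustrative, not kernel code): one large block is carved up exactly once, and freed pieces just go back on a free list for reuse at the same size.

```python
class Slab:
    """Toy slab: one large block carved once into equal-sized pieces."""

    def __init__(self, object_size: int, num_objects: int):
        self.object_size = object_size
        self.memory = bytearray(object_size * num_objects)  # one big allocation
        self.free_list = [i * object_size for i in range(num_objects)]

    def alloc(self) -> int:
        """Return the offset of a free piece, or raise if the slab is full."""
        if not self.free_list:
            raise MemoryError("slab exhausted")
        return self.free_list.pop()

    def free(self, offset: int) -> None:
        """Return a piece to the free list; no dividing or coalescing ever happens."""
        self.free_list.append(offset)


slab = Slab(object_size=128, num_objects=4)
a = slab.alloc()
b = slab.alloc()
slab.free(a)
c = slab.alloc()   # reuses a's piece, always at the same size
assert c == a
```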
The buddy system is an unrelated method where each object has a "buddy" object which it is coalesced with when it is freed. Blocks are divided in half when smaller blocks are needed. Note that in the buddy system, blocks are divided and coalesced into larger blocks as the primary means of allocation and returning for re-use. This is very different from how slabs work.
Or to put it more simply:
Buddy system: Various sized blocks are divided when allocated and coalesced when freed to efficiently divide a big block into smaller blocks of various sizes as needed.
Slab: Very large blocks are allocated and divided once into equal-sized blocks. No other dividing or coalescing takes place and freed blocks are just held in a list to be assigned to subsequent allocations.
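The buddy behavior above can likewise be made concrete with a toy allocator in Python; the key property is that a block's buddy address is its own offset with one bit flipped (offset XOR size). This is an illustrative sketch, not a production allocator:

```python
class BuddyAllocator:
    """Toy buddy allocator over a power-of-two arena (illustrative only).

    free_lists[k] holds offsets of free blocks of size 2**k.
    """

    def __init__(self, total_size: int):
        assert total_size & (total_size - 1) == 0, "arena must be a power of two"
        self.max_order = total_size.bit_length() - 1
        self.free_lists = {k: [] for k in range(self.max_order + 1)}
        self.free_lists[self.max_order].append(0)  # start with one big free block

    @staticmethod
    def _order(size: int) -> int:
        # Round the request up to the nearest power of two.
        return (size - 1).bit_length() if size > 1 else 0

    def alloc(self, size: int) -> int:
        order = self._order(size)
        k = order
        while k <= self.max_order and not self.free_lists[k]:
            k += 1                                # no block this small: find a bigger one
        if k > self.max_order:
            raise MemoryError("out of memory")
        offset = self.free_lists[k].pop()
        while k > order:                          # split in half, keeping the lower half
            k -= 1
            self.free_lists[k].append(offset + (1 << k))
        return offset

    def free(self, offset: int, size: int) -> None:
        order = self._order(size)
        while order < self.max_order:
            buddy = offset ^ (1 << order)         # the buddy differs in exactly one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)  # buddy is free too: coalesce upward
            offset = min(offset, buddy)
            order += 1
        self.free_lists[order].append(offset)


arena = BuddyAllocator(1024)
x = arena.alloc(100)          # rounded up to a 128-byte block
y = arena.alloc(100)
arena.free(x, 100)
arena.free(y, 100)
assert arena.free_lists[10] == [0]   # fully coalesced back into one 1K block
```

Note how freeing both allocations coalesces everything back into a single 1K block, the opposite of the slab's hold-freed-pieces-in-a-list behavior.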
The Linux kernel's core allocator is a flexible buddy system allocator. This allocator provides the slabs for the various slab allocators.
In general, a slab allocator is a list of fixed-size slabs, each suited to holding elements of a predefined size. Since all objects in a pool are the same size, there is no fragmentation.
A buddy allocator divides memory into chunks whose sizes double: for example, if the minimum chunk is 1K, the next is 2K, then 4K, and so on. So a request for 100 bytes will be served from a 1K chunk. This leads to internal fragmentation, but it allows allocating objects of arbitrary size (so it's well suited to user memory allocations, where objects can be of any size).
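The rounding described above is easy to make concrete (the function name and the min_chunk default are illustrative):

```python
def buddy_chunk_size(request: int, min_chunk: int = 1024) -> int:
    """Smallest power-of-two chunk (at least min_chunk) that fits the request."""
    size = min_chunk
    while size < request:
        size *= 2
    return size

assert buddy_chunk_size(100) == 1024    # 100 bytes gets a 1K chunk: 924 bytes wasted
assert buddy_chunk_size(1500) == 2048
assert buddy_chunk_size(4096) == 4096   # exact powers of two waste nothing
```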
See also:
https://en.wikipedia.org/wiki/Slab_allocation
https://en.wikipedia.org/wiki/Buddy_memory_allocation
It's also worth checking this presentation: http://events.linuxfoundation.org/images/stories/pdf/klf2012_kim.pdf
The slides from page 22 onward summarize the differences.
In the Chrome task manager there is a column called GPU memory. In GPU-Z, I can see the memory size of the video card, which I suppose is the video memory. Is that the same as GPU memory?
Yes, that is the same as GPU memory.
The only exception is that some lower-end computers use a technique called shared graphics memory, in which an integrated graphics chip uses part of the system RAM as video memory. With a non-integrated (discrete) graphics card like yours, this is not the case.
My team is running into an issue where the amount of texture memory allocated via glTexImage2D is high enough to crash the app (at about 400 MB on an iPhone 5). We're taking steps to minimize texture allocation (compression, fewer bits per channel, procedural shaders for VFX, etc.).
Since the app crashed in glTexImage2D, my feeling is that it's running out of texture memory (as opposed to virtual memory). Is there any documentation or guideline on recommended texture memory usage for an app (beyond just "optimize your texture memory")?
AFAIK, on iOS devices (and many Android devices) there's no dedicated VRAM, and our app's process is still well within the virtual memory limit. Is this somehow related to the size of physical RAM? My searches so far have only turned up info on maximum texture sizes and tricks for optimizing texture usage. Any information is appreciated.
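For context, here is the rough arithmetic we use to estimate per-texture cost (the 4 bytes per pixel for RGBA8 and the extra-third mipmap overhead are assumptions, since 1/4 + 1/16 + ... converges to 1/3):

```python
def texture_bytes(width: int, height: int, bytes_per_pixel: int = 4,
                  mipmapped: bool = True) -> int:
    """Rough footprint of one uncompressed texture uploaded via glTexImage2D:
    width * height * bytes_per_pixel, plus about an extra third for a full
    mipmap chain (1/4 + 1/16 + ... ~= 1/3 of the base level)."""
    base = width * height * bytes_per_pixel
    return base + base // 3 if mipmapped else base

# A single 2048x2048 RGBA8 texture with mipmaps:
print(texture_bytes(2048, 2048) / 2**20)   # ~21.3 MiB
```

At that rate, fewer than twenty full-size uncompressed textures already approach the 400 MB level where we see the crash, which is why compression and fewer bits per channel help so much.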
Suppose a small computer system has 4 MB of main memory, managed in fixed-size frames. A frames table maintains the status of each frame in memory. How large (how many bytes) should a frame be? You have a choice of one of the following: 1K, 5K, or 10K bytes. Which of these choices will minimize the total space wasted by processes due to fragmentation and frames table storage?
Assume the following: On the average, 10 processes will reside in memory. The average amount of wasted space will be 1/2 frame for each process.
The frames table must have one entry for each frame. Each entry requires 10 bytes.
Here is my answer:
1K would minimize fragmentation: as is well known, a small frame size means a bigger table but less wasted space per process.
With 10 processes, about half a frame is wasted per process.
Am I on the right track?
You're weighing the right factors, but check the arithmetic. Total waste is 10 × (frame size / 2) of fragmentation plus (4 MB / frame size) × 10 bytes of frames table: about 45 KB with 1K frames, 33 KB with 5K frames, and 54 KB with 10K frames. So of the three choices, 5K actually minimizes the total; the lower fragmentation of 1K frames is outweighed by the much larger table. The same trade-off shows up in real systems. On x86-64, where the page-size options are 4 KB, 2 MB, and 1 GB, and memory sizes are around 4 GB, 1 GB pages obviously make no sense; but because most programs contain quite a bit of compiled code (or, for interpreted and VM languages, all of the VM's code), 2 MB pages can make sense. In other words, you have to consider the average memory usage of a program, the number of programs, and above all the ratio of average fragmentation to page-table size: small pages give low fragmentation, but 4 KB pages over 4 GB of memory make for a very large page table.
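Plugging the stated assumptions into a quick script makes the trade-off concrete (all constants come from the problem statement):

```python
MEMORY = 4 * 1024 * 1024   # 4 MB of main memory
PROCESSES = 10             # average number of resident processes
ENTRY = 10                 # bytes per frames-table entry

def total_waste(frame_size: int) -> float:
    """Fragmentation (half a frame per process) plus frames-table storage."""
    fragmentation = PROCESSES * frame_size / 2
    table = (MEMORY / frame_size) * ENTRY
    return fragmentation + table

for frame_size in (1024, 5 * 1024, 10 * 1024):
    print(f"{frame_size // 1024:2d}K: {total_waste(frame_size) / 1024:.1f} KB wasted")
# Prints 45.0 KB for 1K, 33.0 KB for 5K, 54.0 KB for 10K:
# among the three choices, 5K gives the smallest total.
```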