I'm attempting to do an exercise from "Expert C Programming" where the point is to see how much memory a program can allocate. It hinges on malloc returning NULL when it cannot allocate anymore.
#include <stdio.h>
#include <stdlib.h>
int main() {
int totalMB = 0;
int oneMeg = 1<<20;
while (malloc(oneMeg)) {
++totalMB;
}
printf("Allocated %d Mb total \n", totalMB);
return 0;
}
Rather than printing the total, I get a kernel panic after allocating roughly 8 GB on my 16 GB MacBook Pro.
Kernel panic log:
Anonymous UUID: 0B87CC9D-2495-4639-EA18-6F1F8696029F
Tue Dec 13 23:09:12 2016
*** Panic Report ***
panic(cpu 0 caller 0xffffff800c51f5a4): "zalloc: zone map exhausted while allocating from zone VM map entries, likely due to memory leak in zone VM map entries (6178859600 total bytes, 77235745 elements allocated)"#/Library/Caches/com.apple.xbs/Sources/xnu/xnu-3248.50.21/osfmk/kern/zalloc.c:2628
Backtrace (CPU 0), Frame : Return Address
0xffffff91f89bb960 : 0xffffff800c4dab12
0xffffff91f89bb9e0 : 0xffffff800c51f5a4
0xffffff91f89bbb10 : 0xffffff800c5614e0
0xffffff91f89bbb30 : 0xffffff800c5550e2
0xffffff91f89bbba0 : 0xffffff800c554960
0xffffff91f89bbd90 : 0xffffff800c55f493
0xffffff91f89bbea0 : 0xffffff800c4d17cb
0xffffff91f89bbf10 : 0xffffff800c5b8dca
0xffffff91f89bbfb0 : 0xffffff800c5ecc86
BSD process name corresponding to current thread: a.out
Mac OS version:
15F34
I understand that this can easily be fixed by the doctor's cliché of "It hurts when you do that? Then don't do that," but I want to understand why malloc isn't behaving as expected.
OS X 10.11.5
For the definitive answer to that question, you can look at the source code, which you'll find here:
zalloc.c source in XNU
In that source file find the function zalloc_internal(). This is the function that gives the kernel panic.
In the function you'll find a "for (;;) {" loop, which basically tries to allocate the memory you're requesting in the specified zone. If there isn't enough space, it immediately tries again. If that fails, it does a zone_gc() (garbage collection) to try to reclaim memory. If that also fails, it simply kernel panics, effectively halting the computer.
If you want to understand how zalloc.c works, look up zone-based memory allocators.
Your program is making the kernel run out of space in the zone called "VM map entries", which is a predefined zone allocated at boot. You could probably get the result you are expecting from your program, without a kernel panic, if you allocated more than 1 MB at a time.
In essence it is not really a problem for the kernel to hand you several gigabytes of memory. However, handing them out as thousands of smaller allocations, each of which the kernel has to track, is much harder.
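For example, an (untested) variation of the exercise that grabs 1 GB at a time creates far fewer map entries and should let malloc() return NULL the way the book expects:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int totalGB = 0;
    size_t oneGig = (size_t)1 << 30;

    /* Far fewer allocations means far fewer "VM map entries" in the
       kernel, so the loop should end with malloc() returning NULL
       instead of exhausting the zone map. */
    while (malloc(oneGig)) {
        ++totalGB;
    }
    printf("Allocated %d GB total\n", totalGB);
    return 0;
}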
How do I find the amount of memory installed on my QNX Neutrino system?
uname -a doesn't show it
top only shows how much memory is available
I've looked at pidin syspage without success
pidin mem shows all the used memory in gory detail
pidin info will show the amount of memory installed, as shown below.
pidin info
CPU:X86 Release:6.4.1 FreeMem:836Mb/1015Mb BootTime:Jun 04 14:01:55 UTC 2014
The amount of free RAM is the size of /proc!
In your own program, you can write something like this:
#include <stdio.h>
#include <sys/stat.h>

struct stat buf;

if (stat("/proc", &buf) != -1) {
    printf("Mem free = %lld\n", (long long)buf.st_size);
}
--
Hope this helps,
Emmanuel
showmem -S will show the amount of RAM installed, as shown below:
showmem -S
System RAM: 1936M ( 2030043136)
Total Used: 401M ( 420642376)
Used Private: 317M ( 332529404)
Used Shared: 79M ( 83333120)
Other: 4667K ( 4779852) (includes IFS and reserved RAM)
I'm trying to find out how much free memory I have on the device. To do this I call the CUDA function cuMemGetInfo from Fortran code, but it returns negative values for the amount of free memory, so there's clearly something wrong.
Does anyone know how I can do that?
Thanks
EDIT:
Sorry, in fact my question was not very clear. I'm using OpenACC in Fortran and I call the CUDA function cudaMemGetInfo from C++. Finally I could fix the code; the problem was indeed the kind of the variables I was using. Switching to c_size_t fixed everything. This is the Fortran interface that I'm using:
interface
subroutine get_dev_mem(total,free) bind(C,name="get_dev_mem")
use iso_c_binding
integer(kind=c_size_t)::total,free
end subroutine get_dev_mem
end interface
and this is the CUDA code:
#include <cuda.h>
#include <cuda_runtime.h>
extern "C" {
void get_dev_mem(size_t& total, size_t& free)
{
cuMemGetInfo(&free, &total);
}
}
There's one last question: I pushed an array to the GPU and checked its size using cuMemGetInfo, then I computed its size by counting the number of bytes, but I don't get the same answer. Why? In the first case it is 3052 MB large, in the latter 3051 MB. Could this 1 MB difference be the size of the array descriptor? Here is the code that I used:
integer, parameter:: long = selected_int_kind(12)
integer(kind=c_size_t) :: total, free1,free2
real(8), dimension(:),allocatable::a
integer(kind=long)::N, eight, four
! (N, eight and four are assigned elsewhere; eight = 8, four = 4)
allocate(a(four*N))
!some OpenACC stuff in order to init the gpu
call get_dev_mem(total,free1)
!$acc data copy(a)
call get_dev_mem(total,free2)
print *,"size a in the gpu = ",(free1-free2)/1024/1024, " mb"
print *,"size a in theory = ", (eight*four*N)/1024/1024, " mb"
!$acc end data
deallocate(a)
Right, so, like commenters have suggested, we're not sure exactly what you're running, but filling in the missing details by guessing, here's a shot:
Most CUDA API calls return a status code (or error code if you will); this is true both in C/C++ and in Fortran, as we can see in the Portland Group's CUDA Fortran Manual:
Most of the runtime API routines are integer functions that return an error code; they return a value of zero if the call was successful, and a nonzero value if there was an error. To interpret the error codes, refer to “Error Handling,” on page 48.
This is the case for cudaMemGetInfo() specifically:
integer function cudaMemGetInfo( free, total )
integer(kind=cuda_count_kind) :: free, total
The two integers for free and total are of kind cuda_count_kind, which, if I am not mistaken, is effectively unsigned... anyway, I would guess that what you're getting is an error code. Have a look at the "Error Handling" section on page 48 of the manual.
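If you want to rule that out from the C/C++ side first, here's a minimal sketch using the runtime API; the point is simply that cudaMemGetInfo() returns a cudaError_t that should be checked before trusting free/total:

#include <stdio.h>
#include <cuda_runtime.h>

/* Minimal sketch (CUDA runtime API): check the status code that
   cudaMemGetInfo() returns before trusting free/total. */
int main(void)
{
    size_t free_bytes = 0, total_bytes = 0;
    cudaError_t status = cudaMemGetInfo(&free_bytes, &total_bytes);

    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n",
                cudaGetErrorString(status));
        return 1;
    }
    printf("free: %zu bytes, total: %zu bytes\n", free_bytes, total_bytes);
    return 0;
}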
What is the maximum amount of memory for a single process in UNIX, Linux and Windows? How do you calculate that? How much user address space and kernel address space is there for 4 GB of RAM?
How much user address space and kernel address space for 4 GB of RAM?
The address space of a process is divided into two parts:
User space: On the standard 32-bit x86 architecture, the maximum addressable memory is 4 GB, of which the addresses from 0x00000000 to 0xbfffffff (3 GB) are meant for the code and data segments. This region can be addressed whether the process is executing in user mode or kernel mode.
Kernel space: Similarly, the addresses from 0xc0000000 to 0xffffffff (1 GB) are meant for the virtual address space of the kernel and can only be addressed when the process executes in kernel mode.
This particular address space split on x86 is determined by the value of PAGE_OFFSET. Referring to the Linux 3.11.1 page_32_types.h and page_64_types.h, the page offset is defined as below:
#define __PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL)
where Kconfig defines a default value of 0xC0000000, with other address-split options also available.
Similarly for 64 bit,
#define __PAGE_OFFSET _AC(0xffff880000000000, UL).
On 64-bit architectures the 3G/1G split no longer holds, because the address space is huge; as the source shows, the latest Linux versions use the offset given above.
On my 64-bit x86_64 machine, a 32-bit process can have the entire 4 GB of user address space, and the kernel holds the address range above 4 GB. Interestingly, on modern 64-bit x86_64 CPUs not all address lines are enabled (i.e. the address bus is not wide enough) to provide 2^64 = 16 exabytes of virtual address space. The AMD64/x86 architectures apparently have the lower 48/42 bits enabled respectively, resulting in 2^48 = 256 TB / 2^42 = 4 TB of address space. This definitely improves performance with large amounts of RAM; at the same time the question arises of how it is all managed efficiently within the OS's limits.
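As a quick illustration of that limit, a throwaway sketch like the following (x86_64 Linux assumed) prints a user-space address, which will sit comfortably below 2^47, the top of the user half of the 48-bit canonical range:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *p = malloc(1); /* any ordinary user-space allocation */

    /* On x86_64 Linux the printed address sits below 2^47, because
       user space only gets the lower half of the 48-bit canonical
       virtual address range. */
    printf("pointer width: %zu bits, sample address: %p\n",
           8 * sizeof p, p);
    free(p);
    return 0;
}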
In Linux there's a way to find out the limit of address space you can have: the rlimit structure.
struct rlimit {
    rlim_t rlim_cur; // current (soft) limit
    rlim_t rlim_max; // ceiling for rlim_cur (hard limit)
};
rlim_t is an unsigned integer type.
and you can have something like:
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
//Bytes To GigaBytes
static inline unsigned long btogb(unsigned long bytes) {
return bytes / (1024 * 1024 * 1024);
}
//Bytes To ExaBytes
static inline double btoeb(double bytes) {
return bytes / (1024.00 * 1024.00 * 1024.00 * 1024.00 * 1024.00 * 1024.00);
}
int main() {
printf("\n");
struct rlimit rlim_addr_space;
rlim_t addr_space;
/*
* Here we call getrlimit(), with RLIMIT_AS (Address Space) and
* a pointer to our instance of rlimit struct.
*/
int retval = getrlimit(RLIMIT_AS, &rlim_addr_space);
// getrlimit() returns 0 if it succeeded, so let's check that.
if(!retval) {
addr_space = rlim_addr_space.rlim_cur;
fprintf(stdout, "Current address_space: %lu Bytes, or %lu GB, or %f EB\n", addr_space, btogb(addr_space), btoeb((double)addr_space));
} else {
fprintf(stderr, "Couldn't get the current address space limit.\n");
return 1;
}
return 0;
}
I ran this on my computer and... prrrrrrrrrrrrrrrrr tsk!
Output: Current address_space: 18446744073709551615 Bytes, or 17179869183 GB, or 16.000000 EB
I have 16 ExaBytes of max address space available on my Linux x86_64.
Here's getrlimit()'s definition
It also lists the other constants you can pass to getrlimit() and introduces getrlimit()'s sister, setrlimit(). That is when the rlim_max member of rlimit becomes really important: you should always check that you don't exceed that value, so the kernel doesn't punch your face, drink your coffee and steal your papers.
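For completeness, here is a small hypothetical sketch of that pattern: lowering the soft limit with setrlimit() while never exceeding rlim_max (the 1 GiB figure is just an example):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    rlim_t wanted = (rlim_t)1 << 30; /* hypothetical 1 GiB cap */

    if (getrlimit(RLIMIT_AS, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }

    /* Never raise the soft limit above the hard limit (rlim_max);
       an unprivileged process may lower rlim_cur but not exceed rlim_max. */
    rl.rlim_cur = (wanted < rl.rlim_max) ? wanted : rl.rlim_max;

    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    printf("RLIMIT_AS soft limit is now %llu bytes\n",
           (unsigned long long)rl.rlim_cur);
    return 0;
}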
PS: please excuse my sorry excuse for a drum roll ^_^
On Linux systems, see man ulimit
(UPDATED)
It says:
The ulimit builtin is used to set the resource usage limits of the
shell and any processes spawned by it. If a new limit value is
omitted, the current value of the limit of the resource is printed.
ulimit -a prints all the current values along with their switch options; other switches, e.g. ulimit -n, print individual limits such as the maximum number of open files.
Unfortunately, "max memory size" says "unlimited", which means that it is not limited by the system administrator.
You can view the memory size by
cat /proc/meminfo
which outputs something like:
MemTotal: 4048744 kB
MemFree: 465504 kB
Buffers: 316192 kB
Cached: 1306740 kB
SwapCached: 508 kB
Active: 1744884 kB
(...)
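If you'd rather read MemFree programmatically than eyeball the cat output, a minimal sketch (Linux-specific, assuming the usual /proc/meminfo format) could look like this:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[128];
    unsigned long memfree_kb = 0;

    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    /* Scan line by line until the MemFree entry is found. */
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "MemFree: %lu kB", &memfree_kb) == 1)
            break;
    }
    fclose(f);
    printf("MemFree: %lu kB\n", memfree_kb);
    return 0;
}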
So, if ulimit says "unlimited", the MemFree is all yours. Almost.
Don't forget that malloc() (and the new operator, which typically calls malloc()) is a standard-library function, so if you call malloc(100) 10 times there will be a lot of "slack"; follow the link to learn why.
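To see that slack for yourself on glibc, a small sketch using the glibc-specific malloc_usable_size() (not portable) looks like this:

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h> /* glibc-specific: malloc_usable_size() */

int main(void)
{
    void *p = malloc(100);

    /* glibc rounds the request up to its internal chunk size, so the
       usable size is larger than the 100 bytes asked for; the
       difference is the "slack". */
    printf("requested 100 bytes, usable size: %zu bytes\n",
           malloc_usable_size(p));
    free(p);
    return 0;
}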
I have some issues with passing a large amount of data (3 MB) from U-Boot to Linux kernel 2.6.35.3 on an i.MX50 ARM board. The data is required in a kernel device driver's probe function and should be released afterwards. First U-Boot loads the data from flash to RAM, then passes the physical address to the Linux kernel using bootargs. In the kernel I try to reserve that memory region using request_resource() in the arch/arm/kernel/setup.c file:
--- a/arch/arm/kernel/setup.c Tue Jul 17 11:22:39 2012 +0300
+++ b/arch/arm/kernel/setup.c Fri Jul 20 14:17:16 2012 +0300
struct resource my_mem_res = {
.name = "My_Region",
.start = 0x77c00000,
.end = 0x77ffffff,
.flags = IORESOURCE_MEM | IORESOURCE_BUSY,
};
@@ -477,6 +479,10 @@
kernel_code.end = virt_to_phys(_etext - 1);
kernel_data.start = virt_to_phys(_data);
kernel_data.end = virt_to_phys(_end - 1);
+ my_mem_res.start = mi->bank[i].start + mi->bank[i].size - 0x400000;
+ my_mem_res.end = mi->bank[i].start + mi->bank[i].size - 1;
for (i = 0; i < mi->nr_banks; i++) {
if (mi->bank[i].size == 0)
@@ -496,6 +502,8 @@
if (kernel_data.start >= res->start &&
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
+
+ request_resource(res, &my_mem_res);
}
if (mdesc->video_start) {
With this I'm trying to tell the kernel that this memory area is reserved and that its data should not be modified by the kernel.
70000000-77ffffff : System RAM
70027000-7056ffff : Kernel text
70588000-7062094f : Kernel data
77c00000-77ffffff : My_Region
In the driver, ioremap(0x77c00000, AREA_SIZE) is used to get a kernel virtual address. But when I dump the contents of that memory, there are only zeros. If I boot the kernel with mem=120M (128 MB of RAM is available in total), my data lies above the kernel's System RAM region and I do get the data I expect.
So, my questions:
Why do I get zeros, and how do I pass a large amount of binary data from U-Boot to the Linux kernel?
You could use a custom ATAG to either pass the data block or to pass the address & length of the data. Note that the "A" in ATAG stands for ARM, so this solution is not portable to other architectures. An ATAG is preferable to a command-line bootarg IMO because you do not want the user to muck with physical memory addresses. Also the Linux kernel will process the ATAG list before the MMU (i.e. virtual memory) is enabled.
In U-Boot, look at lib_arm/armlinux.c or arch/arm/lib/bootm.c for routines that build the ARM tag list. Write your own routine for your new tag(s), and then invoke it in do_bootm_linux().
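As a rough sketch of what such a routine could look like, in the style of the existing setup_*_tags() helpers that use the file-scope params pointer (the tag id, struct name and field layout below are all made up and only have to match your kernel side):

/* Hypothetical tag id and payload; they only have to match what the
   kernel side expects. */
#define ATAG_MYDATA 0x41000099

struct tag_mydata {
    u32 addr; /* physical address of the blob in RAM */
    u32 size; /* length of the blob in bytes */
};

static void setup_mydata_tag(ulong blob_addr, ulong blob_size)
{
    params->hdr.tag = ATAG_MYDATA;
    params->hdr.size = (sizeof(struct tag_header) +
                        sizeof(struct tag_mydata)) >> 2;
    ((struct tag_mydata *)&params->u)->addr = blob_addr;
    ((struct tag_mydata *)&params->u)->size = blob_size;
    params = tag_next(params);
}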
In the Linux kernel ATAGs are processed in arch/arm/kernel/setup.c, when virtual memory has not yet been enabled. If you just pass an address & length values from U-Boot, then the pointer & length can be assigned to global variables that are exported,
void *my_data;
unsigned int my_dlen;
EXPORT_SYMBOL(my_data);
EXPORT_SYMBOL(my_dlen);
and then the driver can retrieve it.
extern void *my_data;
extern unsigned int my_dlen;

/* my_data holds a physical address, so cast it for the resource and iomap calls */
request_mem_region((unsigned long)my_data, my_dlen, DRV_NAME);
md_map = ioremap((unsigned long)my_data, my_dlen);
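The missing link is getting the values out of the custom ATAG and into those globals. A sketch in the style of the existing parse_tag_*() handlers in arch/arm/kernel/setup.c (again, the tag id and struct are assumptions that must match the U-Boot side) might be:

/* In arch/arm/kernel/setup.c, next to the existing parse_tag_*()
   handlers.  The tag id and struct layout must match the U-Boot side. */
#define ATAG_MYDATA 0x41000099

struct tag_mydata {
    u32 addr;
    u32 size;
};

static int __init parse_tag_mydata(const struct tag *tag)
{
    const struct tag_mydata *t = (const struct tag_mydata *)&tag->u;

    my_data = (void *)t->addr; /* note: this is still a physical address */
    my_dlen = t->size;
    return 0;
}

__tagtable(ATAG_MYDATA, parse_tag_mydata);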
I've used similar code to probe for SRAM in U-Boot, then pass the starting address & number of KBytes found to the kernel in a custom ATAG. A kernel driver obtains these values, and if they are nonzero and have sane values, creates a block device out of the SRAM. The major difference from your situation is that the SRAM is in a completely different physical address range from the SDRAM.
NOTE
An ATAG is built by U-Boot for the physical memory that the kernel can use, so this is really where you need to define and exclude your reserved RAM. It's probably too late to do that in the kernel.
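A rough sketch of that idea, based on the usual setup_memory_tags() pattern in U-Boot (check your U-Boot version's actual code; MY_RESERVED_SIZE is an assumed constant):

#define MY_RESERVED_SIZE (4 << 20) /* assumed: last 4 MB hold the blob */

static void setup_memory_tags(bd_t *bd)
{
    int i;

    for (i = 0; i < CONFIG_NR_DRAM_BANKS; i++) {
        params->hdr.tag = ATAG_MEM;
        params->hdr.size = tag_size(tag_mem32);
        params->u.mem.start = bd->bi_dram[i].start;
        params->u.mem.size = bd->bi_dram[i].size;
        /* shrink only the last bank, where the reserved blob lives */
        if (i == CONFIG_NR_DRAM_BANKS - 1)
            params->u.mem.size -= MY_RESERVED_SIZE;
        params = tag_next(params);
    }
}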
I am trying to find the memory map of an array, or of some memory allocated from malloc(), using mmap(), but it fails with "Invalid argument".
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <err.h>

int main()
{
    int *var1 = NULL;
    size_t size = 0;

    size = 1000 * sizeof(int);
    var1 = (int *)malloc(size);

    int i = 0;
    for (i = 0; i < 1000; i++)
    {
        var1[i] = 1;
    }
    printf("%p\n", (void *)var1);

    void *addr = NULL;
    addr = mmap((void *)var1, size, PROT_EXEC | PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0); // to create memory map of var1
    if (addr == MAP_FAILED)
        err(1, NULL); // to print the error
    return 0;
}
Error:
a.out: Invalid argument
Please help me.
Thank you in advance.
Proximate cause: mmap fails because you asked it to create a new memory mapping, you asked for the mapping to be placed at a specific address (var1's address, which, among other problems, is not page-aligned as MAP_FIXED requires), that address is already occupied (by the heap from which malloc got its memory), and you told the operating system it was not allowed to choose an alternate address in case var1 was not a suitable address (MAP_FIXED).
Analysis: What are you trying to do here? What does "find the memory map of an array" mean? Do you want to have your array of integers located in heap memory (returned by malloc()) or in an anonymous memory mapping created by mmap()? By the way, unless you fork() (create a child process) there is little functional difference: both are areas of memory that are private to your process. But they are not the same thing and you can't manipulate the heap with mmap() nor can you manage mapped memory with malloc().
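If the goal was simply to have the array live in its own anonymous mapping instead of the malloc() heap, a minimal sketch without MAP_FIXED could look like this:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t size = 1000 * sizeof(int);

    /* Let the kernel pick the address (NULL, no MAP_FIXED);
       MAP_PRIVATE | MAP_ANONYMOUS gives zero-filled memory that is
       not backed by any file, much like a heap allocation. */
    int *arr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    for (size_t i = 0; i < 1000; i++)
        arr[i] = 1;

    printf("array mapped at %p\n", (void *)arr);
    munmap(arr, size);
    return 0;
}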