I'm running a dask LocalCluster with 44 workers on a 44 core linux machine.
The program runs totally fine, however there is a constant stream of:
distributed.worker - WARNING - Unmanaged memory use is high... warnings.
I have set the env variable MALLOC_TRIM_THRESHOLD_ to 0.
Nothing fails. The documentation makes me believe it is a false positive warning.
Is there a way of suppressing the constant stream of this one particular warning, without raising the silence_logs parameter?
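One possible approach, sketched here with only the standard library's logging module, is a filter that drops just this one message and lets everything else through. The class name DropUnmanagedMemoryWarning and the helper install_filter are made up for illustration, and the logger name is an assumption taken from the prefix of the warning shown above.

```python
import logging


class DropUnmanagedMemoryWarning(logging.Filter):
    """Reject only records whose message contains the given fragment."""

    def __init__(self, fragment="Unmanaged memory use is high"):
        super().__init__()
        self.fragment = fragment

    def filter(self, record):
        # Returning False drops the record; all other records pass through.
        return self.fragment not in record.getMessage()


def install_filter(logger_name="distributed.worker"):
    # Assumption: the warning is emitted on exactly this logger. A filter
    # attached to a logger is not applied to records from its child loggers.
    logging.getLogger(logger_name).addFilter(DropUnmanagedMemoryWarning())
```

Since the workers of a LocalCluster normally live in separate processes, the filter would presumably have to be installed inside each worker (for example by running install_filter on every worker through the client) rather than only in the main process. I have not verified that part against a running cluster.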
mac osx (catalina)
gfortran 9.3.0 from homebrew
htop 2.2.0 from homebrew
I have the following program in memtest.f90 which I compile with gfortran memtest.f90 -o test and then call with ./test
    program main
      implicit none
      integer, parameter :: n=100000000
      real, allocatable :: values(:)
      print *, "no memory used yet, press enter"
      read(*,*)
      allocate(values(n))
      values = 0.0
      print *, "used a lot of memory, press enter"
      read(*,*)
      deallocate(values)
      print *, "why is the memory still there in htop"
      read(*,*)
    end program main
I expected the memory used by the program to drop after the deallocate statement. However, htop shows it continuing to hover at about 382 MB.
Is this a memory leak, and if so, how do I properly release the memory? Or am I just looking at the program's memory consumption the wrong way?
The program will typically not return the memory to the operating system below some threshold. It may also take some time to be freed. This is not a Fortran issue, but rather a system issue.
I did not mark this as a duplicate of "Will malloc implementations return free-ed memory back to the system?" because the connection is quite indirect and deserves some comment, but the underlying issue is the same. Fortran compilers typically call the malloc provided by the C library of the operating system or of the accompanying C compiler.
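To illustrate that freed memory stays with the allocator until it decides (or is asked) to hand it back, here is a minimal sketch that calls glibc's malloc_trim through ctypes. This only applies to glibc-based Linux systems, not to the macOS setup in the question, where the function does not exist. The helper name try_malloc_trim is invented for this example.

```python
import ctypes
import ctypes.util


def try_malloc_trim(pad=0):
    """Ask the C library to return free heap memory to the OS.

    Returns malloc_trim's result (1 if memory was released, 0 if not),
    or None when the C library does not provide malloc_trim.
    """
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c"))
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError):
        # No usable C library handle, or not glibc (e.g. macOS).
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    return malloc_trim(pad)


if __name__ == "__main__":
    print("malloc_trim result:", try_malloc_trim())
```

A return value of None simply means the platform's allocator offers no such call, in which case the allocator's own policy decides when pages go back to the system.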
Prerequisites
POSIX.1 2008 specifies the setrlimit() and getrlimit() functions. Various constants are provided for the resource argument, some of which are reproduced below for easier understanding of my question.
The following resources are defined:
(...)
RLIMIT_DATA
This is the maximum size of a data segment of the process, in bytes. If this limit is exceeded, the malloc() function shall fail with errno set to [ENOMEM].
(...)
RLIMIT_STACK
This is the maximum size of the initial thread's stack, in bytes. The implementation does not automatically grow the stack beyond this limit. If this limit is exceeded, SIGSEGV shall be generated for the thread. If the thread is blocking SIGSEGV, or the process is ignoring or catching SIGSEGV and has not made arrangements to use an alternate stack, the disposition of SIGSEGV shall be set to SIG_DFL before it is generated.
RLIMIT_AS
This is the maximum size of total available memory of the process, in bytes. If this limit is exceeded, the malloc() and mmap() functions shall fail with errno set to [ENOMEM]. In addition, the automatic stack growth fails with the effects outlined above.
Furthermore, POSIX.1 2008 defines data segment like this:
3.125 Data Segment
Memory associated with a process, that can contain dynamically allocated data.
I understand that the RLIMIT_DATA resource was traditionally used to denote the maximum amount of memory that can be assigned to a process with the brk() function. Recent editions of POSIX.1 no longer specify this function, and many operating systems (e.g. Mac OS X) do not support it as a system call. Instead, it is emulated with a variant of mmap() which is not part of POSIX.1 2008.
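For reference, the limits discussed above can be inspected from a script. The following is a small sketch using Python's standard resource module, which wraps getrlimit(). The helper names format_limit and query_limits are made up for this example, and constants missing on a given platform are skipped.

```python
import resource


def format_limit(value):
    # RLIM_INFINITY marks "no limit" for both the soft and the hard value.
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def query_limits(names=("RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_AS")):
    """Return {name: (soft, hard)} for the limits this platform defines."""
    limits = {}
    for name in names:
        res = getattr(resource, name, None)
        if res is None:
            # Not every platform defines every constant.
            continue
        soft, hard = resource.getrlimit(res)
        limits[name] = (format_limit(soft), format_limit(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in query_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

The values are in bytes. What each limit actually constrains (brk() only, or mmap() as well) is exactly the platform-dependent part the questions below are about.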
Questions
I am a little bit confused about the semantics and use of the RLIMIT_DATA resource. Here are the concrete questions I have:
Can the stack be part of the data segment according to this specification?
The standard says about RLIMIT_DATA: “If this limit is exceeded, the malloc() function shall fail with errno set to [ENOMEM].” Does this mean that memory allocated with malloc() must be part of the data segment?
On Linux, memory allocated with mmap() does not count towards the data segment. Only memory allocated with brk() or sbrk() is part of the data segment. Recent versions of the glibc use a malloc() implementation that allocates all its memory with mmap(). The value of RLIMIT_DATA thus has no effect on the amount of memory you can allocate with this implementation of malloc().
Is this a violation of POSIX.1 2008?
Do other platforms exhibit similar behavior?
The standard says about RLIMIT_AS: "If this limit is exceeded, the malloc() and mmap() functions shall fail with errno set to [ENOMEM]." As the failure of mmap() is not specified for RLIMIT_DATA, I conclude that memory obtained from mmap() does not count towards the data segment.
Is this assumption true? Does this only apply to non-POSIX variants of mmap()?
FreeBSD also shares the problem of malloc(3) being implemented using mmap(2) in the default malloc implementation. I ran into this when porting a product from FreeBSD 6 to 7, where the switch happened. We switched the default limit for each process from RLIMIT_DATA=512M to RLIMIT_VMEM=512M, i.e. limit the virtual memory allocation to 512MB.
As for whether this violates POSIX, I don't know. My gut feeling is that lots of things violate POSIX and a 100% POSIX-compliant system is as rare as a strictly conforming C compiler.
EDIT: heh, and now I see that FreeBSD's name RLIMIT_VMEM is non-standard; they define RLIMIT_AS as RLIMIT_VMEM for POSIX compatibility.
Is it true that the SPSS INSERT procedure treats warnings the same way as errors? I am running the INSERT procedure with the ERROR = STOP keyword. The execution of the procedure stops after the first warning.
I would say this is strange behaviour. For example, the R source function stops the execution of a script only on errors, not on warnings.
It isn't always obvious which output indicates a warning and which an error, since both warnings and errors appear in a Warnings block. For example, if you run this code using the employee data.sav file shipped with Statistics,
missing values jobcat (1 thru 10).
desc variables=jobcat.
it will generate a Warning block that says
Warnings
No statistics are computed because there are no valid cases.
But if you retrieve the error level, which requires programmability, you will see that this is a level 3 error. Warnings are assigned error level 2. Warnings do not stop a command from running while higher levels do.
Levels 3, 4, and 5 are considered errors, although level 5, Catastrophic error, would be hard to report, since it means that the SPSS Processor has crashed.
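The level scheme described above can be written down as a tiny helper. This is only a sketch of the classification as stated in this answer. The function name classify_level is invented, and it takes a plain integer so it runs without SPSS. In the Python integration plug-in I believe the value would come from spss.GetLastErrorLevel(), but treat that as an assumption.

```python
WARNING_LEVEL = 2
ERROR_LEVELS = (3, 4, 5)  # level 5 is the catastrophic error


def classify_level(level):
    """Map an SPSS error level to 'warning', 'error', or 'other'."""
    if level in ERROR_LEVELS:
        return "error"
    if level == WARNING_LEVEL:
        return "warning"
    # Levels below 2 are not covered by the description above.
    return "other"


if __name__ == "__main__":
    for level in range(6):
        print(level, classify_level(level))
```

With this mapping, the "no valid cases" message from the example would count as an error (level 3) even though it is displayed in a Warnings block.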
I have some constraints which z3 takes a long time to solve. I am aware of the "-st" command-line flag, which prints statistics but only at the very end, and of the TRACE facility for printing out internal data structure values. Is there a way to get diagnostic information from within z3 (e.g. to monitor memory usage continuously) as it is running, when it is being used from the command line? External tools like ps are not always convenient and do not always serve the purpose. Thanks.
You can use the option -v:100, which sets the verbosity level to 100. It may still not display the memory usage as often as you want.
Another option is to add the following line of code in appropriate places.
timeit tt(get_verbosity_level() >= 3, "report");
It will display memory usage if the verbosity level is >= 3.
For example, a good place is at the beginning of the method lbool context::bounded_search() in src/smt/smt_context.cpp. This method is executed after each restart.
I am running SBCL 1.0.51 on a Linux (Fedora 15) 32-bit system (kernel 3.6.5) with 1GB Ram and 256MB swap space.
I fire up sbcl --dynamic-space-size 125 and start calling a function that makes ~10000 http-requests (using drakma) to an http (couchDB) server and I just format to the standard-output the results of an operation on the returned data.
After each call I do a (sb-ext:gc :full t) and then (room). The results are not growing. No matter how many times I run the function, (room) reports the same used space (with some ups and downs, but around the same average which does not grow).
BUT: Every time I call the function, top reports that the VIRT and RES amounts of the sbcl process keep growing, even beyond the 125MB space I told sbcl to ask for. So I have the following questions:
Why does the memory reported by top keep growing, while (room) says it does not? The only thing I can think of is some leakage through ffi. I am not directly calling out with ffi, but maybe some drakma dependency does and forgets to free its C garbage. Anyway, I don't know if this could even be an explanation. Could it be something else? Any insights?
Why isn't --dynamic-space-size honoured?