Any method to limit the CPU usage of pycaffe compiled with openblas? - machine-learning

My pycaffe is compiled with OpenBLAS.
The model occupies more than 20 CPU cores while training. Is there any way to limit that?

Try setting the environment variable
OPENBLAS_NUM_THREADS=10
before running pycaffe to restrict openblas to only 10 threads.
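If the training script is itself written in Python, the same limit can be applied from inside the script. The sketch below assumes that OpenBLAS reads the variable when the library is first loaded, so it has to be set before `caffe` is imported; the helper name `limit_openblas_threads` is made up for illustration.

```python
import os


def limit_openblas_threads(n):
    """Ask OpenBLAS to use at most n threads (hypothetical helper)."""
    # Environment variables are strings, so convert the count explicitly.
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)


# Must run before the first import that loads OpenBLAS.
limit_openblas_threads(10)

# import caffe  # import pycaffe only after the variable has been set
```

Setting the variable in the shell, as suggested above, has the same effect and does not require touching the script.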

Related

How can I use Hugepage memory from kernel space?

I need to be able to allocate 2MB or 4MB sized pages of memory in a kernel module.
In the Linux kernel, you can allocate contiguous memory with the function:
__get_free_pages(flags, page_rate);
where flags are the usual allocation flags and page_rate (called order in the kernel source) defines the number of allocated pages: number of pages = 2 ^ page_rate. You can use this function as a proxy between the kernel and your calling code.
Another approach is to allocate a huge page, if that is possible.
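To make the "number of pages = 2 ^ page_rate" rule concrete, here is a small arithmetic sketch of which value would be needed for 2MB and 4MB. It assumes a 4 KiB base page size, which is common but not universal; the function name `order_for_size` is hypothetical.

```python
def order_for_size(size_bytes, page_size=4096):
    """Smallest order such that 2**order pages cover size_bytes.

    page_size=4096 is an assumption; check PAGE_SIZE on the target system.
    """
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


MIB = 1024 * 1024
print(order_for_size(2 * MIB))  # 512 pages of 4 KiB
print(order_for_size(4 * MIB))  # 1024 pages of 4 KiB
```

Under that assumption, 2MB corresponds to 2^9 pages and 4MB to 2^10 pages. Whether an allocation of that size succeeds still depends on how fragmented physical memory is at the time.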

Docker service Limits and Reservations

Docker v1.12 service comes with four flags for setting the resource limits on a service.
--limit-cpu value Limit CPUs (default 0.000)
--limit-memory value Limit Memory (default 0 B)
--reserve-cpu value Reserve CPUs (default 0.000)
--reserve-memory value Reserve Memory (default 0 B)
What is the difference between limit and reserve in this context?
What does the cpu value mean here? Is it a number of cores? A CPU share? What is the unit?
Reserve holds those resources on the host so they are always available for the container. Think dedicated resources.
Limit prevents the binary inside the container from using more than that. Think of controlling runaway processes in container.
Based on my limited testing with stress, --limit-cpu is a percentage of a core, though if there are multiple threads, it will spread them out across cores and seems to attempt to keep the total near what you'd expect.
The tests were run, in order, with --limit-cpu 4, then 2.5, then 2, then 1. All of those tests had stress set to 4 CPU worker threads.
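As a rough illustration of "keep the total near what you'd expect", the sketch below computes an idealized total and per-worker CPU percentage for the four test values. It is a simplified model that assumes the limit is shared evenly between workers; it is not a measurement and not Docker's actual scheduling logic, and `expected_cpu_percent` is a made-up name.

```python
def expected_cpu_percent(limit_cpu, workers):
    """Idealized (total %, per-worker %) for a CPU limit and busy workers."""
    # A worker cannot use more than one full core, so cap at the worker count.
    total = min(limit_cpu, workers) * 100.0
    return total, total / workers


for limit in (4, 2.5, 2, 1):
    total, per_worker = expected_cpu_percent(limit, workers=4)
    print(f"--limit-cpu {limit}: total {total}%, per worker {per_worker}%")
```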

kbmmemtable EOutOfMemory error after LoadFromDataset

I am using Delphi 7 Enterprise under Windows 7 64 bit.
My computer has 16 GB of RAM.
I try to use kbmMemTable 7.70.00 Professional Edition (http://news.components4developers.com/products_kbmMemTable.html) .
My table has 150,000 records, but when I try to copy the data from Dataset to the kbmMemTable it only copies 29000 records and I get this error: EOutOfMemory
I saw this message:
https://groups.yahoo.com/neo/groups/memtable/conversations/topics/5769,
but it didn't solve my problem.
An out-of-memory error can happen for various reasons:
Your application uses too much memory in general. A 32 bit application typically runs out of memory when it has allocated 1.4GB using the FastMM memory manager. Other memory managers may have worse or better ranges.
Memory fragmentation. There may not be enough contiguous space in memory for a single large allocation that is requested. kbmMemTable will attempt to allocate roughly 200000 x 4 bytes in one single allocation, which is its largest. That shouldn't be a problem.
Too many small allocations leading to the above memory fragmentation. kbmMemTable will allocate from 1 to n blocks of memory per record depending on the setting of the Performance property.
If Performance is set to fast, then 1 block will be allocated (unless blob fields exist, in which case an additional allocation will be made per non-null blob field).
If Performance is balanced or small, then each string field will allocate another block of memory per record.
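The numbers above can be put into a back-of-the-envelope estimate. This sketch only does the arithmetic described in the answer; the field counts are made-up inputs, and treating blob allocations the same way in every Performance mode is an assumption, not something kbmMemTable documents here.

```python
def largest_single_allocation(slots=200_000, bytes_per_slot=4):
    """Size in bytes of the one large allocation mentioned above."""
    return slots * bytes_per_slot


def blocks_allocated(records, string_fields, non_null_blobs, performance):
    """Estimated number of memory blocks for a whole table.

    string_fields and non_null_blobs are per-record counts (hypothetical inputs).
    """
    per_record = 1 + non_null_blobs
    if performance in ("balanced", "small"):
        # Each string field gets its own block in these modes.
        per_record += string_fields
    return records * per_record


print(largest_single_allocation())                  # bytes, well under 1 MB
print(blocks_allocated(150_000, 5, 0, "fast"))      # one block per record
print(blocks_allocated(150_000, 5, 0, "balanced"))  # one extra per string field
```

With five string fields, for example, the balanced setting would mean six times as many small allocations as fast, which is the kind of pattern that leads to fragmentation.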
best regards
Kim/C4D

Maximum memory allocation on openCL CPU

I have read that the maximum memory allocation is limited to around 60% of device memory, and that this can be changed by modifying the GPU_MAX_HEAP_SIZE and GPU_MAX_ALLOC_SIZE environment variables for the GPU.
I am wondering whether the AMD SDK has something similar for the CPU, in case I want to raise the memory allocation limit?
For my current configuration, it returns the following:
CL_DEVICE_MAX_MEM_ALLOC_SIZE = 2973.37MB
CL_DEVICE_GLOBAL_MEM_SIZE = 11893.5MB
Thanks.
I was able to change this on my system. I don't know if this method was possible when you originally asked the question.
Set the environment variable 'CPU_MAX_ALLOC_PERCENT' to the percentage of total memory you want to be able to allocate for a single global buffer. I have 8GB of system memory, and after setting CPU_MAX_ALLOC_PERCENT to 80, clinfo reports the following:
Max memory allocation: 6871207116
Success! 6.399GB
You can also use GPU_MAX_ALLOC_PERCENT in the same way for your GPU devices.
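The figures quoted in the question and the answer can be checked with a little arithmetic. The sketch below only converts and compares the reported values; the helper names are made up, and it does not call OpenCL.

```python
GIB = 1024 ** 3


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def alloc_fraction(max_alloc_mb, global_mem_mb):
    """Fraction of global memory usable for a single allocation."""
    return max_alloc_mb / global_mem_mb


# Values reported in the question (MB as printed there).
print(round(alloc_fraction(2973.37, 11893.5), 2))

# Value reported by clinfo in the answer, in bytes.
print(round(bytes_to_gib(6871207116), 3))
```

The values in the question work out to about a quarter of global memory per allocation, while the value in the answer is close to 80% of an 8GB system.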

Measure top memory consumption (linux program)

How can I measure the top (the maximum) memory usage of some program?
It does a lot of malloc/free and runs rather fast, so I can't see the maximum memory in top.
I want something like the time utility:
$ time ./program
real xx sec
user xx sec
sys xx sec
and
$ mem_report ./program
max memory used xx mb
shared mem xx mb
The time you are calling is your shell's builtin. If you call the /usr/bin/time program instead, you will get some information about resident memory usage. Note, however, that it may not count memory-mapped files, shared memory and other details which you may need.
If you are on Linux, you can wrap your program in a script that polls:
# for your current process
/proc/self/statm
# or a process you know the pid of
/proc/{pid}/statm
and writes out the results - you can aggregate them afterwards.
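A minimal sketch of such a polling wrapper follows. It assumes Linux, where the first three fields of /proc/{pid}/statm are total program size, resident set size and shared pages, all counted in pages. The function names and the 10 ms polling interval are illustrative choices, and because it samples, it can miss peaks that are shorter than the interval.

```python
import os
import subprocess
import sys
import time


def parse_statm(text, page_size):
    """Turn the first three statm fields (in pages) into byte counts."""
    size, resident, shared = (int(v) for v in text.split()[:3])
    return {
        "vm_bytes": size * page_size,
        "rss_bytes": resident * page_size,
        "shared_bytes": shared * page_size,
    }


def peak_memory(argv, interval=0.01):
    """Run argv and return the highest values sampled from statm."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    peak = {"vm_bytes": 0, "rss_bytes": 0, "shared_bytes": 0}
    proc = subprocess.Popen(argv)
    path = f"/proc/{proc.pid}/statm"
    while proc.poll() is None:
        try:
            with open(path) as f:
                sample = parse_statm(f.read(), page_size)
        except (OSError, ValueError):
            break  # the process went away between poll() and the read
        for key, value in sample.items():
            peak[key] = max(peak[key], value)
        time.sleep(interval)
    proc.wait()
    return peak


if __name__ == "__main__" and len(sys.argv) > 1:
    for key, value in peak_memory(sys.argv[1:]).items():
        print(f"max {key}: {value / 1e6:.1f} MB")
```

Saved under a hypothetical name such as mem_report.py, it would be invoked as `python mem_report.py ./program`.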
