Are uma_zalloc and uma_zfree thread safe in FreeBSD kernel space? - memory

I have two threads, and in both of them I call uma_zalloc() and uma_zfree() on the same uma_zone_t variable.
I want to know whether uma_zone_t is thread safe.

Yes, the use of the zone pointer is thread safe with respect to concurrent allocations and frees. The zone structure manages a pool of memory and the UMA interfaces do not require external synchronization.

Related

Is Javonet a thread-safe library, and more importantly, does it allow usage of all threads?

Is Javonet thread safe? I couldn't find any documentation one way or the other. Even if it is thread safe, is there some sort of "mutex" that prevents full usage of all threads?
When I tried to run Javonet in parallel, it did work, but the CPU usage did not rise much above the sequential load (i.e. on a 10-CPU system, the CPU usage hovered around 20% for the parallel load, which was only double the sequential CPU load of 10%). However, when I ran 10 instances of the exact same sequential code (that used Javonet), I achieved 100% CPU usage, so it "feels" like Javonet must have some built-in mutexes that prevent full parallel usage.
Javonet is thread safe. You just need to follow standard practices for writing multi-threaded applications and Javonet will take care of executing your code properly.
Javonet creates a corresponding .NET thread for each calling Java thread. The same applies in the other direction: if a callback, event or delegate is called from another thread, Javonet creates the corresponding thread on the Java side. Once the calling thread completes, Javonet closes the thread on the other side.
If the corresponding thread already exists, Javonet rejoins that thread.
Javonet does use internal mutexes / read-write locks when accessing object instances, some caching collections and types, which, depending on your Java code, might limit how well it parallelizes.

What does it mean by saying all threads in a process share same data?

In my textbook it says:
All threads within a process have access to the same data (share)
But each thread has its own stack, which means local variables are not shared. So what kind of data can threads share?
update:
I found that threads can share global variables, which confused me. I had learned that a global variable lives on a static stack, so it shouldn't be shared, given that each thread has its own stack.
Firstly, global and static variables are shared.
Secondly, memory from malloc that you can reach via a pointer in a global variable, (or via a pointer that...) is shared.
Thirdly, memory in one thread's stack that you can reach via a pointer in a global variable, (or via a pointer that...) is shared.
Really, all of it is shared, all the time, but not all of it is reachable. Thread A can access thread B's stack, but it won't have a pointer to do so unless thread B does something like assign the address of something on its stack to a global (don't do that), or you're examining the details of the threads and working your way into their stacks (if you're doing that, you know far more about the mechanisms of the pthreads implementation than I do*, so I won't tell you not to, but as a rule it isn't something to do lightly).
Mostly you only have to worry about global and static, and can consider anything dealing only in locals to be thread-safe, but that gets blown away once you can reach the latter from the former.
*Actually, I know little on that, and am mostly basing this on knowledge of threading in other languages. Really, it's a general threading thing rather than a pthreads specific one, with the exception that some other languages won't even let you fall into the trap of referencing stack memory from global.
I found each thread can share global variable, it made me confused,
what I learn global global variable is static stack, it shouldn't be
shared by that each thread has its own stack.
You're thinking too much about the stack, there are other memory regions such as data or bss. Objects with static storage (such as global variables and those declared with the modifier static) are shared between all threads.
Also, if you try hard, all threads can access everything; there's nothing special about a different "stack". If a thread manages to get a pointer to a location on another "stack", it can freely read it, modify it, etc.
The main point of all this is that threads don't just share variables. They are simply in the same virtual memory space.
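As a rough illustration of the points above, here is a minimal pthreads sketch. All names in it (shared_counter, worker, run_demo, NTHREADS, ITERS) are made up for this example. It shows a global that every thread sees and therefore has to lock, a local that each thread keeps on its own stack, and one thread writing into storage owned by another thread once it has been handed a pointer to it.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4
#define ITERS    100000

/* Static storage: a single copy that every thread in the process can see. */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long *result = arg;  /* storage owned by the caller (typically its stack) */
    long local = 0;      /* lives on this thread's own stack */

    for (int i = 0; i < ITERS; i++) {
        local++;                            /* private: no locking needed */
        pthread_mutex_lock(&counter_lock);
        shared_counter++;                   /* shared: must be synchronized */
        pthread_mutex_unlock(&counter_lock);
    }
    *result = local;     /* reachable only because we were given a pointer */
    return NULL;
}

/* Runs NTHREADS workers; returns the final shared counter and fills
 * results[] with each worker's private count. */
long run_demo(long results[NTHREADS])
{
    pthread_t tids[NTHREADS];

    shared_counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &results[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    return shared_counter;
}
```

Called with an array declared in main, run_demo should return NTHREADS * ITERS, and every entry of the array should equal ITERS. Removing the mutex would leave the per-thread counts intact but make the shared total unreliable.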
@cnicutar is right about the static storage. Actually there is a good discussion about this in another question. Despite the title of that question, the answers there (especially the first two) answer your question well, and I don't think I can do better.
As has been said, each thread has its own stack. So data on this stack is said to be thread safe, since only that thread owns it.
All threads within a process have access to the same data (share)
As it implies, multiple threads have access to same data. This should ring some mini alarm bells since you have to start thinking about thread synchronization (critical sections, mutexes etc) to resources which are shared.
Resources allocated on the heap, say through the new operator, are shared, as all threads have access to the same heap.
I don't think static data is allocated separately for each thread.
It is instantiated only once and is accessible to all the threads that have the declaration of that static data in their execute() method.

Is it possible to use google tcmalloc to get per thread memory usage

Like the title says I'm interested if I can see per thread memory usage on programs compiled with -ltcmalloc. AFAIK with regular malloc memory is linked to process not to thread, but I'm not sure about tcmalloc.
tcmalloc has some per-thread memory caches, but they are just a proxy to a shared heap (to reduce contention). All the memory in tcmalloc comes from a single shared pool.
Live (allocated) memory may be passed freely from one thread to another, so it's not easy to say which thread uses it.
You could monitor which thread allocated the used memory, but you would need either completely separated memory pools (not very elastic) or some per-allocation memory overhead. Neither of those is present in tcmalloc...
There is no such thing as per-thread memory usage. Memory is a process resource.

stack management in CLR

I understand the basic concept of stack and heap, but it would be great if anyone could clear up the following confusions:
Is there a single stack for the entire application process, or is a new stack created for each thread started in a project?
Is there a single heap for the entire application process, or is a new heap created for each thread started in a project?
If a stack is created for each thread, then how does the process manage the sequential flow of threads (and hence stacks)?
There is a separate stack for every thread. This is true not only for CLR, and not only for Windows, but pretty much for every OS or platform out there.
There is single heap for every Application Domain. A single process may run several app domains at once. A single app domain may run several threads.
To be more precise, there are usually two heaps per domain: one regular and one for really large objects (in the CLR, objects of roughly 85,000 bytes or more).
I don't understand what you mean by "sequential flow of threads".
One stack for each thread, all threads share the same heaps.
There is no 'sequential flow' of threads. A thread is an operating system object that stores a copy of the processor state. The processor state includes the register values. One of them is ESP, the stack pointer. Another really important one is EIP, the instruction pointer. When the operating system switches between threads, it stores the processor state in the current thread object and reloads the state from the thread object for the thread that was selected to run next. The processor now simply continues executing where it left off previously.
Getting a thread started is perhaps now easy to understand as well. The operating system allocates a megabyte of memory for the stack, initializes the ESP register value to point to that memory, and sets the value of the EIP register to the address of the method where the thread should start executing: the target of the ThreadStart delegate in C#.
Each thread must have its own stack; that's where local variables and parameters are held, along with the return addresses of the calling functions.

Multithreaded Heap Management

In C/C++ I can allocate memory in one thread and delete it in another thread. Yet whenever one requests memory from the heap, the heap allocator needs to walk the heap to find a suitably sized free area. How can two threads access the same heap efficiently without corrupting the heap? (Is this done by locking the heap?)
In general, you do not need to worry about the thread-safety of your memory allocator. All standard memory allocators -- that is, those shipped with MacOS, Windows, Linux, etc. -- are thread-safe. Locks are a standard way of providing thread-safety, though it is possible to write a memory allocator that only uses atomic operations rather than locks.
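To make the "allocate in one thread, free in another" case concrete, here is a minimal sketch using pthreads and the standard malloc/free. The names (MSG, struct handoff, producer, consumer, run_handoff) are invented for this example. It relies only on the allocator being thread safe, as described above; the joins provide the ordering between the two threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static const char MSG[] = "allocated in producer";

struct handoff {
    char  *buf;   /* block allocated by the producer thread */
    size_t len;   /* filled in by the consumer before it frees buf */
};

/* Allocates a block and returns it as the thread's exit value. */
static void *producer(void *arg)
{
    (void)arg;
    char *buf = malloc(sizeof MSG);
    if (buf != NULL)
        memcpy(buf, MSG, sizeof MSG);
    return buf;
}

/* Reads the block and frees it on a different thread than the allocator. */
static void *consumer(void *arg)
{
    struct handoff *h = arg;
    h->len = (h->buf != NULL) ? strlen(h->buf) : 0;
    free(h->buf);
    h->buf = NULL;
    return NULL;
}

/* Returns the string length seen by the consumer, or 0 on failure. */
size_t run_handoff(void)
{
    pthread_t t;
    void *ret = NULL;
    struct handoff h = { NULL, 0 };

    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return 0;
    pthread_join(t, &ret);
    h.buf = ret;

    if (pthread_create(&t, NULL, consumer, &h) != 0) {
        free(h.buf);
        return 0;
    }
    pthread_join(t, NULL);
    assert(h.buf == NULL);   /* the consumer released the block */
    return h.len;
}
```

No explicit lock around malloc or free is needed here. What the program does have to guarantee itself is that only one thread owns the block at a time, which the joins take care of in this sketch.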
Now it is an entirely different question whether those memory allocators scale; that is, is their performance independent of the number of threads performing memory operations? In most cases, the answer is no; they either slow down or can consume a lot more memory. The first scalable allocator in both dimensions (speed and space) is Hoard (which I wrote); the Mac OS X allocator is inspired by it -- and cites it in the documentation -- but Hoard is faster. There are others, including Google's tcmalloc.
Yes, an "ordinary" heap implementation supporting multithreaded code will necessarily include some sort of locking to ensure correct operation. Under fairly extreme conditions (a lot of heap activity) this can become a bottleneck; more specialized heaps (generally providing some sort of thread-local heap) are available which can help in this situation. I've used Intel TBB's "scalable allocator" to good effect. tcmalloc and jemalloc are other examples of mallocs implemented with multithreaded scaling in mind.
Some timing comparisons between single-threaded and multithread-aware mallocs here.
This is an Operating Systems question, so the answer is going to depend on the OS.
On Windows, each process gets its own heap. That means multiple threads in the same process are (by default) sharing a heap. Thus the OS has to thread-synchronize its allocation and deallocation calls to prevent heap corruption. If you don't like the idea of the possible contention that may ensue, you can get around it by using the Heap* routines. You can even overload malloc (in C) and new (in C++) to call them.
I found this link.
Basically, the heap can be divided into arenas. When requesting memory, each arena is checked in turn to see whether it is locked. This means that different threads can access different parts of the heap at the same time safely. Frees are a bit more complicated because each free must be freed from the arena that it was allocated from. I imagine a good implementation will get different threads to default to different arenas to try to minimize contention.
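The arena idea can be sketched in a few dozen lines. This is a toy written for this answer, not how any real malloc is implemented: the names (struct arena, arena_alloc, arena_free, arena_index_of, NARENAS, ARENA_SIZE) are hypothetical, each arena is a fixed buffer with a simple bump pointer, and an arena's space is only reused once every block in it has been freed. It does show the two points made above: allocation tries each arena in turn and skips the ones that are locked, and a free has to go back to the arena the block came from.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NARENAS    4
#define ARENA_SIZE 4096

struct arena {
    pthread_mutex_t lock;
    size_t used;   /* bytes handed out so far (bump pointer) */
    size_t live;   /* blocks allocated and not yet freed */
    _Alignas(16) unsigned char mem[ARENA_SIZE];
};

static struct arena arenas[NARENAS];
static pthread_once_t arenas_once = PTHREAD_ONCE_INIT;

static void arenas_init(void)
{
    for (int i = 0; i < NARENAS; i++)
        pthread_mutex_init(&arenas[i].lock, NULL);
}

/* Carve size bytes out of an arena whose lock the caller already holds. */
static void *carve(struct arena *a, size_t size)
{
    size = (size + 15) & ~(size_t)15;      /* keep blocks 16-byte aligned */
    if (size == 0 || size > ARENA_SIZE - a->used)
        return NULL;
    void *p = a->mem + a->used;
    a->used += size;
    a->live++;
    return p;
}

void *arena_alloc(size_t size)
{
    pthread_once(&arenas_once, arenas_init);

    /* First pass: use any arena that is not locked right now. */
    for (int i = 0; i < NARENAS; i++) {
        if (pthread_mutex_trylock(&arenas[i].lock) != 0)
            continue;                      /* busy: another thread is in it */
        void *p = carve(&arenas[i], size);
        pthread_mutex_unlock(&arenas[i].lock);
        if (p != NULL)
            return p;
    }
    /* Second pass: everything was busy or full, so wait for each in turn. */
    for (int i = 0; i < NARENAS; i++) {
        pthread_mutex_lock(&arenas[i].lock);
        void *p = carve(&arenas[i], size);
        pthread_mutex_unlock(&arenas[i].lock);
        if (p != NULL)
            return p;
    }
    return NULL;
}

/* Find the arena a block came from by its address; -1 if it is not ours. */
int arena_index_of(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    for (int i = 0; i < NARENAS; i++) {
        uintptr_t base = (uintptr_t)arenas[i].mem;
        if (addr >= base && addr < base + ARENA_SIZE)
            return i;
    }
    return -1;
}

/* A free must lock the owning arena, whichever thread calls it. */
void arena_free(void *p)
{
    int i = arena_index_of(p);
    if (i < 0)
        return;
    pthread_mutex_lock(&arenas[i].lock);
    assert(arenas[i].live > 0);            /* catches an obvious double free */
    if (--arenas[i].live == 0)
        arenas[i].used = 0;                /* last block gone: reuse the arena */
    pthread_mutex_unlock(&arenas[i].lock);
}
```

With no contention, every allocation lands in the first arena. Under load, a thread that finds it locked simply moves on to the next one instead of waiting. A production allocator would typically also remember a preferred arena per thread so that threads default to different arenas.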
Yes, normally access to the heap has to be locked. Any time you have a shared resource, that resource needs to be protected; memory is a resource.
This will depend heavily on your platform/OS, but I believe this is generally OK on major systems. C/C++ do not define threads, so by default I believe the answer is "the heap is not protected", meaning you must have some sort of multithreaded protection for your heap access.
However, at least with linux and gcc, I believe that enabling -pthread will give you this protection automatically...
Additionally, here is another related question:
C++ new operator thread safety in linux and gcc 4
