How to simulate "openmp private" in pthread - pthreads

I am using pthreads to parallelize some code. First, I parallelized it with OpenMP, which was fairly easy and straightforward: I only had to make a variable private to avoid a race condition. I want to do the same in my pthreads code. What can I do?

Depending on your code/purpose, you can either use a pthread mutex to serialize access to some shared resource/value so that only one thread modifies it at any time:
- pthread_mutex_init/destroy
- pthread_mutex_lock/unlock
- note that marking the resource/value volatile only suppresses certain compiler optimizations; it is not a substitute for the mutex
or you can use thread-local values:
- pthread_key_create/delete
- pthread_setspecific
- pthread_getspecific
though a plain local variable in the pthread start_routine may do the same thing for you.


Get lua state inside lua

I would like to get the Lua state from inside Lua so I can pass it to an external program that cannot be hooked up using FFI or a DLL. I just need a pointer to it and the ability to share it (via shared memory across program boundaries).
Alternatively, I could create a Lua state in my program and then pass that in, so I would simply need to set the Lua state to it inside Lua (and it would have to work with shared memory).
I've thought about sharing data using JSON, but ideally I would like to directly access objects.
Lua is pretty good about avoiding heap allocation and global pointers to allocated memory. lua_newstate takes an allocator function as a parameter. The provided function will be used to allocate/deallocate all memory associated with the lua_State object, including the pointer returned by lua_newstate itself.
So hypothetically, you could provide an allocator function that allocates/deallocates interprocess shared memory. And then, you can just pass the lua_State to some other process and access it.
First, you clearly cannot do this "from inside lua"; that kind of low-level thing just ain't happening. You cannot access the lua_State object from within Lua. You must be in control of the lua_State creation process for that to be a possibility. So we're talking about C (equivalent) code here, not in-Lua code.
Now, you can expose a C function to Lua which returns a light userdata that just so happens to be the exact lua_State* in question. But Lua can't really do much with light userdata other than pass it to other C function APIs.
Second, while the Lua system provides a guarantee that it will only allocate memory through the allocator, the system does not provide a guarantee that what you're trying to do will work. It is entirely possible that the Lua implementation does use process global memory, so long as it does it in such a way that different threads can access that global memory without breaking threading guarantees.
Obviously, you can inspect the Lua implementation to see if it does anything of the kind. But my point is that the guarantees are that each independent lua_State will be thread-isolated from each other and that each lua_State will only allocate memory through the given allocator. There is no guarantee that Lua's implementation doesn't have some global storage that it uses for some purpose.
So simply sharing the memory allocated by the Lua state may not be enough.
Also, even if this works, the two processes cannot access the same lua_State object at the same time, just like two threads in the same process cannot access the lua_State at the same time.
The lua_State is not designed to leave the program/thread it is executing in.
Querying a running lua_State could result in a crash, because it is only in a consistent state when a Lua call returns or a C API function is called. During execution, unsynchronized modifications could cause uninitialized memory access, or infinite loops due to internal lists being inconsistent.

Why isn't @synchronized(self) in Objective-C discouraged like lock(this) is in C#?

My understanding of the @synchronized(obj) { ... } directive in Objective-C is that it's essentially the same as the lock(obj) { ... } construct in C#, i.e.:
Both create a kind of local, anonymous mutex tied to the specified object.
Any thread entering the @synchronized/lock block must acquire that mutex prior to executing any code in the block (and will wait to acquire the mutex before proceeding if it's already held).
In C#, using lock(this) is strongly discouraged (see "Why is lock(this) {...} bad?" for an example), basically because you then have no control over who locks your mutex or when (since someone else could be using your object as a mutex).
I assumed this concept would also apply to @synchronized(self) in Objective-C, but I was told by a senior developer on my team that @synchronized functions differently than lock, and that the concerns in the above SO post don't apply.
I guess my question is - what am I misunderstanding about @synchronized? Why is @synchronized(self) safe when lock(this) is not?
Thanks.
There are different attitudes to these things. I had a quick look at the C# answer that you linked to, and if you did the same things with @synchronized, you would be crucified. You should hold a @synchronized lock for the absolute minimum amount of time, and you should absolutely not call anything else that could use @synchronized at the same time. Obey this rule and you are fine. I suppose in C# you would be just as fine, but in C# they assume that others do stupid things.
There's a simple strategy to avoid deadlocks (if you follow it): Have a set of "level 0" locks: While you hold a "level 0" lock you must not try to acquire any other locks. Then you have a set of "level 1" locks: While you hold a "level 1" lock you can acquire a "level 0" lock, but nothing else. And so on: If you hold a "level n" lock, you may acquire locks at a lower level, but none at the same or higher level. Voila: No deadlocks possible.
A @synchronized that is not at level 0 would be very rare, and you should think very hard before using one. There are other synchronisation mechanisms, like serial queues, that work just fine as well.

pthread mutex: get state

I was looking through some code that provides a C/C++ wrapper for a pthread mutex. The code keeps a shadow variable for signaled/not signaled condition. The code also ignores return values from functions like pthread_mutex_lock and pthread_mutex_trylock, so the shadow variable may not accurately reflect the state of the mutex (ignoring the minor race condition).
Does pthread provide a way to query a mutex for its state? A quick read of the pthread API does not appear to offer one. I also don't see anything interesting that operates on pthread_mutexattr_t.
Or should one use trylock, rely upon EBUSY, and give up ownership if acquired?
Thanks in advance.
There is no such function because there would be no point. If you queried the state of a mutex without trying to acquire it, as pthread_mutex_trylock() does, then the result you get could be invalidated immediately by another thread changing that mutex's state.

Is there a portable way to statically initialise a recursive mutex?

According to POSIX, I can statically initialise a mutex this way:
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
However, what if I want the mutex to be recursive? Mutexes are non-recursive by default, and there's no way to supply mutex attributes to the static initialisation.
It seems there is no portable way to do this. A workaround may be to initialise the mutex dynamically when it is first used. To prevent race conditions during that initialisation, another non-recursive, statically initialised mutex can be used.
Try:
pthread_mutex_t mutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
(Note the _NP suffix: this initializer is a non-portable GNU/glibc extension.)

Can I use pthread mutexes in the destructor function for thread-specific data?

I'm allocating my pthread thread-specific data from a fixed-size global pool that's controlled by a mutex. (The code in question is not permitted to allocate memory dynamically; all the memory it's allowed to use is provided by the caller as a single buffer. pthreads might allocate memory, I couldn't say, but this doesn't mean that my code is allowed to.)
This is easy to handle when creating the data, because the function can check the result of pthread_getspecific: if it returns NULL, the global pool's mutex can be taken there and then, the pool entry acquired, and the value set using pthread_setspecific.
When the thread is destroyed, the destructor function (as per pthread_key_create) is called, but the pthreads manual is a bit vague about any restrictions that might be in place.
(I can't impose any requirements on the thread code, such as needing it to call a destructor manually before it exits. So, I could leave the data allocated, and maybe treat the pool as some kind of cache, reusing entries on an LRU basis once it becomes full -- and this is probably the approach I'd take on Windows when using the native API -- but it would be neatest to have the per-thread data correctly freed when each thread is destroyed.)
Can I just take the mutex in the destructor? There's no problem with thread destruction being delayed a bit, should some other thread have the mutex taken at that point. But is this guaranteed to work? My worry is that the thread may "no longer exist" at that point. I use quotes, because of course it certainly exists if it's still running code! -- but will it exist enough to permit a mutex to be acquired? Is this documented anywhere?
The pthread_key_create() rationale seems to justify doing whatever you want from a destructor, provided you keep signal handlers from calling pthread_exit():
There is no notion of a destructor-safe function. If an application does not call pthread_exit() from a signal handler, or if it blocks any signal whose handler may call pthread_exit() while calling async-unsafe functions, all functions may be safely called from destructors.
Do note, however, that this section is informative, not normative.
The thread's existence or non-existence will most likely not affect the mutex in the least, unless the mutex is error-checking. Even then, the kernel is still scheduling whatever thread your destructor is being run on, so there should definitely be enough thread to go around.
