Whether fcntl record lock can work in multithreads - pthreads

I tried fcntl record lock work in multithreads on linux.
But it seems fcntl record lock doesn't work?
Can fcntl record lock work in multithreads?

fcntl() locking is process-based. It provides for any thread of one process to set a lock on a file or part of one that prevents all threads of all other processes from acquiring conflicting locks. But it does not discriminate among different threads of the same process, so it cannot be used to support one thread excluding other threads of the same process.

Related

How resilient is modern Rails to the antipattern "thread + fork"?

I think this is a popular antipattern that happens either standalone, for example activeJob local task with async, or coming from controllers, because then the strategy of the server must be taken into account.
My question is, what cautions should one take in the code when forking inside a thread (think inside of a ActiveJob task) and then even threading it?
The main worries I have seen online are:
Needs to lose and reopen the database connections after the fork. It seems that nowadays activeRecord takes care of it, doesn't it?
Access to the common Logger could be complicated. Somehow it seems to work.
Concurrent was expected to be problematic too but current versions are patched to detect that a fork has happened and threads are dead. Still it seems that one needs to make sure of doing, at the end of the forked process, a fine shutdown of any Rails::Concurrent pool that could have active or pending jobs. I think that it is enough
ActiveJob::Base.queue_adapter.shutdown
but perhaps it could miss some tasks that have not started or tasks under other Concurrent queue. In fact I think it already happens if one uses Concurrent::Future in a controller managed by the puma webserver. Generically I try to insert
Concurrent::global_io_executor.shutdown
Concurrent::global_io_executor.wait_for_termination
Extra problems I have found are resource-related: the postgres server is not ready to manage so many connections by default. Perhaps it could be sensible to reduce the size of the connection pool before the fork. And the inotify watcher gem also exhausts resource, when launched in development. Production is fine in both cases.
TL;DR; - I'm against doing it but many of us do it anyway and ignore the fact that it's unsafe... things break too rarely.
It is a simple fact that calling fork in a multi-threaded process may cause the new child to crash / deadlock / spin and may also cause other (harder to isolate) bugs.
This has nothing to do with Ruby, this is related to the locking mechanisms that safeguard critical sections and core process functionality such opening/closing files, allocating memory and any user created mutex / spinlock, etc'.
Why is it risky?
When calling fork the new process inherits all the state of the previous process but only the thread that called fork (all other threads do not exist in the new process).
This means that if any of the other threads was inside a critical section (i.e., allocating memory, opening a file, etc'), that critical section would remain locked for the lifetime of the new process, possibly causing deadlocks or unexpected errors.
Why do we ignore it?
In practical terms, the risk of something seriously breaking is often very low and most developers hadn't both encounter the issue and recognized its cause. Open files can be manually (if not automatically) closed, which leaves us mostly with the question of critical sections.
We can often reset our own critical sections which leaves mostly the system's critical sections...
The system's core critical sections that can be effected by fork are not that many. The main one is the memory allocator which can hardly ever break. Often the malloc implementation has multiple "arenas", each with its own critical section and it would be a long-shot to hit the system's underlying page allocation (i.e., mmap).
So is it safe?
No. Things still break, they just break rarely and when they do it isn't always obvious. Also, a parent process can sometime catch some of these errors and retry / recuperate and there are other ways to handle the risks.
Should I do it?
I wouldn't recommend to do it, but it depends. If you can handle an error, sure, go ahead. If not, that's a no.
Anyway, it's usually much better to use IPC to forward a message to a background process so that process perform any required fork / task.
The pattern can occur naturally when a Rails controller is combined with a webserver. The situation is different depending if the webserver is threaded, forked or evented, but the final conclusion is the same; that it is safe.
Fork + fork and thread + fork should not present problems of multiple access to the database or multiple execution of the same code, as only the current thread is active in the children.
Event + fork could be a source of troubles if the event machine is still active in the forked thread. Fortunately most designs generate a separate thread for the control of the event loop.

Can I make sure a specific child thread acquires a lock if I have its id? (Using pthreads library in C++)

I want to release the lock I've acquired within the main process and hand it over to a specific thread. Is there a way I can do this? I'm using the pthreads library
I want to release the lock I've acquired within the main process and hand it over to a specific thread.
This implies that multiple threads want to acquire this lock.
If it matters which one of these threads actually gets it, your design is very likely wrong.
Is there a way I can do this?
No.

Interprocess SQLite Thread Safety (on iOS)

I'm trying to determine if my sqlite access to a database is thread-safe on iOS. I'm writing a non App Store app (or possibly a launch daemon), so Apple's approval isn't an issue. The database in question is the built-in sms.db, so for sure the OS is also accessing this database for reading and writing. I only want to be able to safely read it.
I've read this about reading from multiple processes with sqlite:
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
I understand that thread-safety can be compiled out of sqlite, and that sqlite3_threadsafe() can be used to test for this. Running this on iOS 5.0.1
int safe = sqlite3_threadsafe();
yields a result of 2. According to this, that means mutex locking is available. But, that doesn't necessarily mean it's in use.
I'm not entirely clear on whether thread-safety is dynamically enabled on a per connection, per database, or global basis.
I have also read this. It looks like sqlite3_config() can be used to enable safe multi-threading, but of course, I have no control, or visibility into how the OS itself may have used this call (do I?). If I were to make that call again in my app, would it make it safe to read the database, or would it only deconflict concurrent access for multiple threads in my app that used the same sqlite3 database handle?
Anyway, my question is ...
can I safely read this database that's also accessed by iOS, and if so, how?
I've never used SQLite, but I've spent a decent amount of time reading its docs because I plan on using it in the future (and the docs are interesting). I'd say that thread safety is independent of whether multiple processes can access the same database file at once. SQLite, regardless of what threading mode it is in, will lock the database file, so that multiple processes can read from the database at once but only one can write.
Thread safety only affects how your process can use SQLite. Without any thread safety, you can only call SQLite functions from one thread. But it should still, say, take an EXCLUSIVE lock before writing, so that other processes can't corrupt the database file. Thread safety just protects data in your process's memory from getting corrupted if you use multiple threads. So I don't think you ever need to worry about what another process (in this case iOS) is doing with an SQLite database.
Edit: To clarify, any time you write to the database, including a plain INSERT/UPDATE/DELETE, it will automatically take an EXCLUSIVE lock, write to the database, then release the lock. (And it actually takes a SHARED lock, then a RESERVED lock, then a PENDING lock, then an EXCLUSIVE lock before writing.) By default, if the database is already locked (say from another process), then SQLite will return SQLITE_BUSY without waiting. You can call sqlite3_busy_timeout() to tell it to wait longer.
I don't think any of this is news to you, but a few thoughts:
In terms of enabling multi-threading (either serialized or multi-threaded), the general counsel is that one can invoke sqlite3_config() (but you may have to do a shutdown first as suggested in the docs or as discussed on SO here) to enable the sort of multi-threading you want. That may be of diminished usefulness here, though, where you have no control over what sort of access iOS is requesting of sqlite and/or this database.
Thus, I would have thought that, from an academic perspective, it would not be safe to read this system database (because as you say, you have no assurance of what the OS is doing). But I wouldn't be surprised if iOS is opening the database using whatever the default mode is, so from a more pragmatic perspective, you might be fine.
Clearly, for most users concerned about multi-threaded access within a single app, the best counsel would be to bypass the sqlite3_config() silliness and just simply ensure coordinated access through your own GCD serial queue (i.e., have a dedicated queue through which all database interactions go through, gracefully eliminating the multi-thread issue altogether). Sadly, that's not an option here because you're trying to coordinate database interaction with iOS itself.

Need guarantee that signals will be deilivered by offending thread

I'm working on a project and cannot find any documentation that verifies signal/thread behavior for the following case.
I need to ensure that signals generated in a thread will be delivered by the offending thread (namely, SIGSEGV). I've been told that POSIX doesn't ensure this behavior and that pthreads, for example, can generate signals in pthread 1 yet have the signal delivered in pthread 2. Therefore, I'm planning on using clone(2) to have finer control of signal/thread behavior, but still cannot find documentation in the man pages that ensures signals will be delivered by the offending thread.
Hardcore systems programmers: any documentation or insights would be very much appreciated!
The POSIX chapter on Signal Generation and Delivery states that:
At the time of generation, a
determination shall be made whether
the signal has been generated for the
process or for a specific thread
within the process. Signals which are
generated by some action attributable
to a particular thread, such as a
hardware fault, shall be generated for
the thread that caused the signal to
be generated. Signals that are
generated in association with a
process ID or process group ID or an
asynchronous event, such as terminal
activity, shall be generated for the
process.
A synchronous SIGSEGV caused by an incorrect memory access in a thread is clearly such a signal "...generated by some action attributable to a particular thread...", so they are guaranteed to be generated for the offending thread (which means handled or ignored by that thread, as appropriate).
I'm pretty sure this works, even if it isn't guaranteed. I base this observation on the fact that I used to work at a company where we routinely handled SIGSEGV in multithreaded programs and printed a stack trace to a log file (based on currently location). We did this on a variety of platforms--windows, linux, tru64 unix, aix, hpux, sunos ... and maybe one or two others that I can't recall. This (usually!) works because the location of SIGSEGV is still on the current stack (the signal handling mechanism just adds a few call frames over top of it).
It's only fair to mention that there's very little that you should do in a signal handler because there aren't many functions that are async signal safe. We ignored this and mostly got away with it, except if I recall on sunos (or aix) where we would get burned--the process would occasionally (and seemingly randomly) hard exit inside the signal handler.
I believe the recommended approach is to NOT handle SIGSEGV, and let the process exit with a core dump so you can analyze later in a debugger.

When to use pthread_cancel and not pthread_kill?

When does one use pthread_cancel and not pthread_kill?
I would use neither of those two but that's just personal preference.
Of the two, pthread_cancel is the safest for terminating a thread since the thread is only supposed to be affected when it has set its cancelability state to true using pthread_setcancelstate().
In other words, it shouldn't disappear while holding resources in a way that might cause deadlock. The pthread_kill() call sends a signal to the specific thread, and this is a way to affect a thread asynchronously for reasons other than cancelling it.
Most of my threads tends to be in loops doing work anyway and periodically checking flags to see if they should exit. That's mostly because I was raised in a world when pthread_kill() was dangerous and pthread_cancel() didn't exist.
I subscribe to the theory that each thread should totally control its own resources, including its execution lifetime. I've always found that to be the best way to avoid deadlock. To that end, I simply use mutexes for communication between threads (I've rarely found a need for true asynchronous communication) and a flag variable for termination.
You can not "kill" a thread with pthread_kill(). If you try to send SIGTERM or SIGKILL to a thread with pthread_kill(), it will terminate the entire process.
I subscribe to the theory that the PROGRAMMER and not the THREAD (nor the API designers) should totally control its own software in all aspects, including which threads cancel which.
I once worked in a firm where we developed a server that used a pool of worker threads and one special master thread that had the responsibility to create, suspend, resume and terminate the worker threads at any time it wanted. Of course the threads used some sort of synchronization, but it was of our design and not some API-enforced dogmas. The system worked very well and efficiently!
This was under Windows. Then I tried to port it for Linux and I stumbled at the pthreads' stupid "theories" about how wrong it is to suspend another thread etc. So I had to abandon pthreads and directly use the native Linux system calls (clone()) to implement the threads for our server.

Resources