In order to avoid race conditions around Core Data, different threads should use different NSManagedObjectContexts (see e.g. here).
To ensure this at runtime I would like to assert before each use of a managed object context that the same thread (or operation queue) is current as was when the managed object context was created. In effect, I would like to assert something like NSThread.currentThread == storedThread (or NSOperationQueue.currentQueue == storedQueue).
Is it appropriate to check for pointer equality between threads (or operation queues) for the stated purpose? And is there a semantic difference between comparing threads and operation queues, again for this stated purpose? (The Apple documentation for frameworks such as Core Data and UIKit usually explains situations around race conditions in terms of threads, e.g.: "Create a separate managed object context for each thread and share a single persistent store coordinator.")
UPDATE I've by now learned (from revisiting WWDC 2012 Session 2014 Core Data Best Practices) that using a NSManagedObjectContext with NSPrivateThreadConcurrencyType will probably solve this and is the way to go. However, the question would still seem to apply to the legacy choice NSConfinedConcurrencyType.
UPDATE From the WWDC 2014 session on What's new in Core Data: the launch argument -com.apple.CoreData.ConcurrencyDebug 1 is by now also supported on iOS and implies pretty much the same assertions that I was aiming at (see here).
It is my understanding that isExecuting just indicates that the thread hasn't been cancelled or finished. As a result, more than one thread will be isExecuting at a time.
Your == comparison is the one to do.
Relying on Core Data's own assertions turned out to be the way to go (see updated question). By relying on the I no longer need to compare for equality of threads (or other abstractions of "threads") at the application level.
Related
I don't understand the workings of a DispatchQueue and wanted to learn more about how they implement the foundational queueing theory requirements. I tried to inspect a queue using:
dump(DispatchQueue.global())
And this gave this output:
- <OS_dispatch_queue_global: com.apple.root.default-qos[0x10c041f00] = { xref = -2147483648, ref = -2147483648, sref = 1, target = [0x0], width = 0xfff, state = 0x0060000000000000, in-barrier}> #0
- super: OS_dispatch_queue
- super: OS_dispatch_object
- super: OS_object
- super: NSObject
I got that the label is com.apple.root.default-qos, and this is specified in the Apple docs and the class is the packaged OS_dispatch_queue_global. I understand qos is queryable on the queue itself and that makes sense as well. Width I think just means the allocated memory size.
What I don't understand are the relevances of xref, ref and sref, I think they are internal ids for the queues but I am not sure. I think they are related to fundamental queueing concepts (multithreading came to mind) but would be great to hone into this in more detail.
Is the autoreleaseFrequency hidden from this debug description? Also, what does in-barrier = 0 mean? I tried creating a custom queue and this was replaced by in-flight = 0.. so confused about that as well.
Any ideas on how these undocumented variables relate to queueing theory? I think these are undocumented internals of the API, so any educated and justified explanations would be fine!
Thanks.
Why ask this?
This is a fairly broad question about the internals of grand-central-dispatch. I had difficulty understanding the dumped output because the original WWDC '10 videos and slides for GCD are no longer public. I also didn't know about the open-source libdispatch repo (thanks Rob). That needn't be a problem, but there are no related QAs on SO explaining the topic in detail.
Why GCD?
According to the WWDC '10 GCD transcripts (Thanks Rob), the main idea behind the API was to simplify the boilerplate associated with using the #selector API for multithreading.
Benefits of GCD
Apple released a new block-based API instead of going with function pointers, to also enable type-safe code that wouldn't crash if the block had the wrong type signature. Using typedefs also made code cleaner when used in function parameters, local variables and #property declarations. Queues allow you to capture code and some state as a chunk of data that get managed, enqueued and executed automatically behind the scenes.
The same session mentions how GCD manages low-level threads under the hood. It enqueues blocks to execute on threads when they need to be executed and then releases those threads (PThreads to be precise) when they are no longer referenced. GCD manages threads automatically and doesn't expose this API - when a DispatchWorkItem is dequeued GCD creates a thread for this block to execute on.
Drawbacks of performSelector
performSelector:onThread:withObject:waitUntilDone: has numerous drawbacks that suggest poor design for the modern challenges of concurrency, waiting, synchronisation. leads to pyramids of doom when switching threads in a func. Furthermore, the NSObject.performSelector family of threading methods are inflexible and limited:
No options to optimise for concurrent, initially inactive, or synchronisation on a particular thread. Unlike GCD.
Only selectors can be dispatched on to new threads (awful).
Lots of threads for a given function leads to messy code (pyramids of doom).
No support for queueing without a limited (at the time when GCD was announced in iOS 4) NSOperation API. NSOperations are a high-level, verbose API that became more powerful after incorporating elements of dispatch (low-level API that became GCD) in iOS 4.
Lots of bugs related to unhandled invalid Selector errors (type safety).
DispatchQueue internals
I believe the xref, ref and sref are internal registers that manage reference counts for automatic reference counting. GCD calls dispatch_retain and dispatch_release in most cases when needed, so we don't need to worry about releasing a queue after all its blocks have been executed. However, there were cases when a developer could call retain and release manually when trying to ensure the queue is retained even when not directly in use. These registers allow libDispatch to crash when release is called on a queue with a positive reference count, for better error handling.
When calling a block with DispatchQueue.global().async or similar, I believe this increments the reference count of that queue (xref and ref).
The variables in the question are not documented explicitly, but from what I can tell:
xref counts the number of external references to a general DispatchQueue.
ref counts the total number of references to a general DispatchQueue.
sref counts the number of references to API serial/concurrent/runloop queues, sources and mach channels (these need to be tracked differently as they are represented using different types).
in-barrier looks like an internal state flag (DispatchWorkItemFlag) to track whether new work items submitted to a concurrent queue should be scheduled or not. Only once the barrier work item finishes, the queue returns to scheduling work items that were submitted after the barrier. in-flight means that there is no barrier in force currently.
state is also not documented explicitly but I presume points to memory where the block can access variables from the scope where the block was scheduled.
I am getting race conditions in my code when I run TSan tool. As same code has been accessed from different queues and threads at the same time that's why I can not use Serial queues or barrier as Queue will block only single queues accessing the shared resource not the other queues.
I used objc_sync_enter(object) | objc_sync_exit(object) and locks NSLock() or NSRecursiveLock() to protect shared resource but these are also not working.
While when I use #synchronized() keyword in Objective C to protect shared resource, it's working fine as expected and I am not getting race conditions in particular block of code.
So, what is an alternative to protect data in Swift as we can not use #synchronized() keyword in Swift language.
PFA screenshot for reference -
I don't understand "I can not use Serial queues or barrier as Queue will block only single queues accessing the shared resource not the other queues." Using a queue is the standard solution to this problem.
class MultiAccess {
private var _property: String = ""
private let queue = DispatchQueue(label: "MultiAccess")
var property: String {
get {
var result: String!
queue.sync {
result = self._property
}
return result
}
set {
queue.async {
self._property = newValue
}
}
}
}
With this construction, access to property is atomic and thread-safe without the caller having to do anything special. Note that this intentionally uses a single queue for the class, not a queue per-property. As a rule, you want a relatively small number of queues doing a lot of work, not a large number of queues doing a little work. If you find that you're accessing a mutable object from lots of different threads, you need to rethink your system (probably reducing the number of threads). There's no simple pattern that will make that work efficiently and safely without you having to think about your specific use case carefully.
But this construction is useful for problems where system frameworks may call you back on random threads with minimal contention. It is simple to implement and fairly easy to use correctly. If you have a more complex problem, you will have to design a solution for that problem.
Edit: I haven't thought about this answer in a long time, but Brennan's comments brought it back to my attention. Because of the bug I had in the original code, my original answer was ok, but if you fixed the bug it was bad. (If you want to see my original code that used a barrier, look in the edit history, I don't want to put it here because people will copy it.) I've changed it to use a standard serial queue rather than a concurrent queue.
Don't generate concurrent queues without careful thought about how threads will be generated. If you are going to have many simultaneous accesses, you're going to create a lot of threads, which is bad. If you're not going to have many simultaneous accesses, then you don't need a concurrent queue. GCD talks make promises about managing threads that it doesn't actually live up to. You definitely can get thread explosion (as Brennan mentions.)
I use NSPersistentContainer as a dependency in my classes. I find this approach quite useful, but there is a dilemma: I don't know in which thread my methods will be called. I found a very simple solution for this
extension NSPersistentContainer {
func getContext() -> NSManagedObjectContext {
if Thread.isMainThread {
return viewContext
} else {
return newBackgroundContext()
}
}
}
Looks wonderful but I still have a doubt is there any pitfalls? If it properly works, why on earth Core Data confuses us with its contexts?
It's OK as long as you can live with its inherent limitations, i.e.
When you're on the main queue, you always want the viewContext, never any other one.
When you're not on the main queue, you always want to create a new, independent context.
Some drawbacks that come to mind:
If you call a method that has an async completion handler, that handler might be called on a different queue. If you use this method, you might get a different context than when you made the call. Is that OK? It depends what you're doing in the handler.
Changes on one background context are not automatically available in other background contexts, so you run the risk of having multiple contexts with conflicting changes.
The method suggests a potential carelessness about which context you're using. It's important to be aware of which context you're using, because managed objects fetched on one can't be used with another. If your code just says, hey give me some context, but doesn't track the contexts properly, you increase the chance of crashes from accidentally crossing contexts.
If your non-main-queue requirements match the above, you're probably better off using the performBackgroundTask(_:) method on NSPersistentContainer. You're not adding anything to that method here.
[W]hy on earth Core Data confuses us with its contexts?
Managed object contexts are a fundamental part of how Core Data works. Keeping track of them is therefore a fundamental part of having an app that doesn't corrupt its data or crash.
I just created a singleton method, and I would like to know what the function #synchronized() does, as I use it frequently, but do not know the meaning.
It declares a critical section around the code block. In multithreaded code, #synchronized guarantees that only one thread can be executing that code in the block at any given time.
If you aren't aware of what it does, then your application probably isn't multithreaded, and you probably don't need to use it (especially if the singleton itself isn't thread-safe).
Edit: Adding some more information that wasn't in the original answer from 2011.
The #synchronized directive prevents multiple threads from entering any region of code that is protected by a #synchronized directive referring to the same object. The object passed to the #synchronized directive is the object that is used as the "lock." Two threads can be in the same protected region of code if a different object is used as the lock, and you can also guard two completely different regions of code using the same object as the lock.
Also, if you happen to pass nil as the lock object, no lock will be taken at all.
From the Apple documentation here and here:
The #synchronized directive is a
convenient way to create mutex locks
on the fly in Objective-C code. The
#synchronized directive does what any
other mutex lock would do—it prevents
different threads from acquiring the
same lock at the same time.
The documentation provides a wealth of information on this subject. It's worth taking the time to read through it, especially given that you've been using it without knowing what it's doing.
The #synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code.
The #synchronized directive does what any other mutex lock would do—it prevents different threads from acquiring the same lock at the same time.
Syntax:
#synchronized(key)
{
// thread-safe code
}
Example:
-(void)AppendExisting:(NSString*)val
{
#synchronized (oldValue) {
[oldValue stringByAppendingFormat:#"-%#",val];
}
}
Now the above code is perfectly thread safe..Now Multiple threads can change the value.
The above is just an obscure example...
#synchronized block automatically handles locking and unlocking for you. #synchronize
you have an implicit lock associated with the object you are using to synchronize. Here is very informative discussion on this topic please follow How does #synchronized lock/unlock in Objective-C?
Excellent answer here:
Help understanding class method returning singleton
with further explanation of the process of creating a singleton.
#synchronized is thread safe mechanism. Piece of code written inside this function becomes the part of critical section, to which only one thread can execute at a time.
#synchronize applies the lock implicitly whereas NSLock applies it explicitly.
It only assures the thread safety, not guarantees that. What I mean is you hire an expert driver for you car, still it doesn't guarantees car wont meet an accident. However probability remains the slightest.
We are experiencing the following crash
realm::Realm::verify_thread() const (shared_realm.cpp:274)
It happens sporadic but from different flows in our code.
One of the stacktraces we find is
0x00000001003af7ec realm::Realm::verify_thread() const (shared_realm.cpp:274)
0x0000000100339d78 RLMGetObjects (RLMObjectStore.mm:83)
0x0000000100330130 +[RLMObject objectsWithPredicate:] (RLMObject.mm:150)
0x00000001000fa468 -[PrinterRepository getDefaultPrinter] (PrinterRepository.m:35)
0x00000001001faf3c -[PrintService handlePrintJobs] (PrintService.m:106)
Our code in [PrinterRepository getDefaultPrinter] is
return [[Printer objectsWithPredicate:[NSPredicate predicateWithFormat:#"isDefault == 1"]] firstObject];
Locally we aren't able to reproduce this yet, we've only seen this happen from time to time with our beta testers.
Our realm version is 0.102.1
Our iOS versions are 9.2.1, 9.3.2 & 9.3.3
Does someone has an idea of the cause of this crash?
Managed Realm objects are thread confined. You're not allowed to pass them around arbitrarily between threads. When you retrieve an object on the main thread, you're only allowed to use it there. When you want to pass it to a background thread, you will need to retrieve on the main thread an identifier, ideally the property designated as primaryKey, and pass that over to the background thread.
You get a failure like that whenever you violate against that.
See our the relevant chapter of our docs about "Passing Instances Across Threads":
Unmanaged instances of RLMObjects behave exactly as regular NSObject subclasses, and are safe to pass across threads.
Instances of RLMRealm, RLMResults, or RLMArray, or managed instances of RLMObject can only be used on the thread on which they were created, otherwise an exception is thrown*. This is one way Realm enforces transaction version isolation. Otherwise, it would be impossible to determine what should be done when an object is passed between threads at different transaction versions without a potentially extensive relationship graph.
Instead, there are several ways to represent instances in ways that can be safely passed between threads. For example, an object with a primary key can be represented by its primary key’s value; or an RLMResults can be represented by its NSPredicate or query string; or an RLMRealm can be represented by its RLMRealmConfiguration. The target thread can then re-fetch the RLMRealm, RLMObject, RLMResults, or RLMArray using its thread-safe representation. Keep in mind that re-fetching will retrieve an instance at the version of the target thread, which may differ from the originating thread.