When multiple perform() calls are invoked on the same NSManagedObjectContext object, will they be executed one by one in the order they are invoked? I think this is true because the document says
Core Data uses thread (or serialized queue) confinement to protect
managed objects and managed object contexts (see Core Data Programming
Guide).
which suggests that a managed object context and its thread (or queue) have a 1:1 mapping and that all perform() calls run serially. But it worries me that I can't find any explicit discussion of this, not even in Apple's docs.
In my app, I set up a Core Data stack with NSPersistentContainer and create a dedicated background context for modifying managed objects. It can happen that when a perform() call is invoked, the previous perform() call hasn't finished yet, so it's critical that they are executed one by one. That's why I'd like to confirm my understanding above.
Note: I understand perform() is asynchronous, but that's from the caller's perspective. What I'm asking about is from the callee's perspective.
Yes, multiple perform() calls on the same context are queued up and executed one at a time, in the order they were invoked.
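If you want to see the ordering for yourself, here is a minimal sketch. The function name recordPerformOrder is made up for this example, and the context has no store attached because only the order of the blocks is observed.

```swift
import CoreData

/// Hypothetical helper: enqueues `count` blocks with perform() and records
/// the order in which they actually run on the context's queue.
func recordPerformOrder(on context: NSManagedObjectContext, count: Int) -> [Int] {
    var order: [Int] = []
    let group = DispatchGroup()
    for index in 0..<count {
        group.enter()
        context.perform {
            // Blocks run one at a time on the context's private queue,
            // so appending here needs no extra locking.
            order.append(index)
            group.leave()
        }
    }
    // Safe to block here because the context uses a private queue,
    // not the queue this function is called from.
    group.wait()
    return order
}

let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
print(recordPerformOrder(on: context, count: 10))
```

Note that this only covers perform(). performAndWait() is synchronous and reentrant, so I would not rely on it being ordered relative to blocks that are already queued.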
To test my Core Data implementation I have enabled the launch argument com.apple.CoreData.ConcurrencyDebug 1. I am getting breakpoints triggered whenever I access a managed object from within a Task using Swift's new async APIs.
I use a single context (viewContext) for fetching and ephemeral background contexts to perform write operations. See some snippets of the important parts at the bottom.
My app functions perfectly and I get no breakpoints triggered except for the scenarios where I access a Core Data managed object from within a Task.
See an example here
func performReloadForecasts() {
    Task {
        await reloadForecasts()
    }
}
The method reloadForecasts is too long to post here, but it references the user's favourite location (hence the locations) and uses it to fetch the forecast for that location from an API.
Referencing the user's locations in views or view models works fine.
But from what I can tell, by using a Task to perform an asynchronous operation, I am performing the task in a whole different thread that is chosen (from a pool?) at run time.
Usually the thread that the Task is run in is com.apple.root.user-initiated-qos.cooperative (serial) which makes sense I suppose.
How can I refactor or change either my asynchronous functions (such as reloadForecasts) or my Core Data stack (detailed below) to perform operations in a manner that abides by the concurrency rules for Core Data?
Can I force a Task to run on a particular thread? Since my main context is the viewContext it would have to be the main thread, which sort of defeats the purpose of the async Task.
Can I refactor my Core Data stack to create some sort of thread-safe reference to my managed objects? I have seen suggestions to pass object IDs in and refetch the object inside the target thread, but surely there is a more elegant way to abide by the concurrency rules.
Core Data Implementation
Context Setup
container = NSPersistentCloudKitContainer(name: "Model")
context = container.viewContext
context.automaticallyMergesChangesFromParent = true
context.mergePolicy = NSMergeByPropertyStoreTrumpMergePolicy
Fetching
@Published private var locations: [LocationModel] = []
private func reloadData() {
    context.perform { [context] in
        do {
            self.locations = try context.fetch(LocationModel.fetchRequest())
        } catch {
            Logger.error("Failed to reload persistence layer.")
            Logger.error(error.localizedDescription)
        }
    }
}
Performing Write Operations
func perform(_ block: @escaping (NSManagedObjectContext) -> Void) {
    do {
        let context = container.newBackgroundContext()
        try context.performAndWait {
            block(context)
            try context.save()
        }
    } catch {
        Logger.error(error.localizedDescription)
    }
}
I fixed some of these issues by adding the @MainActor attribute to the Task
Task { @MainActor in
    await reloadForecasts()
}
However, I was still getting breakpoints in certain cases, especially in higher-order functions like maps or sorts. I ended up adding the @MainActor attribute to all my view models.
This fixed all of the weird crashes regarding the higher order functions accessing the core data objects, but I faced new problems. My second core data context used for saving objects was now the cause of concurrency breakpoints being triggered.
This made more sense and was much more debug-able. I had strong references to objects fetched in the main context, used to construct another object in the background context.
Model A <---> Model B
I had a Model A that had a relationship to another Model B. To set up the relationship to Model B, I used a reference to a Model B object that had been fetched on the main context. But I was creating the Model A object on the background thread.
To solve this I used the suggested methods of refetching the required objects by ObjectID in the correct context. (Using a bunch of nice helper methods to make things easier)
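For anyone wondering what such a helper might look like, here is a rough sketch. The method name existing(_:as:), the entity name "ModelA" and the relationship key "modelB" are assumptions for illustration, not my actual code. The object ID should be a permanent one, so the Model B object needs to have been saved first.

```swift
import CoreData

extension NSManagedObjectContext {
    /// Hypothetical helper: looks up this context's own instance of the object
    /// with the given ID. Call it from inside perform/performAndWait on this context.
    func existing<T: NSManagedObject>(_ id: NSManagedObjectID, as type: T.Type = T.self) throws -> T? {
        try existingObject(with: id) as? T
    }
}

/// Creates a "ModelA" in a background context and links it to a Model B that was
/// fetched elsewhere, passing only the object ID across the context boundary.
func createModelA(linkedTo modelBID: NSManagedObjectID, in container: NSPersistentContainer) {
    let background = container.newBackgroundContext()
    background.perform {
        do {
            // Refetch Model B in the context that will own the new Model A.
            guard let modelB = try background.existing(modelBID, as: NSManagedObject.self) else { return }
            let modelA = NSEntityDescription.insertNewObject(forEntityName: "ModelA", into: background)
            modelA.setValue(modelB, forKey: "modelB") // assumed to-one relationship
            try background.save()
        } catch {
            print("Failed to create ModelA: \(error.localizedDescription)")
        }
    }
}
```

The important part is that only the NSManagedObjectID crosses between contexts, never the managed object itself.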
Here's a forum post asking a related question about ensuring asynchronous tasks are run on the main thread
https://forums.swift.org/t/best-way-to-run-an-anonymous-function-on-the-main-actor/50083
My understanding of the new Swift concurrency model is that when you await an async function, the work is run on another thread chosen from a pool, and when it completes, execution returns to the point (and thread) where you used await.
In this case I have forced the Task to start on the main thread (by using @MainActor), execute its work on a thread from the available pool, and return to the main thread once it has completed.
The Swift concurrency guide explains some of this in detail: https://docs.swift.org/swift-book/LanguageGuide/Concurrency.html
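Another option that keeps the Task off the main thread is the async variant of perform available on iOS 15 / macOS 12 and later: do the fetch inside the context's own queue and return only plain values. This is just a sketch; the function name locationNames and the "name" attribute on LocationModel are assumptions.

```swift
import CoreData

/// Hypothetical helper: fetches the locations on a background context and
/// returns only their names, so no managed object leaves the context's queue.
@available(iOS 15.0, macOS 12.0, tvOS 15.0, watchOS 8.0, *)
func locationNames(in container: NSPersistentContainer) async throws -> [String] {
    let context = container.newBackgroundContext()
    return try await context.perform {
        // Everything in this block runs on the context's private queue.
        let request = NSFetchRequest<NSManagedObject>(entityName: "LocationModel")
        return try context.fetch(request).compactMap { $0.value(forKey: "name") as? String }
    }
}
```

Inside a Task you would then write something like let names = try await locationNames(in: container) and use the strings freely on whatever thread the Task resumes on.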
I use NSPersistentContainer as a dependency in my classes. I find this approach quite useful, but there is a dilemma: I don't know in which thread my methods will be called. I found a very simple solution for this
extension NSPersistentContainer {
    func getContext() -> NSManagedObjectContext {
        if Thread.isMainThread {
            return viewContext
        } else {
            return newBackgroundContext()
        }
    }
}
It looks wonderful, but I still have a doubt: are there any pitfalls? And if it works properly, why on earth does Core Data confuse us with its contexts?
It's OK as long as you can live with its inherent limitations, i.e.
When you're on the main queue, you always want the viewContext, never any other one.
When you're not on the main queue, you always want to create a new, independent context.
Some drawbacks that come to mind:
If you call a method that has an async completion handler, that handler might be called on a different queue. If you use this method, you might get a different context than when you made the call. Is that OK? It depends what you're doing in the handler.
Changes on one background context are not automatically available in other background contexts, so you run the risk of having multiple contexts with conflicting changes.
The method suggests a potential carelessness about which context you're using. It's important to be aware of which context you're using, because managed objects fetched on one can't be used with another. If your code just says, hey give me some context, but doesn't track the contexts properly, you increase the chance of crashes from accidentally crossing contexts.
If your non-main-queue requirements match the above, you're probably better off using the performBackgroundTask(_:) method on NSPersistentContainer. You're not adding anything to that method here.
[W]hy on earth Core Data confuses us with its contexts?
Managed object contexts are a fundamental part of how Core Data works. Keeping track of them is therefore a fundamental part of having an app that doesn't corrupt its data or crash.
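For reference, a minimal sketch of the performBackgroundTask(_:) approach mentioned above. The function name addItem, the entity name "Item" and its "name" attribute are made up for the example.

```swift
import CoreData

/// Hypothetical helper: inserts one object on a fresh background context
/// supplied by the container and reports the outcome to the caller.
func addItem(named name: String,
             to container: NSPersistentContainer,
             completion: @escaping (Error?) -> Void) {
    container.performBackgroundTask { context in
        // The container creates this context and runs the block on its queue.
        let item = NSEntityDescription.insertNewObject(forEntityName: "Item", into: context)
        item.setValue(name, forKey: "name")
        do {
            try context.save()
            completion(nil)
        } catch {
            context.rollback()
            completion(error)
        }
    }
}
```

The completion handler is called on the background context's queue, so hop back to the main queue before touching the UI or the viewContext.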
Can I have a single private managed object context that is accessed by multiple NSOperations?
I have two options:
Have a managed object context per NSOperation,
i.e. if there are 100 NSOperations, 100 contexts will be created.
Have a single context and multiple NSOperations,
i.e. a single context and 100 NSOperations accessing it.
Which is the better option?
The correct solution is option 1. Create a queue with a concurrency count of 1 and do all your writing through that queue. This avoids write conflicts, which can lead to losing information. If you need to access information from the main thread, you should use a global main-thread context (in NSPersistentContainer it is called viewContext).
If this is too slow, you should investigate the work that you are doing. Generally each operation should be pretty quick, so if you find that they are not, you might be doing something wrong (a common issue is doing a fetch for each imported object instead of one large fetch). Another solution is to split large tasks (such as importing a large amount of data) into several smaller tasks. You can also set different priorities, giving higher priority to actions that the user initiated.
You shouldn't be afraid of creating contexts. They are not that expensive.
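A rough sketch of that setup, assuming an NSPersistentContainer. The names makeWriteQueue and enqueueWrite are hypothetical, and error handling is reduced to a print.

```swift
import CoreData

/// Hypothetical serial queue for writes: at most one operation runs at a time.
func makeWriteQueue() -> OperationQueue {
    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 1
    return queue
}

/// Enqueues one write. Each operation gets its own short-lived background context.
func enqueueWrite(on queue: OperationQueue,
                  container: NSPersistentContainer,
                  _ work: @escaping (NSManagedObjectContext) throws -> Void) {
    queue.addOperation {
        let context = container.newBackgroundContext()
        // performAndWait keeps the operation alive until the save has finished,
        // so the next operation cannot start writing in the meantime.
        context.performAndWait {
            do {
                try work(context)
                if context.hasChanges {
                    try context.save()
                }
            } catch {
                context.rollback()
                print("Write failed: \(error.localizedDescription)")
            }
        }
    }
}
```

User-initiated writes can be given a higher queuePriority by building the Operation explicitly instead of using addOperation with a closure.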
Based on what you said in the comments, that you don't actually write lots of data in a single operation and just fetch an object, I suggest using a single MOC.
The usual reason to have multiple MOCs is to read, update, and save lots of data independently of the main context and of any other contexts. In such a flow you would be able to save those objects concurrently from different contexts.
But that's not your case, if I understood correctly.
For a single fetch, one private context is enough. I believe there wouldn't be much overhead in creating many contexts, but why do the extra work?
1. Create a private MOC
let privateContext = NSManagedObjectContext(concurrencyType:.privateQueueConcurrencyType)
2. Create each operation and pass it the MOC
let operation = MyOperation(context: privateContext)
3. In the operation, make a synchronous call on the private MOC with the function below. This way you avoid concurrency problems with a single MOC.
func performAndWait(_ block: @escaping () -> Swift.Void)
For example:
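var myObject: Object?  // var, because it is assigned inside the block
privateContext.performAndWait {
    myObject = try? privateContext.fetch(...).first
}
// do what you need with myObject (access its properties inside performAndWait as well)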
What I'm trying to do:
perform background sync with a web API without freezing the UI. I'm using MagicalRecord but it's not really specific to it.
make sure I'm using contexts & such correctly
What my question really is: is my understanding correct? Plus a couple questions at the end.
So, the contexts made available by MagicalRecord are:
MR_rootSavingContext of PrivateQueueConcurrencyType which is used to persist data to the store, which is a slow process
MR_defaultContext of MainQueueConcurrencyType
and for background you would want to work with a context generated by MR_context(), which is a child of MR_defaultContext and is of PrivateQueueConcurrencyType
Now, for saving in an asynchronous way, we have two options:
MR_saveToPersistentStoreWithCompletion() which will save all the way up to MR_rootSavingContext and write to disk
MR_saveOnlySelfWithCompletion() which will save only up to the parent context (i.e. MR_defaultContext for a context created with MR_context)
From there, I thought that I could do the following (let's call it Attempt#1) without freezing the UI:
let context = NSManagedObjectContext.MR_context()
for i in 1...1_000 {
let user = User.MR_createInContext(context) as User
context.MR_saveOnlySelfWithCompletion(nil)
}
// I would normally call MR_saveOnlySelfWithCompletion here, but calling it inside the loop makes any UI block easier to spot
But, my assumption was wrong. I looked into MR_saveOnlySelfWithCompletion and saw that it relies on
[self performBlock:saveBlock];
which according to Apple Docs
Asynchronously performs a given block on the receiver’s queue.
So I was a bit puzzled, since I would expect it not to block the UI because of that.
Then I tried (let's call it Attempt#2)
let context = NSManagedObjectContext.MR_context()
for i in 1...1_000 {
let user = User.MR_createInContext(context) as User
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) { () -> Void in
context.MR_saveOnlySelfWithCompletion(nil)
}
}
And this does the job, but it doesn't feel right.
Then I found something in the release notes of iOS 5.0
When sending messages to a context created with a queue
association, you must use the performBlock: or performBlockAndWait:
method if your code is not already executing on that queue (for the
main queue type) or within the scope of a performBlock... invocation
(for the private queue type). Within the blocks passed to those
methods, you can use the methods of NSManagedObjectContext freely.
So, I'm assuming that:
Attempt#1 freezes the UI because I'm actually calling it from the main queue and not within the scope of a performBlock
Attempt#2 works, but I'm creating yet another thread while the context already has its own background thread
So of course what I should do is use saveWithBlock:
MagicalRecord.saveWithBlock { (localContext) -> Void in
    for i in 1...1_000 {
        User.MR_createInContext(localContext)
    }
}
This performs the operation on a direct child of MR_rootSavingContext which is of PrivateQueueConcurrencyType.
Thanks to rootContextChanged, any change that goes up to MR_rootSavingContext will be available to MR_defaultContext.
So it seems that:
MR_defaultContext is the perfect context when it comes to displaying data
edits are preferably done in an MR_context (child of MR_defaultContext)
long running tasks such as a server sync are preferably done using saveWithBlock
What I still don't get is how to work with MR_save[…]WithCompletion(). I would use it on MR_context, but since it blocked the main thread in my test cases I don't see when it becomes relevant (or what I missed…).
Thanks for your time :)
OK, I rarely use MagicalRecord, but since you said your question is more general I will attempt an answer.
Some theory: When creating a context you pass an indicator as to whether you want it to be bound on the main or a background thread
let context = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
By "bound" we mean that a thread is referenced by the context internally. In the example above a new thread is created and owned by the context. This thread is not used automatically but must be called explicitly as:
context.performBlock({ () -> Void in
context.save(nil)
return
});
So your code with 'dispatch_async' is wrong because the thread the context is bound to can only be referenced from the context itself (it is a private thread).
What you have to infer from the above is that if the context is bound to the main thread, calling performBlock from the main thread will not do anything different than calling the context's methods directly.
To comment on your bullet points at the end:
MR_defaultContext is the perfect context when it comes to displaying data: An NSManagedObject must be accessed from the context it was created in, so this is actually the only context that you can feed the UI from.
edits are preferably done in an MR_context (child of MR_defaultContext): Edits are not expensive and you should follow the rule above. If you are calling a function that edits an NSManagedObject's properties from the main thread (like at the tap of a button), you should update the main context. Saves, on the other hand, are expensive, and this is why your main context should not be linked to a persistent store directly but should just push its edits to a root context with background concurrency that owns the persistent store.
long running tasks such as a server sync are preferably done using saveWithBlock: Yes.
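To make the layering described in these points concrete, here is a sketch in plain Core Data (modern Swift, no MagicalRecord). The function names makeContextChain and saveUpToStore are made up, and the chain mirrors the root / default / MR_context arrangement described in the question.

```swift
import CoreData

/// Hypothetical setup: root (private queue, owns the store) <- main (UI) <- worker (private queue).
func makeContextChain(coordinator: NSPersistentStoreCoordinator)
    -> (root: NSManagedObjectContext, main: NSManagedObjectContext, worker: NSManagedObjectContext) {
    let root = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    root.persistentStoreCoordinator = coordinator   // only the root talks to the store

    let main = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
    main.parent = root                              // saving main pushes changes to root

    let worker = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    worker.parent = main                            // saving worker pushes changes to main
    return (root, main, worker)
}

/// Saves a context and then each ancestor in turn, every save on its own queue.
/// The write to disk happens only when the root context saves.
func saveUpToStore(_ context: NSManagedObjectContext) {
    context.perform {
        guard context.hasChanges else { return }
        do {
            try context.save()
            if let parent = context.parent {
                saveUpToStore(parent)
            }
        } catch {
            print("Save failed: \(error.localizedDescription)")
        }
    }
}
```

Calling saveUpToStore(worker) roughly corresponds to saving all the way to the persistent store, while a single worker.save() inside perform only goes as far as the parent.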
Now, In attempt 1
for i in 1...1_000 {
let user = User.MR_createInContext(context) as User
}
context.MR_saveOnlySelfWithCompletion(nil)
There is no need to save for every object creation. Even if the UI was not blocked it is wasteful.
About MR_context. In the documentation for magical records I cannot see a 'MR_context' so I am wondering if it is a quick method to access the main context. If it is so, it will block.
I have a method that runs on a background thread using its own NSManagedObjectContext, which is created when the background thread starts, as per Apple's recommendations.
This method makes a call to a shared instance of a class that is used for managing property values.
The shared property manager uses an NSManagedObjectContext on the main thread. Even though the background thread method should not use the main thread's NSManagedObjectContext, it shouldn't really matter whether the shared property manager uses such a context, as it only returns scalar values to the background thread (at least that's my understanding).
So why does the shared property class hang when retrieving values via the main thread's context when called from the background thread? It doesn't need to pass an NSManagedObject or even update one, so I cannot see what difference it would make.
I can appreciate that my approach is probably wrong, but I want to understand at a base level why this is. At the moment I don't understand the whole system well enough to think beyond Apple's recommended methods of implementation, and that's just a black-magic approach, which I don't like.
Any help is greatly appreciated.
Does using:
[theContext performBlock:^{
// do stuff on the context's queue, launch asynchronously
}];
-- or --
[theContext performBlockAndWait:^{
// do stuff on the context's queue, run synchronously
}];
-- just work for you? If so, you're done.
If not, take a long, hard look at how your contexts are set up, passed around, and used. If they all share a root context, you should be able to "move" data between them easily, so long as you always look up any objectIDs on your current context.
Contexts are bound to threads/queues, basically, so always use a given context as a reference for where to do work. performBlock: is one way to do this.
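As a sketch of what that could look like for the scalar case in the question (in Swift rather than Objective-C, with the helper name readValue made up for the example): wrap the read in performAndWait on the context that owns the data and hand back only the plain value.

```swift
import CoreData

/// Hypothetical helper: runs `read` on the context's own queue and returns
/// the plain value it produces to the calling thread.
func readValue<T>(on context: NSManagedObjectContext,
                  _ read: (NSManagedObjectContext) -> T) -> T {
    var result: T?
    context.performAndWait {
        // Managed objects are only touched here, on the context's queue.
        result = read(context)
    }
    // performAndWait has returned, so the block has already run.
    return result!
}
```

One caveat, which may or may not be the cause of the hang described above: if the context is a main-queue context and the main thread is itself blocked waiting for the background thread, this call will deadlock.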