Where should NSManagedObjectContext be created? - ios

I've recently been learning about Core Data and specifically how to do inserts with a large number of objects. After learning how to do this and solving a memory leak problem that I met, I wrote the Q&A Memory leak with large Core Data batch insert in Swift.
After changing NSManagedObjectContext from a class property to a local variable and also saving inserts in batches rather than one at a time, it worked a lot better. The memory problem cleared up and the speed improved.
The code I posted in my answer was
let batchSize = 1000
// do some sort of loop for each batch of data to insert
while (thereAreStillMoreObjectsToAdd) {
// get the Managed Object Context
let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
managedObjectContext.undoManager = nil // if you don't need to undo anything
// get the next 1000 or so data items that you want to insert
let array = nextBatch(batchSize) // your own implementation
// insert everything in this batch
for item in array {
// parse the array item or do whatever you need to get the entity attributes for the new object you are going to insert
// ...
// insert the new object
let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
newObject.attribute1 = item.whatever
newObject.attribute2 = item.whoever
newObject.attribute3 = item.whenever
}
// save the context
do {
try managedObjectContext.save()
} catch {
print(error)
}
}
This method seems to be working well for me. The reason I am asking a question here, though, is two people (who know a lot more about iOS than I do) made comments that I don't understand.
#Mundi said:
It seems in your code you are using the same managed object context,
not a new one.
#MartinR also said:
... the "usual" implementation is a lazy property which creates the
context once for the lifetime of the app. In that case you are reusing
the same context as Mundi said.
Now I don't understand. Are they saying I am using the same managed object context or I should use the same managed object context? If I am using the same one, how is it that I create a new one on each while loop? Or if I should be using just one global context, how do I do it without causing memory leaks?
Previously, I had declared the context in my View Controller, initialized it in viewDidLoad, passed it as a parameter to my utility class doing the inserts, and just used it for everything. After discovering the big memory leak is when I started just creating the context locally.
One of the other reasons I started creating the contexts locally is because the documentation said:
First, you should typically create a separate managed object context
for the import, and set its undo manager to nil. (Contexts are not
particularly expensive to create, so if you cache your persistent
store coordinator you can use different contexts for different working
sets or distinct operations.)
What is the standard way to use NSManagedObjectContext?

Now I don't understand. Are they saying I am using the same managed
object context or I should use the same managed object context? If I
am using the same one, how is it that I create a new one on each while
loop? Or if I should be using just one global context, how do I do it
without causing memory leaks?
Let's look at the first part of your code...
while (thereAreStillMoreObjectsToAdd) {
let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
managedObjectContext.undoManager = nil
Now, since it appears you are keeping your MOC in the App Delegate, it's likely that you are using the template-generated Core Data access code. Even if you are not, it is highly unlikely that your managedObjectContext access method is returning a new MOC each time it is called.
Your managedObjectContext variable is merely a reference to the MOC that is living in the App Delegate. Thus, each time through the loop, you are merely making a copy of the reference. The object being referenced is the exact same object each time through the loop.
Thus, I think they are saying that you are not using separate contexts, and I think they are right. Instead, you are using a new reference to the same context each time through the loop.
Now, your next set of questions have to do with performance. Your other post references some good content. Go back and look at it again.
What they are saying is that if you want to do a big import, you should create a separate context, specifically for the import (Objective C since I have not yet made time to learn Swift).
NSManagedObjectContext moc = [[NSManagedObjectContext alloc]
initWithConcurrencyType:NSPrivateQueueConcurrencyType];
You would then attach that MOC to the Persistent Store Coordinator. Using performBlock you would then, in a separate thread, import your objects.
The batching concept is correct. You should keep that. However, you should wrap each batch in an auto release pool. I know you can do it in swift... I'm just not sure if this is the exact syntax, but I think it's close...
autoreleasepool {
for item in array {
let newObject = NSEntityDescription.insertNewObjectForEntityForName ...
newObject.attribute1 = item.whatever
newObject.attribute2 = item.whoever
newObject.attribute3 = item.whenever
}
}
In pseudo-code, it would all look something like this...
moc = createNewMOCWithPrivateQueueConcurrencyAndAttachDirectlyToPSC()
moc.performBlock {
while(true) {
autoreleasepool {
objects = getNextBatchOfObjects()
if (!objects) { break }
foreach (obj : objects) {
insertObjectIntoMoc(obj, moc)
}
}
moc.save()
moc.reset()
}
}
If someone wants to turn that pseudo-code into swift, it's fine by me.
The autorelease pool ensures that any objects autoreleased as a result of creating your new objects are released at the end of each batch. Once the objects are released, the MOC should have the only reference to objects in the MOC, and once the save happens, the MOC should be empty.
The trick is to make sure that all object created as part of the batch (including those representing the imported data and the managed objects themselves) are all created inside the autorelease pool.
If you do other stuff, like fetching to check for duplicates, or have complex relationships, then it is possible that the MOC may not be entirely empty.
Thus, you may want to add the swift equivalent of [moc reset] after the save to ensure that the MOC is indeed empty.

This is a supplemental answer to #JodyHagins' answer. I am providing a Swift implementation of the pseudocode that was provided there.
let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is
managedObjectContext.performBlock { // runs asynchronously
while(true) { // loop through each batch of inserts
autoreleasepool {
let array: Array<MyManagedObject>? = getNextBatchOfObjects()
if array == nil { break }
for item in array! {
let newEntityObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
newObject.attribute1 = item.whatever
newObject.attribute2 = item.whoever
newObject.attribute3 = item.whenever
}
}
// only save once per batch insert
do {
try managedObjectContext.save()
} catch {
print(error)
}
managedObjectContext.reset()
}
}
These are some more resources that helped me to further understand how the Core Data stack works:
Core Data Stack in Swift – Demystified
My Core Data Stack

Related

Does managedObjectContext.object(with:) always refetch data if another (private) managedObjectContext changed and saved it?

(I'm sorry if this question is kind of confusing/imprecise. I'm just learning advanced CoreData usage and I don't know the terminology and stuff very well).
I have a singleton Game that holds certain data you need during the game. For example, you can access currentSite (Site is a CoreData Entity) from there to get the Site the user is currently at:
// I created the Site in a background queue (when the game started), then saved the objectID and here I load the objectID
public var currentSiteObjectID: NSManagedObjectID {
let objectIDuri = UserDefaults.standard.url(forKey: Keys.forGameObject.currentSiteObjectIDURI)!
return appDelegate.persistentContainer.persistentStoreCoordinator.managedObjectID(forURIRepresentation: objectIDuri)!
}
// managedObjectContext is the one running on the main queue
public var currentSite: Site! {
return managedObjectContext.object(with: currentSiteObjectID) as! Site
}
You see, I retrieve the currentSite by using the managedObjectContext.object(with:) method.
The documentation of this method says:
Returns the object for a specified ID.
If the object is not registered
in the context, it may be fetched or returned as a fault. (...)
I'm not quite sure about the following:
// Each Site has resources that you can access like this
print(Game.shared.currentSite!.resourceSet!.iron)
appDelegate.persistentContainer.performBackgroundTask { (context) in
let currentSite = context.object(with: Game.shared.currentSiteObjectID) as! Site
// Here I increase the iron resource
currentSite.resourceSet!.iron += 42
do {
try context.save()
} catch let error as NSError {
fatalError("\(error.debugDescription)")
}
DispatchQueue.main.async {
print(Game.shared.currentSite!.resourceSet!.iron)
}
}
The second print function is using the managedObjectContext of the main queue (which is different to the private one used inside performBackgroundTask {...}).
It actually does print:
50 // the start value
92
My question: Is it guaranteed that managedObjectContext.object(with:) returns the current object (that is up-to-date), even if it has been changed in another context? The documentation says that it will be fetched if it's a new object that's not known to the context.
But what if an object changes?
I'm not sure if it's just a coincidence that the example code from above is working like expected.
Thanks for any help/explanation! I'm eager to learn about this kind of stuff.
No it is not guaranteed. If managed object is already registered in context then it will return this object. What's more, if object with given id (NSManagedObjectId) doesn't exist in persistent store then your app will crash as soon as you try to use any of its properties.

iOS: Synchronizing access to CoreData

I'm new to CoreData and I'm trying to create a simple application.
Assume I have a function:
func saveEntry(entry: Entry) {
let moc = NSManagedObjectContext(concurrencyType: .NSPrivateQueueConcurrencyType)
moc.parentContext = savingContext
moc.pefrormBlockAndWait {
// find if MOC has entry
// if not => create
// else => update
// saving logic here
}
}
It can introduce a problem: if I call saveEntry from two threads, passing the same entry it will duplicate it. So I've added serial queue to my DB adapter and doing in following manner:
func saveEntry(entry: Entry) {
dispatch_sync(serialDbQueue) { // (1)
let moc = NSManagedObjectContext(concurrencyType: .NSPrivateQueueConcurrencyType)
moc.parentContext = savingContext
moc.pefrormBlockAndWait { // (2)
// find if MOC has entry
// if not => create
// else => update
// saving logic here
}
}
}
And it works fine, until I'd like to add another interface function:
func saveEntries(entries: [Entry]) {
dispatch_sync(serialDbQueue) { // (3)
let moc = NSManagedObjectContext(concurrencyType: .NSPrivateQueueConcurrencyType)
moc.parentContext = savingContext
moc.pefrormBlockAndWait {
entries.forEach { saveEntry($0) }
}
}
}
And now I have deadlock: 1 will be called on serialDbQueue and wait till saving finishes. 2 will be called on private queue and will wait for 3. And 3 is waiting for 1.
So what is correct way to handle synchronizing access? As far as I understand it's not safe to keep one MOC and perform saves on it because of reasons described here: http://saulmora.com/coredata/magicalrecord/2013/09/15/why-contextforcurrentthread-doesn-t-work-in-magicalrecord.html
I would try to implement this with a single NSManagedObjectContext as the control mechanism. Each context maintains a serial operation queue so multiple threads can call performBlock: or performBlockAndWait: without any danger of concurrent access (though you must be cautious of the context's data changing between the time the block is enqueued and when it eventually executes). As long as all work within the context is being done on the correct queue (via performBlock) there's no inherent danger in enqueuing work from multiple threads.
There are of course some complications to consider and I can't offer real suggestions without knowing much more about your app.
What object will be responsible for creating this context and how will it be made available to every object which needs it?
With a shared context it becomes difficult to know when work on that context is "finished" (it's operation queue is empty) if that represents a meaningful state in your app.
With a shared context it is more difficult to abandon changes should you you want to discard unsaved modifications in the event of an error (you'll need to actually revert those changes rather than simply discard the context without saving).

Memory leak with large Core Data batch insert in Swift

I am inserting tens of thousands of objects into my Core Data entity. I have a single NSManagedObjectContext and I am calling save() on the managed object context every time I add an object. It works but while it is running, the memory keeps increasing from about 27M to 400M. And it stays at 400M even after the import is finished.
There are a number of SO questions about batch insert and everyone says to read Efficiently Importing Data, but it's in Objective-C and I am having trouble finding real examples in Swift that solve this problem.
There are a few things you should change:
Create a separate NSPrivateQueueConcurrencyType managed object context and do your inserts asynchronously in it.
Don't save after inserting every single entity object. Insert your objects in batches and then save each batch. A batch size might be something like 1000 objects.
Use autoreleasepool and reset to empty the objects in memory after each batch insert and save.
Here is how this might work:
let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is
managedObjectContext.performBlock { // runs asynchronously
while(true) { // loop through each batch of inserts
autoreleasepool {
let array: Array<MyManagedObject>? = getNextBatchOfObjects()
if array == nil { break }
for item in array! {
let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
newObject.attribute1 = item.whatever
newObject.attribute2 = item.whoever
newObject.attribute3 = item.whenever
}
}
// only save once per batch insert
do {
try managedObjectContext.save()
} catch {
print(error)
}
managedObjectContext.reset()
}
}
Applying these principles kept my memory usage low and also made the mass insert faster.
Further reading
Efficiently Importing Data (old Apple docs link is broken. If you can find it, please help me add it.)
Core Data Performance
Core Data (General Assembly post)
Update
The above answer is completely rewritten. Thanks to #Mundi and #MartinR in the comments for pointing out a mistake in my original answer. And thanks to #JodyHagins in this answer for helping me understand and solve the problem.

iOS - Core Data Stack as singleton with main NSManagedObjectContext

I've seen many tutorials and they really help me with understand parent-child managed object context and other things related to this. I am ready to start using it in my app but I have a question. Why nobody use singleton for keeping main managed object context. I guess it would be much better to extract Core Data related objects from AppDelegate and set it to own class right? Something like in this Tutorial at raywenderlich.com. But they still instantiate CoreDataStack class (no problem with this, singleton must be instantiate too) and when it's need they set managedObjectContext in prepareForSegue (and set it to first view controller from AppDelegate). Why not to remove this need and just use singleton CoreDataStack and have possible to use managedObjectContext in each controller if needed?
Second and bonus question: I think it's better to have less code in controller and more in other classes. I think it helps with readability. So what if I move this code from controller and set it for example to CoreDataStack class or some other class that helps with Core Data requests and responses:
func surfJournalFetchRequest() -> NSFetchRequest {
let fetchRequest =
NSFetchRequest(entityName: "JournalEntry")
fetchRequest.fetchBatchSize = 20
let sortDescriptor =
NSSortDescriptor(key: "date", ascending: false)
fetchRequest.sortDescriptors = [sortDescriptor]
return fetchRequest
}
I know it's possible but is it better? If you get app codes from me would it be better if in controller it would be one line CoreDataStack.fetchRequest("JournalEntry", sortedKey: "date")?
And what about if I take this code and insert it to singleton and created function with closure? I would created child managed context in singleton and do needed operations in there and in controller I would just changed UI:
func exportCSVFile() {
navigationItem.leftBarButtonItem = activityIndicatorBarButtonItem()
let privateContext = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
privateContext.persistentStoreCoordinator = coreDataStack.context.persistentStoreCoordinator
privateContext.performBlock { () -> Void in
var fetchRequestError:NSError? = nil
let results = privateContext.executeFetchRequest(self.surfJournalFetchRequest(), error: &fetchRequestError)
if results == nil {
println("ERROR: \(fetchRequestError)")
}
let exportFilePath = NSTemporaryDirectory() + "export.csv"
let exportFileURL = NSURL(fileURLWithPath: exportFilePath)!
NSFileManager.defaultManager().createFileAtPath(exportFilePath, contents: NSData(), attributes: nil)
var fileHandleError: NSError? = nil
let fileHandle = NSFileHandle(forWritingToURL: exportFileURL, error: &fileHandleError)
if let fileHandle = fileHandle {
for object in results! {
let journalEntry = object as! JournalEntry
fileHandle.seekToEndOfFile()
let csvData = journalEntry.csv().dataUsingEncoding(NSUTF8StringEncoding, allowLossyConversion: false)
fileHandle.writeData(csvData!)
}
fileHandle.closeFile()
dispatch_async(dispatch_get_main_queue(), { () -> Void in
self.navigationItem.leftBarButtonItem =
self.exportBarButtonItem()
println("Export Path: \(exportFilePath)")
self.showExportFinishedAlertView(exportFilePath)
})
} else {
dispatch_async(dispatch_get_main_queue(), { () -> Void in
self.navigationItem.leftBarButtonItem = self.exportBarButtonItem()
println("ERROR: \(fileHandleError)")
})
}
}
}
I just want to be sure that my aproach would be okay and would be better than original. Thanks
I built my first core data app with a singleton pattern. It seemed logical for me because there is only one core data stack anyway. I was very wrong, the singleton pattern turned into a big mess quickly. I added more and more code to bend the singleton stack to something that works. In the end I gave up and I invested the time to replace the singleton mess with dependency injection.
Here are some of the problems I encountered before I dumped the singleton:
Since the app kept important data, my users requested a backup functionality. To restore from a backup I switched the sqlite file and then I would just create a new Core Data stack. Doing this in a clean way is next to impossible if you use a pull-pattern to get the managedObjectContext from a singleton. So my way to switch the Core Data stack was to tell the user that they have to restart the app. Followed by an exit(). Not the most elegant way to handle this.
After Apple added childContexts I decided to get rid of undo managers and context rollbacks, because that never worked 100% for me. But changing my editing viewControllers so they use child contexts which are discarded when the user hits cancel, was an incredible painful act because I now had a mix of singleton contexts and viewController local contexts in one viewController.
For editing the targets of relationships I had editViewControllers inside editViewController. Because I created the edit context inside the edit viewControllers I ended up saving data to the main context that shouldn't have been saved. It's a bit complicated to explain, but the second viewController saved stuff like new objects to the main context even if the user in the outer edit viewController hit cancel. Which always lead to orphaned objects. So I added more code to bend the singleton in a way that would make it less of a singleton.
I also had a CSV import function and I wanted to preview the data to the user before they press "Import". I build a totally new infrastructure for that. First I parsed the CSV into a data structure that basically duplicated my core data classes. Then I build a viewController to display these non core data classes, with even more code duplication. I would only start to create core data objects when the user pressed import.
After I got rid of the singleton pattern I could reuse the existing data display viewController. I would just give it a different context, in this case an in-memory context that contained the data that will be imported. Much cleaner, less duplicated code.
I guess some of these problems were not really the singletons fault. I was just very inexperienced.
But I still would strongly recommend against singleton core data.
would be one line CoreDataStack.fetchRequest("JournalEntry", sortedKey: "date")?
You don't need a singleton for this. Stuff like this should be in the NSManagedObject subclass you create for JournalEntry.
And what about if I take this code and insert it to singleton and created function with closure? I would created child managed context in singleton and do needed operations in there and in controller I would just changed UI:
And why don't you create a method that doesn't require internal state at all?
class func export(#context: NSManagedObjectContext, toCSVAtPath path: String,
progress: ((current: Int, totalCount: Int) -> Void)?,
completion: ((success: Bool, error: NSError?) -> Void)?) {
Much more flexible.

NSFetchedResultsController plus NSBatchUpdateRequest equals NSMergeConflict. What do I do wrong?

I got a NSFetchedResultsController that I set up using a NSManagedObjectContext. I perform a fetch using this context.
I have as well a NSBatchUpdateRequest that I set up using the same NSManagedObjectContext. I execute the request using the same NSManagedObjectContext.
When I perform the request with the NSBatchUpdateRequest, I can see that all my data have been updated.
If I restart the app, any fetch using NSFetchedResultsController is working as well.
THe problem is when I'm not restarting the app and that I do both operations one after one, I got a NSMergeConflict (0x17427a900) for NSManagedObject (0x1740d8d40) with objectID '0xd000000001b40000... error when I call the method save from my context.
I know that the problem comes from concurrent change on the same data but I don't know what is the solution? One might be to go through the NSMergePolicy class, but I doubt that's a clean way to solve my problem.
What should I do? Have two different contexts ? (how?)
Well it seems I might have found how to do it, but if you see anything wrong, please let me know.
When you do a batch update, you have the possibility to get as a result, whether nothing, the number of rows that were updated or a list of object IDs that were updated. You have to choose the last one.
Once you perform executeRequest from the context, you need to get the list of object IDs, loop through all of them to get every NSManagedObject into Faults thanks to the method objectWithID of the context object. If you don't know what Faults object are in Core Data, here is the explanation.
With every NSManagedObject you get, you need to refresh the context using its method refreshObject.
Once you've done that, you need to perform again the performFetch of your fetchedResultsController to come back to where you were before the batch update.
Tell me if I'm wrong somewhere.
Here is the code:
let batchUpdate = NSBatchUpdateRequest(entityName: "myEntity")
batchUpdate.propertiesToUpdate = ["myPropertieToUpdate" : currency.amountToCompute]
batchUpdate.affectedStores = managedContext.persistentStoreCoordinator?.persistentStores
batchUpdate.resultType = .UpdatedObjectIDsResultType
var batchError: NSError?
let batchResult = managedContext.executeRequest(batchUpdate, error: &batchError) as NSBatchUpdateResult?
if let result = batchResult {
println("Records updated \((result.result as [NSManagedObjectID]).count)")
// Extract Object IDs
let objectIDs = result.result as [NSManagedObjectID]
for objectID in objectIDs {
// Turn Managed Objects into Faults
let nsManagedObject: NSManagedObject = managedContext.objectWithID(objectID)
if let managedObject = nsManagedObject as NSManagedObject? {
managedContext.refreshObject(managedObject, mergeChanges: false)
}
}
// Perform Fetch
var error: NSError? = nil
if !fetchedResultsController.performFetch(&error) {
println("error: + \(error?.localizedDescription), \(error!.userInfo)")
}
} else {
println("Could not update \(batchError), \(batchError!.userInfo)")
}
EDIT:
Here are two links for more explanations:
http://code.tutsplus.com/tutorials/ios-8-core-data-and-batch-updates--cms-22164
http://www.bignerdranch.com/blog/new-in-core-data-and-ios-8-batch-updating/

Resources