Moving CoreData data blob into separate object - ios

I am moving an NSData property out of a Core Data object and into a separate object, so self.pdfData becomes self.pdf.data. Does this look like the right approach to manage creation and deletion of the secondary object?
- (void)setPdfData:(NSData *)pdfData
{
    if (!pdfData) {
        if (self.pdf) {
            [self.managedObjectContext deleteObject:self.pdf];
            self.pdf = nil;
        }
    }
    else {
        if (!self.pdf) {
            self.pdf = [BaseFormPDF insertInManagedObjectContext:self.managedObjectContext];
        }
        self.pdf.data = pdfData;
    }
}
- (NSData *)pdfData
{
    return self.pdf.data;
}

Yes, this is a good approach.
1) By moving the data to a separate entity, you can fetch the main entity without loading the large data into memory.
2) Pseudo-properties on managed objects are really cool and work very well for things like this. But I would be wary of doing too much in a setter. In this case I think it is OK, but doing more can cause issues. If a programmer just sets thing.pdfData = data and lots of things happen that the programmer didn't expect, that could cause bugs.
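For readers working in Swift, the same pseudo-property can be sketched roughly as below. This is only an illustration, not the asker's code: the Swift class declarations, the entity names "BaseForm" and "BaseFormPDF", and the use of NSEntityDescription.insertNewObject in place of the generated insertInManagedObjectContext: helper are all assumptions.

```swift
import CoreData

// Hypothetical Swift counterparts of the two entities in the question.
@objc(BaseFormPDF)
final class BaseFormPDF: NSManagedObject {
    @NSManaged var data: Data?
}

@objc(BaseForm)
final class BaseForm: NSManagedObject {
    @NSManaged var pdf: BaseFormPDF?

    /// Pseudo-property that forwards to the related `pdf` object.
    var pdfData: Data? {
        get { return pdf?.data }
        set {
            guard let newData = newValue else {
                // Setting nil deletes the secondary object, as in the original setter.
                if let existing = pdf {
                    managedObjectContext?.delete(existing)
                    pdf = nil
                }
                return
            }
            // Create the secondary object lazily, only when there is data to store.
            if pdf == nil, let context = managedObjectContext {
                pdf = NSEntityDescription.insertNewObject(
                    forEntityName: "BaseFormPDF", into: context) as? BaseFormPDF
            }
            pdf?.data = newData
        }
    }
}
```

The setter is deliberately limited to creating, updating, or deleting the one related object, which keeps it within what a caller of thing.pdfData = data would reasonably expect.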


Coredata duplicates all objects in a list on update

I've come across an issue which rarely happens, and (of course) it works perfectly when I test it myself. It has only happened for a few users, and I know I have at least a couple hundred who use the same App daily.
The issue
When updating a list of coredata objects in a tableview, it not only updates the objects (correctly), it also creates duplicates of all these objects.
Coredata setup
It's a NSPersistentCloudKitContainer with these settings:
container.viewContext.automaticallyMergesChangesFromParent = true
container.viewContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
Tableview setup
I have a tableview which displays a list of the 'ActivityType' objects. It's very simple, they have a name (some other basic string/int properties), and an integer called 'index'. This 'index' exists so that users can change the order in which they should be displayed.
Here is some code for how each row is setup:
for activityType in activityTypes {
    row = BlazeRow()
    row.title = activityType.name
    row.cellTapped = {
        self.selectedActivityType(activityType)
    }
    row.object = activityType
    row.cellReordered = { (index) in
        self.saveNewOrder()
    }
    section.addRow(row)
}
As you can see, it has 2 methods. One for selecting the activity which shows its details in a new viewcontroller, and one which is called whenever the order is changed.
Here's the method that is called whenever the order is changed:
func saveNewOrder() {
    Thread.printCurrent()
    let section = self.tableArray[0] as! BlazeSection
    for (index, row) in section.rows.enumerated() {
        let blazeRow = row as! BlazeRow
        let object = blazeRow.object as! ActivityType
        object.index = Int32(index)
    }
    BDGCoreData.saveContext()
}
And here's the code that saves the context (I use a singleton to easily access the viewcontext):
class func saveContext(context: NSManagedObjectContext = BDGCoreData.viewContext) {
    if context.hasChanges {
        do {
            try context.save()
        } catch {
            let nserror = error as NSError
            fatalError("Unresolved error \(nserror), \(nserror.userInfo)")
        }
    }
}
Now, I swear to god it never runs the code in this view controller that creates a new object:
let activity = ActivityType(context: BDGCoreData.viewContext)
I know how that works, and it truly is only called in a completely different view controller. I searched my entire project for it again just in case, and it really is never called in any other place.
But somehow, in very rare cases, it saves the correct new order but also duplicates all objects in this list. Since it only happens rarely, I thought it might have something to do with threads. That is why, as you can see in the code, I printed out the current thread, but at least when testing on my device, it seems to be on the main thread.
I'm truly stumped. I have a pretty good understanding of Core Data, and the app itself is quite complex, full of objects with different kinds of relationships.
But why does this happen? I have no clue...
Does anyone have an idea?

Where should NSManagedObjectContext be created?

I've recently been learning about Core Data and specifically how to do inserts with a large number of objects. After learning how to do this and solving a memory leak problem that I met, I wrote the Q&A Memory leak with large Core Data batch insert in Swift.
After changing NSManagedObjectContext from a class property to a local variable and also saving inserts in batches rather than one at a time, it worked a lot better. The memory problem cleared up and the speed improved.
The code I posted in my answer was
let batchSize = 1000
// do some sort of loop for each batch of data to insert
while (thereAreStillMoreObjectsToAdd) {
    // get the Managed Object Context
    let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
    managedObjectContext.undoManager = nil // if you don't need to undo anything
    // get the next 1000 or so data items that you want to insert
    let array = nextBatch(batchSize) // your own implementation
    // insert everything in this batch
    for item in array {
        // parse the array item or do whatever you need to get the entity attributes for the new object you are going to insert
        // ...
        // insert the new object
        let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
        newObject.attribute1 = item.whatever
        newObject.attribute2 = item.whoever
        newObject.attribute3 = item.whenever
    }
    // save the context
    do {
        try managedObjectContext.save()
    } catch {
        print(error)
    }
}
This method seems to be working well for me. The reason I am asking a question here, though, is two people (who know a lot more about iOS than I do) made comments that I don't understand.
@Mundi said:
It seems in your code you are using the same managed object context,
not a new one.
@MartinR also said:
... the "usual" implementation is a lazy property which creates the
context once for the lifetime of the app. In that case you are reusing
the same context as Mundi said.
Now I don't understand. Are they saying I am using the same managed object context or I should use the same managed object context? If I am using the same one, how is it that I create a new one on each while loop? Or if I should be using just one global context, how do I do it without causing memory leaks?
Previously, I had declared the context in my View Controller, initialized it in viewDidLoad, passed it as a parameter to my utility class doing the inserts, and just used it for everything. It was only after discovering the big memory leak that I started creating the context locally.
One of the other reasons I started creating the contexts locally is because the documentation said:
First, you should typically create a separate managed object context
for the import, and set its undo manager to nil. (Contexts are not
particularly expensive to create, so if you cache your persistent
store coordinator you can use different contexts for different working
sets or distinct operations.)
What is the standard way to use NSManagedObjectContext?
Now I don't understand. Are they saying I am using the same managed
object context or I should use the same managed object context? If I
am using the same one, how is it that I create a new one on each while
loop? Or if I should be using just one global context, how do I do it
without causing memory leaks?
Let's look at the first part of your code...
while (thereAreStillMoreObjectsToAdd) {
    let managedObjectContext = (UIApplication.sharedApplication().delegate as! AppDelegate).managedObjectContext
    managedObjectContext.undoManager = nil
Now, since it appears you are keeping your MOC in the App Delegate, it's likely that you are using the template-generated Core Data access code. Even if you are not, it is highly unlikely that your managedObjectContext access method is returning a new MOC each time it is called.
Your managedObjectContext variable is merely a reference to the MOC that is living in the App Delegate. Thus, each time through the loop, you are merely making a copy of the reference. The object being referenced is the exact same object each time through the loop.
Thus, I think they are saying that you are not using separate contexts, and I think they are right. Instead, you are using a new reference to the same context each time through the loop.
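To make the "new reference, same object" point concrete, here is a small sketch that does not touch Core Data at all. FakeContext and FakeAppDelegate are made-up stand-ins for NSManagedObjectContext and the template app delegate; the only thing being shown is that reading a lazy property inside a loop hands back the same instance every time.

```swift
// Stand-ins for NSManagedObjectContext and the template app delegate (hypothetical names).
final class FakeContext {}

final class FakeAppDelegate {
    // Created once, on first access, then reused for the lifetime of the delegate.
    lazy var managedObjectContext = FakeContext()
}

let appDelegate = FakeAppDelegate()
var identities: [ObjectIdentifier] = []

for _ in 0..<3 {
    // A new local constant on each pass, but it refers to the same object.
    let managedObjectContext = appDelegate.managedObjectContext
    identities.append(ObjectIdentifier(managedObjectContext))
}

// Count how many different objects the loop actually saw.
let distinctContexts = Set(identities).count
print(distinctContexts) // prints 1
```

Getting a genuinely separate context requires calling an initializer, not just reading the property again.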
Now, your next set of questions have to do with performance. Your other post references some good content. Go back and look at it again.
What they are saying is that if you want to do a big import, you should create a separate context, specifically for the import (Objective C since I have not yet made time to learn Swift).
NSManagedObjectContext *moc = [[NSManagedObjectContext alloc]
    initWithConcurrencyType:NSPrivateQueueConcurrencyType];
You would then attach that MOC to the Persistent Store Coordinator. Using performBlock you would then, in a separate thread, import your objects.
The batching concept is correct. You should keep that. However, you should wrap each batch in an autorelease pool. I know you can do it in Swift... I'm just not sure if this is the exact syntax, but I think it's close...
autoreleasepool {
    for item in array {
        let newObject = NSEntityDescription.insertNewObjectForEntityForName ...
        newObject.attribute1 = item.whatever
        newObject.attribute2 = item.whoever
        newObject.attribute3 = item.whenever
    }
}
In pseudo-code, it would all look something like this...
moc = createNewMOCWithPrivateQueueConcurrencyAndAttachDirectlyToPSC()
moc.performBlock {
    while(true) {
        autoreleasepool {
            objects = getNextBatchOfObjects()
            if (!objects) { break }
            foreach (obj : objects) {
                insertObjectIntoMoc(obj, moc)
            }
        }
        moc.save()
        moc.reset()
    }
}
If someone wants to turn that pseudo-code into swift, it's fine by me.
The autorelease pool ensures that any objects autoreleased as a result of creating your new objects are released at the end of each batch. Once the objects are released, the MOC should have the only reference to objects in the MOC, and once the save happens, the MOC should be empty.
The trick is to make sure that all objects created as part of the batch (including those representing the imported data and the managed objects themselves) are created inside the autorelease pool.
If you do other stuff, like fetching to check for duplicates, or have complex relationships, then it is possible that the MOC may not be entirely empty.
Thus, you may want to add the swift equivalent of [moc reset] after the save to ensure that the MOC is indeed empty.
This is a supplemental answer to @JodyHagins' answer. I am providing a Swift implementation of the pseudocode that was provided there.
let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is
managedObjectContext.performBlock { // runs asynchronously
    var hasMoreBatches = true
    while hasMoreBatches { // loop through each batch of inserts
        autoreleasepool {
            // `break` is not allowed inside the autoreleasepool closure, so set a flag instead
            guard let array: Array<MyManagedObject> = getNextBatchOfObjects() else {
                hasMoreBatches = false
                return
            }
            for item in array {
                let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
                newObject.attribute1 = item.whatever
                newObject.attribute2 = item.whoever
                newObject.attribute3 = item.whenever
            }
        }
        // only save once per batch insert
        do {
            try managedObjectContext.save()
        } catch {
            print(error)
        }
        managedObjectContext.reset()
    }
}
These are some more resources that helped me to further understand how the Core Data stack works:
Core Data Stack in Swift – Demystified
My Core Data Stack

Memory leak with large Core Data batch insert in Swift

I am inserting tens of thousands of objects into my Core Data entity. I have a single NSManagedObjectContext and I am calling save() on the managed object context every time I add an object. It works but while it is running, the memory keeps increasing from about 27M to 400M. And it stays at 400M even after the import is finished.
There are a number of SO questions about batch insert and everyone says to read Efficiently Importing Data, but it's in Objective-C and I am having trouble finding real examples in Swift that solve this problem.
There are a few things you should change:
Create a separate NSPrivateQueueConcurrencyType managed object context and do your inserts asynchronously in it.
Don't save after inserting every single entity object. Insert your objects in batches and then save each batch. A batch size might be something like 1000 objects.
Use autoreleasepool and reset to empty the objects in memory after each batch insert and save.
Here is how this might work:
let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is
managedObjectContext.performBlock { // runs asynchronously
    var hasMoreBatches = true
    while hasMoreBatches { // loop through each batch of inserts
        autoreleasepool {
            // `break` is not allowed inside the autoreleasepool closure, so set a flag instead
            guard let array: Array<MyManagedObject> = getNextBatchOfObjects() else {
                hasMoreBatches = false
                return
            }
            for item in array {
                let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
                newObject.attribute1 = item.whatever
                newObject.attribute2 = item.whoever
                newObject.attribute3 = item.whenever
            }
        }
        // only save once per batch insert
        do {
            try managedObjectContext.save()
        } catch {
            print(error)
        }
        managedObjectContext.reset()
    }
}
Applying these principles kept my memory usage low and also made the mass insert faster.
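The code above uses Swift 2 era API names. In current Swift the same three principles (separate private-queue context, save per batch, autoreleasepool plus reset) might look something like the sketch below. It assumes the app uses an NSPersistentContainer, which the original answer does not; importInBatches, its completion parameter, and the use of plain strings as input are hypothetical, while "MyEntity" and "attribute1" are the placeholder names from the answer.

```swift
import CoreData

/// Inserts `items` in batches on a background context supplied by the container.
/// `completion` runs on the background queue once every batch has been saved.
func importInBatches(_ items: [String],
                     batchSize: Int = 1000,
                     into container: NSPersistentContainer,
                     completion: @escaping () -> Void = {}) {
    // performBackgroundTask creates a fresh private-queue context for this block.
    container.performBackgroundTask { context in
        context.undoManager = nil // no undo support needed for a bulk import

        var start = 0
        while start < items.count {
            let end = min(start + batchSize, items.count)
            autoreleasepool {
                for item in items[start..<end] {
                    let object = NSEntityDescription.insertNewObject(
                        forEntityName: "MyEntity", into: context)
                    object.setValue(item, forKey: "attribute1")
                }
            }
            // Save once per batch, then drop the saved objects from memory.
            do {
                try context.save()
            } catch {
                print(error)
            }
            context.reset()
            start = end
        }
        completion()
    }
}
```

Because the loop is driven by an index rather than by a "no more data" check inside the pool, there is no need to break out of the autoreleasepool closure.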
Further reading
Efficiently Importing Data (old Apple docs link is broken. If you can find it, please help me add it.)
Core Data Performance
Core Data (General Assembly post)
Update
The above answer is completely rewritten. Thanks to @Mundi and @MartinR in the comments for pointing out a mistake in my original answer. And thanks to @JodyHagins in this answer for helping me understand and solve the problem.

Wait on thread started from another class - Objective C

I have encountered a threading issue I cannot solve. I want to perform a large Core Data save operation of about 12000 objects on a separate thread in a certain class, and in another class control a button action in relation with the save operation being finished. What is the best approach on this?
This is what the save operation looks like:
Class A
-(void) saveAsync
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^
    {
        //arrayOfObjects holds the 12000 objects
        for(aClass *object in arrayOfObjects)
        {
            [self saveToCoreData: object];
        }
        NSLog(@"Finished saving");
    });
}
-(void) saveToCoreData : (aClass *) object
{
    //perform save operation here
}
And this is the action method on my button (which is really nothing yet)
Class B
-(IBAction) buttonActionMethod
{
    //take different actions depending on the objects being persisted to the store or still saving
}
I am asking for a solution that would allow me to know if the objects are saved at a button press. The code I provided is just a raw example to express the idea, I don't expect it to work like that. I have thought of using NSOperationQueue or create threads or use groups, but I have not found a solution that works.
Thank you in advance!
You need to store a BOOL on Class A that indicates whether a save is in progress,
then check that BOOL from Class B and take different actions accordingly.
>> Class A
@property (atomic) BOOL isSaving;
- (void)saveAsync
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^ {
        //arrayOfObjects holds the 12000 objects
        self.isSaving = YES;
        for(aClass *object in arrayOfObjects) {
            [self saveToCoreData:object];
        }
        self.isSaving = NO;
        NSLog(@"Finished saving");
    });
}
- (void)saveToCoreData:(aClass *)object
{
    //perform save operation here
}
>> Class B
- (IBAction)buttonActionMethod
{
    // take different actions depending on the objects being persisted to the store or still saving
    if (classA.isSaving) {
    } else {
    }
}
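A Swift sketch of the same flag idea follows. It is not a drop-in translation: Saver, its generic saveAsync signature, and the completion callback are hypothetical, and the per-object save is passed in as a closure so the example stays independent of Core Data. One deliberate difference from the Objective-C version is that the flag is set before the work is dispatched, so a button press that arrives between the call and the start of the background block still sees isSaving as true.

```swift
import Foundation

/// Hypothetical stand-in for "Class A": runs a save in the background
/// and exposes whether it is still in progress.
final class Saver {
    private let lock = NSLock()
    private var saving = false

    /// Thread-safe read of the current state, for use by "Class B".
    var isSaving: Bool {
        lock.lock()
        defer { lock.unlock() }
        return saving
    }

    private func setSaving(_ value: Bool) {
        lock.lock()
        saving = value
        lock.unlock()
    }

    /// `save` stands for the per-object persistence call.
    /// `completion` is invoked on the background queue when all objects are saved.
    func saveAsync<T>(_ objects: [T],
                      save: @escaping (T) -> Void,
                      completion: @escaping () -> Void = {}) {
        setSaving(true) // set before dispatching, not inside the block
        DispatchQueue.global().async {
            for object in objects {
                save(object)
            }
            self.setSaving(false)
            completion()
        }
    }
}
```

The button action can then branch on saver.isSaving, and any UI update triggered from completion should first hop back to the main queue.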
You can subscribe to NSManagedObjectContextDidSaveNotification:
Posted whenever a managed object context completes a save operation.
The notification object is the managed object context. The userInfo
dictionary contains the following keys: NSInsertedObjectsKey,
NSUpdatedObjectsKey, and NSDeletedObjectsKey.
I strongly recommend not using any class variables for this, as they would store state that is not really part of the class.
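As a rough Swift sketch of subscribing to that notification: observeSaves and its parameters are made-up names, and reporting only the number of inserted objects is just one way to use the userInfo dictionary described in the quote.

```swift
import CoreData

/// Calls `onSave` with the number of inserted objects each time `context` finishes a save.
/// Keep the returned token and pass it to NotificationCenter.removeObserver(_:) when done.
func observeSaves(of context: NSManagedObjectContext,
                  queue: OperationQueue? = .main,
                  onSave: @escaping (Int) -> Void) -> NSObjectProtocol {
    return NotificationCenter.default.addObserver(
        forName: .NSManagedObjectContextDidSave,
        object: context, // only listen to this particular context
        queue: queue
    ) { notification in
        // The inserted objects arrive as a set under NSInsertedObjectsKey.
        let inserted = notification.userInfo?[NSInsertedObjectsKey] as? Set<NSManagedObject>
        onSave(inserted?.count ?? 0)
    }
}
```

With the default main queue, the handler is a reasonable place to enable the button or update other UI once the background save has finished.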

AFNetworking response object mapping

I am trying out AFNetworking after coming from RestKit and I am wondering if there's a simpler solution than having to do this for all of my objects:
- (id)initWithAttributes:(NSDictionary *)attributes {
    // ... init
    _userID = [[attributes valueForKeyPath:@"id"] integerValue];
    _username = [attributes valueForKeyPath:@"username"];
    _avatarImageURLString = [attributes valueForKeyPath:@"avatar_image.url"];
    return self;
}
This is how things appear in the AFNetworking basics example. But this looks like it might get cumbersome as soon as I'm returning objects with children. Is there an easier way for me to do this?
I saw that there is the AFIncrementalStore as referenced in this SO question. But I do not want any of the information retrieved to persist beyond the current session so I didn't think that was the right thing to do.
Thanks in advance for any help.
