Core Data double-inserting child records in one-to-many association - ios

We have an iOS application that uses Core Data to persist records fetched from a private web API. One of our API requests fetches a list of Project records, each of which has multiple associated Location records. ObjectMapper is used to deserialize the JSON response, and we have a custom transformer that assigns the nested Location attributes to a Core Data association on the Project entity.
The relevant part of the code looks like this. It's executed within a PromiseKit promise (hence the seal), and we save first to a background context and then propagate to the main context that gets used on the UI thread.
WNManagedObjectController.backgroundContext.perform {
    let project = Mapper<Project>().map(JSONObject: JSON(json).object)!
    try! WNManagedObjectController.backgroundContext.save()
    WNManagedObjectController.managedContext.performAndWait {
        do {
            try WNManagedObjectController.managedContext.save()
            seal.fulfill(project.objectID)
        } catch {
            seal.reject(error)
        }
    }
}
The problem we're having is that this insert process is saving each Location record to the database twice. Strangely, the duplicated Location records don't have any association with their parent Project record. That is to say, if Location records are looked up with an NSFetchRequest, or if I run a query on the underlying SQLite database, I can see that there are two entries for each Location, but project.locations only returns one copy of each Location. The same (or very similar) process applied to other record types with the same structure also results in duplicates.
I've tried several things so far to narrow down the problem:
Inspected the API JSON - no duplicates.
Inspected the state of the project.locations property immediately before the Core Data write. No duplicate records are present prior to the objects being persisted, indicating that the deserializer and custom nested attributes transformer are working correctly.
Removed the block that propagates the changes to the main thread managed object context, in case this was causing the insert to occur twice. I still get duplicates with only the write to the background context.
Ran the app with com.apple.CoreData.ConcurrencyDebug 1 set. No exception is thrown in this process, which suggests that it's not a thread safety issue of some kind.
Ran the app with com.apple.CoreData.SQLDebug 1 set. I can see in the logs that Core Data is inserting exactly twice as many Location rows as expected into the underlying SQLite database.
Implemented a uniqueness constraint on the entity. This fixes the problem in terms of what data gets persisted, but will still throw an error unless an NSMergePolicy is set.
The last item in that list effectively solves the problem, but it's treating the symptom, not the cause. Data integrity is important for our application, and I'm looking to understand what the underlying problem might be, or other options I might pursue for investigating it further.
Thanks!

A year and eight months later, I finally got to the bottom of this bug when a similar issue occurred with a different set of records. The problem was that I was calling ObjectMapper on each Location object twice. I was using ObjectMapper's mapArray method within a custom ObjectMapper TransformType to deserialize and persist the Location records associated with each Project, which worked as follows:
let locations = Mapper<Location>().mapArray(JSONObject: value as AnyObject)
However, what I had overlooked is that I was also overriding the constructor for Location and calling ObjectMapper again there:
required public init?(map: Map) {
    let entity = NSEntityDescription.entity(forEntityName: "Location", in: WNManagedObjectController.backgroundContext)
    super.init(entity: entity!, insertInto: WNManagedObjectController.backgroundContext)
    mapping(map: map)
}
The line mapping(map: map) was unnecessary, and proved to be the culprit. In a similar scenario with two levels of one-to-many associations, this had the somewhat amusing consequence of quadrupling (!) the records at the second level - their parents had been duplicated, each copy of which subsequently duplicated its children. This was what ultimately led me to the cause of the bug.
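The mechanism can be reproduced in a toy model without Core Data or ObjectMapper. Everything in the sketch below (ToyContext, ToyParent, toyMap) is a hypothetical stand-in rather than either library's real API, and it assumes that the mapper calls mapping itself once the initializer has returned, which is my understanding of how ObjectMapper behaves.

```swift
// Toy model of the bug. All types here are hypothetical stand-ins,
// not ObjectMapper's or Core Data's real API.
final class ToyContext {
    // Counts inserted child "rows", as a managed object context would hold them.
    private(set) var insertedChildren = 0

    func insertChild(named name: String) -> String {
        insertedChildren += 1
        return name
    }
}

final class ToyParent {
    private(set) var children: [String] = []

    init(json: [String], context: ToyContext, callsMappingInInit: Bool) {
        if callsMappingInInit {
            // The redundant call: the mapper below performs it again.
            mapping(json: json, context: context)
        }
    }

    func mapping(json: [String], context: ToyContext) {
        // Each pass inserts fresh children and overwrites the relationship,
        // so children from an earlier pass stay in the context as orphans.
        children = json.map { context.insertChild(named: $0) }
    }
}

// Stand-in for the mapper: construct the object, then map it.
func toyMap(_ json: [String], into context: ToyContext, redundantCall: Bool) -> ToyParent {
    let parent = ToyParent(json: json, context: context, callsMappingInInit: redundantCall)
    parent.mapping(json: json, context: context)
    return parent
}
```

With the redundant call in place, the context ends up holding twice as many children as the parent's relationship references, which matches the symptom in the question: duplicated rows that are not associated with any parent.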

Related

Why are Core Data NSManagedObject faults fired upon deletion?

I'm trying to efficiently batch delete a lot of NSManagedObjects (without using an NSBatchDeleteRequest). I have been following the general procedure in this answer (adapted to Swift), by batching an operation which requests objects, deletes, saves and then resets the context. My fetch request sets includesPropertyValues to false.
However, when this runs, at the point where each object is deleted from the context, the fault is fired. Adding logging as follows:
// Fetch one object without property values
let f = NSFetchRequest<NSManagedObject>(entityName: "Entity")
f.includesPropertyValues = false
f.fetchLimit = 1
// Get the result from the fetch. This will be a fault
let firstEntity = try! context.fetch(f).first!
// Delete the object, watch whether the object is a fault before and after
print("pre-delete object is fault: \(firstEntity.isFault)")
context.delete(firstEntity)
print("post-delete object is fault: \(firstEntity.isFault)")
yields the output:
pre-delete object is fault: true
post-delete object is fault: false
This occurs even when there are no overrides of any CoreData methods (willSave(), prepareForDeletion(), validateForUpdate(), etc). I can't figure out what else could be causing these faults to fire.
Update: I've created a simple example in a Swift playground. This has a single entity with a single attribute, and no relationships. The playground deletes the managed object on the main thread, from the viewContext of an NSPersistentContainer, and demonstrates that the object's isFault property changes from true to false.
I think an authoritative answer would require a look at the Core Data source code. Since that's not likely to be forthcoming, here are some reasons I can think of that this might be necessary.
For entities that have relationships, it's probably necessary to examine the relationship to handle delete rules and maintain data integrity. For example if the delete rule is "cascade", it's necessary to fire the fault to figure out what related instances should be deleted. If it's "nullify", fire the fault to figure out which related instances need to have their relationship value set to nil.
In addition to the above, entities with relationships need to have validation checks performed on related instances. For example if you delete an object with a relationship that uses the "nullify" delete rule, and the inverse relationship is not optional, you would fail the validation check on the inverse relationship. Checking this likely triggers firing the fault.
Binary attributes can have data automatically stored in external files (the "allows external storage" option). In order to clean up the external file, it's probably necessary to fire the fault, in order to know which file to delete.
I think all of these could probably be optimized away. For example, don't fire faults if the entity has no relationships and has no attributes that use external storage. However, this is looking from the outside without access to source code. There might be other reasons that require firing the fault. That seems likely. Or it could be that nobody has attempted this optimization, for whatever reason. That seems less likely but is possible.
BTW I forked your playground code to get a version that doesn't rely on an external data model file, but instead builds the model in code.
Tom Harrington has explained it best. Core Data's internal implementation apparently needs to fire the fault when marking an object for removal from the persistent store, just as it would if you were accessing a property of the object. As explained in this answer, "An NSManagedObject is always dynamically rendered. Hence, if it is deleted, Core Data faults out the data".
This seems to be the normal behaviour, at least for the time being, and not really an issue.

Core Data fetch predicate nil check failing/unexpected results?

I have a Core Data layer with several thousand entities, constantly syncing to a server. The sync process uses fetch requests to check for deleted_at for the purposes of soft-deletion. There is a single context performing save operations in a performBlockAndWait call. The relationship mapping is handled by the RestKit library.
The CoreDataEntity class is a subclass of NSManagedObject, and it is also the superclass for all our different core data object classes. It has some attributes that are inherited by all our entities, such as deleted_at, entity_id, and all the boilerplate fetch and sync methods.
My issue is some fetch requests seem to return inconsistent results after modifications to the objects. For example after deleting an object (setting deleted_at to the current date):
[CoreDataEntity fetchEntitiesWithPredicate:[NSPredicate predicateWithFormat:@"deleted_at==nil"]];
Returns results with deleted_at == [NSDate today]
I have successfully worked around this behavior by additionally looping through the results and removing the entities with deleted_at set, however I cannot fix the converse issue:
[CoreDataEntity fetchEntitiesWithPredicate:[NSPredicate predicateWithFormat:@"deleted_at!=nil"]];
Is returning an empty array in the same conditions, preventing a server sync from succeeding.
I have confirmed deleted_at is set on the object, and the context save was successful. I just don't understand where to reset whatever cache is causing the outdated results?
Thanks for any help!
Edit: Adding a little more information, it appears that once one of these objects becomes corrupted, the only way to get it to register is to modify the value again. Could this be some sort of Core Data index not updating when a value is modified?
Update: It appears to be a problem with RestKit https://github.com/RestKit/RestKit/issues/2218
You are apparently using some syntactic sugar extension to Core Data. I suppose that in your case it is SheepData, right?
fetchEntitiesWithPredicate: is implemented there as follows:
+ (NSArray*)fetchEntitiesWithPredicate:(NSPredicate*)aPredicate
{
    return [self fetchEntitiesWithPredicate:aPredicate inContext:[SheepDataManager sharedInstance].managedObjectContext];
}
Are you sure that [SheepDataManager sharedInstance].managedObjectContext receives all the changes that you are making to your objects? Does it receive save notifications, or is it a child context of your saving context?
Try to replace your fetch one-liner with this:
[<your saving context> performBlockAndWait:^{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"CoreDataEntity"];
    request.predicate = [NSPredicate predicateWithFormat:@"deleted_at==nil"];
    NSArray *results = [<your saving context> executeFetchRequest:request error:NULL];
}];
First, after a save, have you looked in the store to make sure your changes are there? Without seeing your entire Core Data stack it is difficult to get a solid understanding of what might be going wrong. If you are saving and you see the changes in the store, then the question turns to your contexts: how are they built, and when? If you are dealing with sibling contexts, that could be causing your issue.
More detail is required as to how your core data stack looks.
Yes, the changes are there. As I mentioned in the question, I can loop through my results and remove all those with deleted_at set successfully
That wasn't my question. There is a difference between looking at objects in memory and looking at them in the SQLite file on disk. The questions I have about this behavior are:
Are the changes being persisted to disk before you query for them again?
Are you working with multiple contexts and potentially trying to fetch from a stale sibling?
Thus my questions about on disk changes and what your core data stack looks like.
Threading
If you are using one context, are you using more than one thread in your app? If so, are you using that context on more than one thread?
I can see a situation where if you are violating the thread confinement rules you can be corrupting data like this.
Try adding an extra attribute deleted that is a bool with a default of false. Then the attribute is always set and you can look for entities that are either true or false depending on your needs at the moment. If the value is true then you can look at deleted_at to find out when.
Alternatively try setting the deleted_at attribute to some old date (like perhaps 1 Jan 1980), then anything that isn't deleted will have a fixed date that is too old to have been set by the user.
Edit: There is likely some issue with deleted_at having never been touched on some entities that is confusing the system. It is also possible that you have set the fetch request to return results in the dictionary style in which case recent changes will not be reflected in the fetch results.

How to implement the new Core Data model builder 'unique' property in iOS 9.0 Beta

In the WWDC15 video session 'What's New in Core Data', at 10:45 into the presentation, the Apple engineer describes a new feature of the model builder that allows you to specify unique properties. Once you set those unique properties, Core Data will not create a duplicate object with that property. This is supposed to eliminate the need to check whether an identical object exists before you create a new object.
I have been experimenting with this but have no luck preventing the creation of new objects with identical 'unique' properties (duplicate objects). Other than the 5 minute video explanation, I have not found any other information describing how to use this feature.
Does anyone have any experience implementing the 'unique' property attribute in the Core Data Model?
Short answer:
You'll need to add this line to your Core Data stack setup code:
managedObjectContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
Long answer: I struggled with this for some time, but I think I have figured it out now:
Unique Constraints (UC) do not prevent creating duplicates in a context. Only when you try to save that context, Core Data checks for the uniqueness of the UCs.
If it finds more than one object with the same value for a UC, the default behaviour is to throw an error because the default merge policy for conflicts is NSErrorMergePolicyType. The error contains the conflicting objects in its userInfo.conflictList, so you could manually resolve the conflict.
But most of the time you probably want to use one of the other merge policies instead and let Core Data merge the conflicts automatically. These merge policies did exist before, they are used for conflicts between objects in different contexts. Maybe that's why they were not mentioned in the context of the UC feature at WWDC Session 220. Usually the right choice is NSMergeByPropertyObjectTrumpMergePolicy. It basically says "new data trumps old data", which is what you want in the common scenario when you import new data from external sources.
(Tip: First I had problems verifying this behaviour, because the duplicate objects seem to remain in the context until the save operation is finished - which in my case happened asynchronously in a background queue. So if you fetch/count your objects right after hitting the save button, you might still see the duplicates.)
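As a rough illustration of the points above, here is a minimal sketch that builds a one-entity model in code, declares a unique constraint, and saves two objects with the same value under NSMergeByPropertyObjectTrumpMergePolicy. The entity name "Tag", the attribute "name", the container name and the helper functions are all assumptions made for the example. The store is an SQLite store pointed at /dev/null so nothing is written to disk; I am assuming an SQLite-backed store because that is where I understand unique constraints to be enforced.

```swift
import CoreData

// Model with a single entity whose "name" attribute must be unique.
// "Tag" and "name" are hypothetical names chosen for this sketch.
func makeTagModel() -> NSManagedObjectModel {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = false

    let tag = NSEntityDescription()
    tag.name = "Tag"
    tag.managedObjectClassName = NSStringFromClass(NSManagedObject.self)
    tag.properties = [name]
    // Equivalent to filling in the Constraints field in the model editor.
    tag.uniquenessConstraints = [["name"]]

    let model = NSManagedObjectModel()
    model.entities = [tag]
    return model
}

func makeContainer() -> NSPersistentContainer {
    let container = NSPersistentContainer(name: "UniqueDemo", managedObjectModel: makeTagModel())
    // SQLite store backed by /dev/null, so the data lives only in memory.
    let description = NSPersistentStoreDescription(url: URL(fileURLWithPath: "/dev/null"))
    description.shouldAddStoreAsynchronously = false
    container.persistentStoreDescriptions = [description]
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Store failed to load: \(error)") }
    }
    return container
}

// Saves two tags with the same name and returns how many the store holds afterwards.
func tagCountAfterImportingDuplicate() throws -> Int {
    let container = makeContainer()
    let context = container.viewContext
    context.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy

    for _ in 0..<2 {
        let tag = NSEntityDescription.insertNewObject(forEntityName: "Tag", into: context)
        tag.setValue("swift", forKey: "name")
        try context.save()
    }
    return try context.count(for: NSFetchRequest<NSManagedObject>(entityName: "Tag"))
}
```

If the merge policy line is removed, the second save is the one that should throw, since the default policy reports the conflict instead of resolving it.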
I don't know the right answer, as this is a beta version, but after playing with it for a minute I found a way to make it work:
Tell the model which attributes form the unique constraint, exactly as shown in the image you have in your question.
Add a new record:
let newTag = NSEntityDescription.insertNewObjectForEntityForName("Tag", inManagedObjectContext: context) as! Tag
Assign the values to the attributes.
Save your changes:
do {
    try context.save()
} catch let error as NSError {
    print("Error: \(error.localizedDescription)")
    context.reset()
}
The key is in the catch block. If an error happens, reset the context to the previous state. As the save operation failed, the duplicate records won't be there.
Please notice that you should check the error to see if it was caused by a duplicated record.
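One possible way to perform that check is sketched below. It assumes that a constraint violation surfaces as an NSError in NSCocoaErrorDomain with the code NSManagedObjectConstraintMergeError, which is my understanding of how Core Data reports it; the two helper names are hypothetical.

```swift
import CoreData

// True when a failed save looks like a uniqueness-constraint violation.
// Assumption: Core Data reports these with NSManagedObjectConstraintMergeError.
func isUniqueConstraintViolation(_ error: Error) -> Bool {
    let nsError = error as NSError
    return nsError.domain == NSCocoaErrorDomain
        && nsError.code == NSManagedObjectConstraintMergeError
}

// Saves the context; on a duplicate it discards the pending changes,
// and any other error is rethrown to the caller.
func saveDiscardingDuplicates(_ context: NSManagedObjectContext) throws {
    do {
        try context.save()
    } catch let error where isUniqueConstraintViolation(error) {
        context.reset()
    }
}
```

Note that reset() throws away every unsaved change in the context, not only the duplicate, so this only suits cases where the context holds nothing else worth keeping.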
I hope this helps.

NSManagedObject attribute values missing when using more than one context simultaneously

I am using three NSManagedObjectContexts (moc) as A, B, C (parent, child1, child2 respectively) for my project.
A(parent) is in private queue, is used only for saving after either child moc has saved
B(child1) is in main queue, is used for updating UI part
C(child2) is in private queue, is used for saving/updating data to core data from server response
Now my problem is that when I try to populate a table with data fetched from Core Data using B, the attributes of the entities are missing. That is, all attribute values become nil.
What I think happens is: I am saving data using context C and fetching data using B. Is it the reason for missing attributes?
I just ran into a similar situation where fetching from a child context would return the correct number of objects but all the properties would be nil. The culprit in my case was that I had called
- (void)setPropertiesToFetch:(NSArray *)values
on my fetch request. Once I removed this line I got the properties populated. I was fetching NSManagedObjects and the documentation says:
This value is only used if resultType is set to NSDictionaryResultType.
So I would think it should've just been ignored, but in fact it breaks stuff. Oddly though if you leave that set properties call in and execute the fetch in the root context (the context that has no parent) then everything works normally. All this was in iOS 7.1
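For comparison, this is a sketch of the combination the quoted documentation describes, where propertiesToFetch is paired with a dictionary result type. The helper name is hypothetical, and the entity and property names are whatever your own model defines.

```swift
import CoreData

// Builds a request that returns dictionaries containing only the listed
// properties, rather than managed objects.
func makeDictionaryRequest(entityName: String, properties: [String]) -> NSFetchRequest<NSDictionary> {
    let request = NSFetchRequest<NSDictionary>(entityName: entityName)
    // The quoted documentation ties propertiesToFetch to this result type.
    request.resultType = .dictionaryResultType
    request.propertiesToFetch = properties
    return request
}
```

When managed objects are what you actually need, the simpler route taken above is to leave propertiesToFetch unset.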

What's the point of self.managedObjectContext == nil in NSManagedObject prepareForDeletion?

I have a Reminder entity that needs to update its date property whenever a certain entity B is deleted. I've spent some days coding thinking I could do some useful things in my managed object subclass on deletion time. I tried
- (void)willSave
{
    if (self.isDeleted) {
        // use self.managedObjectContext
    }
}
The context was nil. Relationships were also torn down there. Fair enough.
So... I started writing cumbersome code for prepareForDeletion to circumvent the fact that the object hadn't been deleted yet, but then Core Data throws self.managedObjectContext == nil in my face. The documentation says that this is where I do stuff "before relationships are torn down". So what is the point in self.managedObjectContext == nil if self.relationshipA.managedObjectContext is accessible (as the docs suggest)? And more importantly, why does my not yet deleted object not have its context?
I read a comment here regarding that problem
its not 'fault' as much as it is a 'disown', the context has disowned your object (he was deleted and save was committed to the database) and so your object was disowned. don't save in methods that are changing and object as the save should probably be committed/saved after the operation anyway. – Dan Shelly May 21 at 19:05
My code was:
[moc deleteObject:obj];
[moc save:NULL];
When I removed the save operation my self.managedObjectContext existed in prepareForDeletion. That is, until auto-save, when it was nil again. Probably because the parent context also deleted it, followed by a save by the UIManagedDocument.
I'm starting to think that my only options are to make a custom delete method (that works until Core Data cascades a deletion, in which case it won't be called), or make a new class that listens to NSManagedObjectContextDidSaveNotification.
Update:
The user wants to keep in touch with a person, and wants to be reminded after a certain interval (stored in ContactWish) if no contact has been made. What I'm trying to accomplish is that when the latest ContactOccasion for a certain person is deleted, the corresponding occasion->person->wish->reminder gets updated (using the interval).
Since this is a learning experience for me I wanted to find out the right way (one that works with cascade deletion etc.) and not just call for an update manually from every place in my code where I do [MOContext deleteObject:occasion]. Suggestions are welcome.
(the reminder entity has also been prepared for more manual use)
Would it not be much more logical to have the Reminder entity manage its date property? It could "listen" (maybe via changedValues:) to its relationship entities being deleted and perform the update.
This seems more consistent, as the B entity should not really be concerned with the logic of the Reminder entity updates.
Edit
Pursuant to the discussion below and based on my opinion that you cannot load up the database cascade delete model too much with update logic:
Rather than react to a deletion you can introduce an attribute that you set and listen to in order to do the changes.
I really do not see how relying on core data delete mechanisms is easier or more elegant than just writing your own "deleteOccasion" method that handles this logic.
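For the other option raised in the question, listening for context notifications, a rough Swift sketch might look like the following. It observes the objects-did-change notification rather than the did-save one, on the assumption that you want to react while the context is still usable, and it should also see objects removed by a cascade rule once pending changes are processed. The function name and the entity name passed in are hypothetical.

```swift
import CoreData

// Calls `handler` with the objects of entity `entityName` that a change to
// `context` deleted. Keep the returned token and pass it to
// NotificationCenter.default.removeObserver(_:) to stop observing.
func observeDeletions(of entityName: String,
                      in context: NSManagedObjectContext,
                      handler: @escaping (Set<NSManagedObject>) -> Void) -> NSObjectProtocol {
    return NotificationCenter.default.addObserver(
        forName: .NSManagedObjectContextObjectsDidChange,
        object: context,
        queue: nil
    ) { notification in
        // Deleted objects are delivered under NSDeletedObjectsKey.
        let deleted = notification.userInfo?[NSDeletedObjectsKey] as? Set<NSManagedObject> ?? []
        let matching = deleted.filter { $0.entity.name == entityName }
        if !matching.isEmpty {
            handler(matching)
        }
    }
}
```

The handler is where the reminder update would go. Whether this is cleaner than a dedicated deleteOccasion method is a judgment call, but it does keep the update logic out of every call site that deletes an occasion.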

Resources