I am implementing a custom NSIncrementalStore subclass which uses a relational database for persistent storage. One of the things that I still struggle with is the support for optimistic locking.
(feel free to skip this lengthy description right to my question below)
I analyzed how Core Data's SQLite incremental store approaches this problem by examining the SQL logs it produces, and came to the following conclusions:
Each entity table in the database has a Z_OPT column which indicates the number of times a particular instance of this entity (row) has been modified, starting from 1 (initial insertion).
Each time a managed object is modified, Z_OPT value in its corresponding database row is incremented.
The store maintains a cache (referred to as the row cache in the Core Data docs) of NSIncrementalStoreNode instances, each having a version property equal to the Z_OPT value returned by the previous SELECT or UPDATE SQL query on the managed object's row.
When a managed object is returned from an NSManagedObjectContext (e.g. by executing an NSFetchRequest on it), the MOC creates a snapshot of this object which contains this version number.
When the object is modified or deleted, Core Data makes sure that it has not been modified or deleted outside the context by comparing the versions of the cached row and the object snapshot. All of this happens when -save: is called on the context that the object belongs to. If the versions differ, a merge conflict is detected and handled according to the merge policy that is set.
When the MOC is being saved, the -newValuesForObjectWithID:withContext:error: method is called for each modified/deleted object, which in turn returns an NSIncrementalStoreNode with a version number. This version is then compared to the snapshot's version, and if they differ, the save fails with the appropriate merge conflicts (at least with the default merge policy).
This simple use case works properly with my store since -newValuesForObjectWithID:withContext:error: checks the row cache first which is enough if the object was concurrently modified in other context using the same store instance. If this is the case, then the cache contains updated row with higher version number which is enough to detect a conflict.
But how can I detect that the underlying database has been modified outside my store, possibly by another application or another store instance using the same database file? I know this is an infrequent edge case, but Core Data handles it properly and I would prefer to do the same.
Core Data's store uses SQL queries like these to update/delete object's row:
UPDATE ZFOO SET Z_OPT=Y, (...) WHERE (...) AND Z_OPT=X
DELETE FROM ZFOO WHERE (...) AND Z_OPT=X
where:
X - version number last known to the store (from cache)
Y - new version number
If such a query fails (no rows affected) the row is updated in store's cache and its version compared against the one previously cached.
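The version-checked UPDATE and DELETE above can be sketched without a database. This is only an illustration, assuming a hypothetical in-memory Table type whose version field plays the role of Z_OPT; none of these names come from Core Data.

```swift
// Hypothetical in-memory stand-in for one entity table.
// "version" plays the role of Z_OPT.
struct Row {
    var version: Int
    var values: [String: String]
}

struct Table {
    private(set) var rows: [Int: Row] = [:]

    // Initial insertion starts at version 1.
    mutating func insert(key: Int, values: [String: String]) {
        rows[key] = Row(version: 1, values: values)
    }

    // Mimics: UPDATE ... SET Z_OPT=Y, (...) WHERE (...) AND Z_OPT=X
    // Returns the number of affected rows; 0 means the version check failed.
    mutating func update(key: Int, expectedVersion: Int, values: [String: String]) -> Int {
        guard var row = rows[key], row.version == expectedVersion else { return 0 }
        row.version += 1
        row.values = values
        rows[key] = row
        return 1
    }

    // Mimics: DELETE FROM ... WHERE (...) AND Z_OPT=X
    mutating func delete(key: Int, expectedVersion: Int) -> Int {
        guard let row = rows[key], row.version == expectedVersion else { return 0 }
        rows[key] = nil
        return 1
    }
}

var table = Table()
table.insert(key: 1, values: ["name": "foo"])
// The first writer still holds version 1, so its update goes through.
let first = table.update(key: 1, expectedVersion: 1, values: ["name": "bar"])
// A second writer that also cached version 1 is now stale and affects no rows.
let stale = table.update(key: 1, expectedVersion: 1, values: ["name": "baz"])
```

An update that affects zero rows is the signal that the cached version is out of date and the row has to be re-read before deciding whether there is a conflict.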
My question is: how can a custom NSIncrementalStore inform Core Data that an optimistic locking failure has occurred for some updated/deleted/locked objects? Only the store is able to tell that, when it handles the NSSaveChangesRequest passed to its -executeRequest:withContext:error: method.
If the underlying database does not change under the store, then conflicts are detected, since Core Data calls -newValuesForObjectWithID:withContext:error: on each modified/deleted/locked object prior to executing the save changes request on the store. I was not able to find any way for NSIncrementalStore to inform Core Data that an optimistic locking failure has occurred after it started to handle the save request. Is there some undocumented way to do that? Core Data seems to throw some exception in that case, which is then magically translated into a failed save request with an NSError listing all the conflicts. I am only able to mimic that partly, by returning nil from -executeRequest:withContext:error: and creating the error myself. I think there must be a way to use the standard Core Data conflict handling mechanism in this scenario as well.
I realize that this is not an answer to your question, but I will try to give you my point of view on Core Data and how it correlates to databases:
(1st level cache)
NSPersistentStoreCoordinator + NSPersistentStore == A single connection to the database
(2nd level cache)
NSManagedObjectContext == cache over the connection holding changes
So, to my understanding your issue is that you have multiple connections to your store, each making changes, but you have no central version control over your records.
Your store will receive a -executeRequest:withContext:error: with NSSaveRequestType
You will then be responsible for verifying that the record versions match. If you find a conflict at the connection level (level 1), you report a version mismatch between the context (level 2) and the coordinator.
You also need to report a version mismatch between your connection (level 1) and your store.
To be able to do this your store must report changes on it across all connections to it (ConnectionManager), or it might offer hooks to changes performed on it.
I'm no SQLite expert, but the SQLite API does have something to offer in that area:
update hook
commit hook
changes
total changes
(I have no experience in setting these kinds of hooks, but if Core Data uses them it will not show in the debug logs)
You can report these errors by setting the error pointer (NSError**) and setting its internal data to match what the Core Data coordinator sets (create merge conflicts and fill in their information as needed).
Note that an optimistic locking failure will only occur during -executeRequest:withContext:error: (unless you have a rogue connection to the store, one that is not tracked by the manager). To support that case, your manager might need to verify each record as it is committed for a save [huge performance cost], or use some hooks into the changes recently made to records.
To handle multiple connections to your store you might need to have a shared cache of NSIncrementalStoreNode, keyed by the store url:
static @{
url1 : actualCacheMapping1,
url2 : actualCacheMapping2,
...
}
Each connection's save to the store will then be verified against the actual cache for that store URL.
Hope this makes some sense to you.
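A shared cache keyed by store URL could look roughly like the following. This is a minimal sketch: SharedRowCache, CachedRow and the string row keys are hypothetical names, and it only covers store instances living in the same process.

```swift
import Foundation

// Hypothetical cached row: a version number plus attribute values.
struct CachedRow {
    let version: UInt64
    let values: [String: String]
}

// One process-wide registry, so that every store instance opened on the
// same URL consults the same row cache.
final class SharedRowCache {
    static let shared = SharedRowCache()

    private var caches: [URL: [String: CachedRow]] = [:]
    private let lock = NSLock()

    func row(forKey key: String, storeURL: URL) -> CachedRow? {
        lock.lock(); defer { lock.unlock() }
        return caches[storeURL]?[key]
    }

    func setRow(_ row: CachedRow, forKey key: String, storeURL: URL) {
        lock.lock(); defer { lock.unlock() }
        caches[storeURL, default: [:]][key] = row
    }

    // True only when a row is cached and its version matches the one the
    // caller last saw; a missing row counts as "not current".
    func isCurrent(version: UInt64, forKey key: String, storeURL: URL) -> Bool {
        lock.lock(); defer { lock.unlock() }
        return caches[storeURL]?[key]?.version == version
    }
}

let url = URL(fileURLWithPath: "/tmp/example.sqlite")
SharedRowCache.shared.setRow(CachedRow(version: 2, values: ["name": "bar"]),
                             forKey: "Foo/1", storeURL: url)
let fresh = SharedRowCache.shared.isCurrent(version: 2, forKey: "Foo/1", storeURL: url)
let stale = SharedRowCache.shared.isCurrent(version: 1, forKey: "Foo/1", storeURL: url)
```

Changes made by another process would still bypass this cache, which is where the version check in the database itself or the SQLite hooks mentioned above would have to come in.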
My question is: how can a custom NSIncrementalStore inform Core Data that an optimistic locking failure has occurred for some updated/deleted/locked objects? Only the store is able to tell that, when it handles the NSSaveChangesRequest passed to its -executeRequest:withContext:error: method.
In an NSIncrementalStore, NSIncrementalStoreNodes represent the store snapshots. The version property of the node is the optimistic locking primitive. The persistent store is responsible for detecting optimistic locking failures at the store level, while the managed object context can detect them higher up. An optimistic locking failure at the store level might happen if the system the store is talking to was changed by something else, and there is a conflict between that system's state and the representation of that state in the persistent store. For example, the store might be communicating with a web service whose data was changed by another user.
If an optimistic locking failure is detected in your store implementation during a save, your store is responsible for creating NSMergeConflict objects describing it. These will be propagated up by the NSPersistentStoreCoordinator.
[[NSMergeConflict alloc] initWithSource:managedObject newVersion:newVersion oldVersion:oldVersion cachedSnapshot:inMemorySnapshot persistedSnapshot:storedSnapshot];
Snapshot dictionaries should include all modelled attribute property names as keys along with their values. This does not include relationships. For some stores, using the values from the reference objects or NSIncrementalStoreNodes may suffice as long as they only include the modelled attribute property name as keys (and those are easy to get from the entity description).
Once these objects have been created, create an NSError in the NSCocoaErrorDomain with the code NSPersistentStoreSaveConflictsError. The userInfo dictionary should contain the key NSPersistentStoreSaveConflictsErrorKey, whose value should be an array of the NSMergeConflict objects. Return that from the save request, and the NSPersistentStoreCoordinator will be responsible for finding a resolution. Remember, you should not generate merge conflicts for conflicts between the state of objects in the NSManagedObjectContext and your store, only for conflicts between whatever in-memory or cached state is in your store and wherever the data is kept or persisted (like a web service, or database, etc.)
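Put together in Swift, the steps above might look like the following. Treat it as a sketch: makeConflict and makeSaveConflictsError are hypothetical helper names, and mapping the cached version to oldVersion and the persisted version to newVersion is my assumption about how the two numbers are meant to be used.

```swift
import CoreData

// Hypothetical helper describing one store-level conflict.
// cachedValues: what the store's row cache held for the object.
// persistedValues: what the backing system (database, web service) holds now.
// Both dictionaries should be keyed by modelled attribute names only.
func makeConflict(for object: NSManagedObject,
                  cachedVersion: Int, cachedValues: [String: Any],
                  persistedVersion: Int, persistedValues: [String: Any]) -> NSMergeConflict {
    return NSMergeConflict(source: object,
                           newVersion: persistedVersion,   // assumption: "new" is the persisted state
                           oldVersion: cachedVersion,      // assumption: "old" is the cached state
                           cachedSnapshot: cachedValues,
                           persistedSnapshot: persistedValues)
}

// Hypothetical helper wrapping the conflicts in the error described above.
func makeSaveConflictsError(_ conflicts: [NSMergeConflict]) -> NSError {
    return NSError(domain: NSCocoaErrorDomain,
                   code: NSPersistentStoreSaveConflictsError,
                   userInfo: [NSPersistentStoreSaveConflictsErrorKey: conflicts])
}
```

In a Swift NSIncrementalStore subclass, the save branch of execute(_:with:) would then throw the result of makeSaveConflictsError instead of returning normally.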
I'm trying to efficiently batch delete a lot of NSManagedObjects (without using an NSBatchDeleteRequest). I have been following the general procedure in this answer (adapted to Swift), by batching an operation which requests objects, deletes, saves and then resets the context. My fetch request sets includesPropertyValues to false.
However, when this runs, at the point where each object is deleted from the context, the fault is fired. Adding logging as follows:
// Fetch one object without property values
let f = NSFetchRequest<NSManagedObject>(entityName: "Entity")
f.includesPropertyValues = false
f.fetchLimit = 1
// Get the result from the fetch. This will be a fault
let firstEntity = try! context.fetch(f).first!
// Delete the object, watch whether the object is a fault before and after
print("pre-delete object is fault: \(firstEntity.isFault)")
context.delete(firstEntity)
print("post-delete object is fault: \(firstEntity.isFault)")
yields the output:
pre-delete object is fault: true
post-delete object is fault: false
This occurs even when there are no overrides of any CoreData methods (willSave(), prepareForDeletion(), validateForUpdate(), etc). I can't figure out what else could be causing these faults to fire.
Update: I've created a simple example in a Swift playground. This has a single entity with a single attribute, and no relationships. The playground deletes the managed object on the main thread, from the viewContext of an NSPersistentContainer, and demonstrates that the object's isFault property changes from true to false.
I think an authoritative answer would require a look at the Core Data source code. Since that's not likely to be forthcoming, here are some reasons I can think of that this might be necessary.
For entities that have relationships, it's probably necessary to examine the relationship to handle delete rules and maintain data integrity. For example if the delete rule is "cascade", it's necessary to fire the fault to figure out what related instances should be deleted. If it's "nullify", fire the fault to figure out which related instances need to have their relationship value set to nil.
In addition to the above, entities with relationships need to have validation checks performed on related instances. For example if you delete an object with a relationship that uses the "nullify" delete rule, and the inverse relationship is not optional, you would fail the validation check on the inverse relationship. Checking this likely triggers firing the fault.
Binary attributes can have data automatically stored in external files (the "allows external storage" option). In order to clean up the external file, it's probably necessary to fire the fault, in order to know which file to delete.
I think all of these could probably be optimized away. For example, don't fire faults if the entity has no relationships and has no attributes that use external storage. However, this is looking from the outside without access to source code. There might be other reasons that require firing the fault. That seems likely. Or it could be that nobody has attempted this optimization, for whatever reason. That seems less likely but is possible.
BTW I forked your playground code to get a version that doesn't rely on an external data model file, but instead builds the model in code.
Tom Harrington has explained it best. Core Data's internal implementation apparently requires firing the fault when marking an object to be removed from the persistent store, just as it would if you were accessing a property of the object. As explained in this answer, "An NSManagedObject is always dynamically rendered. Hence, if it is deleted, Core Data faults out the data".
This seems to be the normal behaviour, at least for the time being, and not really an issue.
Based on some limited testing, I see that if I
Execute a Fetch request with result type = NSDictionaryResultType
Do some manipulations on the returned values
Store back the MOC on which Fetch request was executed
the changes in step 2 are not written back to the persistent store because I am changing a dictionary and not a "managed object". Is that a correct understanding?
Most likely you are abusing the dictionary result type. Unlike in conventional database programming, you are not wasting valuable memory resources when fetching entire objects rather than just selected attributes, due to an under-the-hood mechanism called "faulting".
Try fetching with managed object result type (default) and you can very easily manipulate your objects and save them back to Core Data. You would not need to do an additional fetch just to get the object you want to change.
Consider dictionaries only in special situations with huge data volumes, difficult relational grouping logic, etc., which make it absolutely necessary.
(That being said, it is unlikely that it is ever absolutely necessary. I have yet to encounter a case where the necessity of dictionaries for fetches was not an indirect result of flawed data model design.)
Yes, kind of, you can't store a dictionary back into the context directly so you can't save any updates that way.
If you get a dictionary object then you need to include in it the associated managed object id (if it isn't aggregated) or do another fetch to get the object(s) to update.
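One way to carry the managed object ID inside each dictionary is an expression description that evaluates to the object itself. The following is a sketch only, assuming a hypothetical entity named "Item" with a string attribute "name"; dictionary results are typically used with the SQLite store.

```swift
import CoreData

// Fetches dictionaries that include each row's objectID, then applies the
// change through the managed object rather than through the dictionary.
// "Item" and "name" are hypothetical model names.
func uppercaseItemNames(in context: NSManagedObjectContext) throws {
    let idColumn = NSExpressionDescription()
    idColumn.name = "objectID"
    idColumn.expression = NSExpression.expressionForEvaluatedObject()
    idColumn.expressionResultType = .objectIDAttributeType

    let request = NSFetchRequest<NSDictionary>(entityName: "Item")
    request.resultType = .dictionaryResultType
    request.propertiesToFetch = [idColumn, "name"]

    for row in try context.fetch(request) {
        guard let objectID = row["objectID"] as? NSManagedObjectID,
              let name = row["name"] as? String else { continue }
        // Editing `row` would change nothing in the store; the managed
        // object is what the context tracks and saves.
        let object = try context.existingObject(with: objectID)
        object.setValue(name.uppercased(), forKey: "name")
    }
    try context.save()
}
```

Only the changes made to the managed objects are written when the context is saved; the dictionaries are plain copies of the fetched values.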
I have a Core Data layer with several thousand entities, constantly syncing to a server. The sync process uses fetch requests to check for deleted_at for the purposes of soft-deletion. There is a single context performing save operations in a performBlockAndWait call. The relationship mapping is handled by the RestKit library.
The CoreDataEntity class is a subclass of NSManagedObject, and it is also the superclass for all our different core data object classes. It has some attributes that are inherited by all our entities, such as deleted_at, entity_id, and all the boilerplate fetch and sync methods.
My issue is some fetch requests seem to return inconsistent results after modifications to the objects. For example after deleting an object (setting deleted_at to the current date):
[CoreDataEntity fetchEntitiesWithPredicate:[NSPredicate predicateWithFormat:@"deleted_at==nil"]];
Returns results with deleted_at == [NSDate today]
I have successfully worked around this behavior by additionally looping through the results and removing the entities with deleted_at set, however I cannot fix the converse issue:
[CoreDataEntity fetchEntitiesWithPredicate:[NSPredicate predicateWithFormat:@"deleted_at!=nil"]];
Is returning an empty array in the same conditions, preventing a server sync from succeeding.
I have confirmed deleted_at is set on the object, and the context save was successful. I just don't understand where to reset whatever cache is causing the outdated results?
Thanks for any help!
Edit: Adding a little more information: it appears that once one of these objects becomes corrupted, the only way to get it to register is to modify the value again. Could this be some sort of Core Data index not updating when a value is modified?
Update: It appears to be a problem with RestKit https://github.com/RestKit/RestKit/issues/2218
You are apparently using some syntactic sugar extension to Core Data. I suppose that in your case it is SheepData, right?
fetchEntitiesWithPredicate: is implemented there as follows:
+ (NSArray*)fetchEntitiesWithPredicate:(NSPredicate*)aPredicate
{
return [self fetchEntitiesWithPredicate:aPredicate inContext:[SheepDataManager sharedInstance].managedObjectContext];
}
Are you sure that [SheepDataManager sharedInstance].managedObjectContext receives all the changes that you are making to your objects? Does it receive notifications of saves, or is it a child context of your saving context?
Try to replace your fetch one-liner with this:
[<your saving context> performBlockAndWait:^{
NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"CoreDataEntity"];
request.predicate = [NSPredicate predicateWithFormat:@"deleted_at==nil"];
NSArray *results = [<your saving context> executeFetchRequest:request error:NULL];
}];
First, after a save, have you looked in the store to make sure your changes are there? Without seeing your entire Core Data stack it is difficult to get a solid understanding of what might be going wrong. If you are saving and you see the changes in the store, then the question turns to your contexts: how are they built, and when? If you are dealing with sibling contexts, that could be causing your issue.
More detail is required as to how your core data stack looks.
Yes, the changes are there. As I mentioned in the question, I can loop through my results and remove all those with deleted_at set successfully
That wasn't my question. There is a difference between looking at objects in memory and looking at them in the SQLite file on disk. The questions I have about this behavior are:
Are the changes being persisted to disk before you query for them again?
Are you working with multiple contexts and potentially trying to fetch from a stale sibling?
Thus my questions about on disk changes and what your core data stack looks like.
Threading
If you are using one context, are you using more than one thread in your app? If so, are you using that context on more than one thread?
I can see a situation where if you are violating the thread confinement rules you can be corrupting data like this.
Try adding an extra attribute deleted that is a bool with a default of false. Then the attribute is always set and you can look for entities that are either true or false depending on your needs at the moment. If the value is true then you can look at deleted_at to find out when.
Alternatively try setting the deleted_at attribute to some old date (like perhaps 1 Jan 1980), then anything that isn't deleted will have a fixed date that is too old to have been set by the user.
Edit: There is likely some issue with deleted_at having never been touched on some entities that is confusing the system. It is also possible that you have set the fetch request to return results in the dictionary style in which case recent changes will not be reflected in the fetch results.
I am new to Core Data and I created 2 tables, Night and Session. I managed to create a new Night object and a new Session object. When I try this code:
Session * session = [NSEntityDescription insertNewObjectForEntityForName:@"Session" inManagedObjectContext:[[DataManager sharedManager] managedObjectContext]];
Night * night = [NSEntityDescription insertNewObjectForEntityForName:@"Night" inManagedObjectContext:[[DataManager sharedManager] managedObjectContext]];
night.sessions = [NSSet setWithObject:session];
the session gets attached to the night, and the cool thing is that when I fetch this night I can get its sessions using:
currentNight.sessions
But I can't see this link in the DB tables :(
UPDATE:
I mean that when I write night.sessions = [NSSet setWithObject:session]; I need to see it in the DB table (yes, in the DB.sqlite file).
I thought that I should see something there ...
Core Data is not a relational database. It builds a structure of its own: it defines the database table structure according to your managed objects. For debugging, you can see which queries Core Data is firing on SQLite. This will show you how Core Data gets data from these two tables.
You have to go Product -> Edit Scheme -> Then from the left panel select Run yourApp.app and go to the main panel's Arguments Tab.
There you can add an Argument Passed On Launch.
You should add -com.apple.CoreData.SQLDebug 1
Press OK and you are all set.
The next time you run, it will show all the queries it is running to fetch data from your tables.
It's not clear to me what your question is. But:
A context is a scratchpad. Its contents will not be moved to the persistent store until you -save:. If you drop into the filing system and inspect your persistent store outside of your app without having saved, your changes will not be recorded there.
For all of the stores the on-disk format is undefined and implementation dependent. So inspecting them outside of Core Data is not intended to show any specific result.
Anecdotally, if you're using a SQLite store then you should look for a column called Z_SESSIONS or something similar. It'll be a multivalued column. Within it will be the row IDs of all linked sessions. Core Data stores relationships with appropriately named columns and direct row IDs, which are something SQLite supplies implicitly. It does not use an explicit foreign/primary key relationship.
To emphasise the point: that's an implementation specific of Core Data. It's not defined to be any more reliable than exactly what ARM assembly LLVM will spit out for a particular code structure. It's as helpful to have a sense of it as it is to know how the CPU tends to cache, to branch predict, etc., but you shouldn't expect to be able to take the SQLite file and use it elsewhere, or in any way interact with it other than via Core Data.
I'm porting some iOS persistence functionality to Android and trying to understand save(), in order to replicate the functionality in Android (pure SQLite).
Documentation says:
save:
Attempts to commit unsaved changes to registered objects to the receiver’s parent store.
Doesn't help a lot.
I know that iOS uses SQLite so this has to translate to SQLite somehow.
Looks like save is an upsert - will insert the data if not there yet, and otherwise update.
If this is true (also if not, if the question is still valid) - how is it determined which row to update? I don't see how to add unique in Xcode, so if I have e.g:
id | name | price
1 | apple | 2.0
2 | lemon | 1.0
with "id" being the internal row id,
and I get new model data "lemon" -> 3.0, when I update the moc, how does the database know that it has to update this row?:
2 | lemon | 1.0
In SQLite I would add a unique constraint on the name, but I don't know how it's implemented in iOS.
I'm not an iOS dev, sorry for possibly super -ignorant or -strange question.
Thanks.
It is really difficult to discuss Core Data in terms of databases because it is not a database. It uses one to persist data but that is just about it.
Looks like save is an upsert - will insert the data if not there yet, and otherwise update.
An NSManagedObjectContext is the current state of not just one object (or row in database terms) but multiple. So when you ask the NSManagedObjectContext to 'save' it is saving the state of all the objects in the context. If an object is new, it will be the equivalent of an insert. If the object already exists, it will be the equivalent of an update. However, if at some point an object is deleted, the 'save' method will also remove the object from the SQLite database. The 'save' method specifically saves the state of the NSManagedObjectContext.
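For the port, that description of 'save' can be pictured as flushing a set of pending changes keyed by row identity. The sketch below is not Core Data's implementation; ChangeSet, RowOperation and the "pear" row are hypothetical, and the point is only that each change already knows which row it belongs to, so nothing has to be matched by name.

```swift
// Hypothetical record of what a context has accumulated since the last
// save, keyed by row id.
struct ChangeSet {
    var inserted: [Int: [String: String]] = [:]
    var updated: [Int: [String: String]] = [:]
    var deleted: Set<Int> = []
}

// The database operation each pending change turns into.
enum RowOperation: Equatable {
    case insert(id: Int, values: [String: String])
    case update(id: Int, values: [String: String])
    case delete(id: Int)
}

// One "save": every pending change becomes exactly one operation on the
// row it was recorded against.
func operations(for changes: ChangeSet) -> [RowOperation] {
    let inserts = changes.inserted.sorted { $0.key < $1.key }
        .map { RowOperation.insert(id: $0.key, values: $0.value) }
    let updates = changes.updated.sorted { $0.key < $1.key }
        .map { RowOperation.update(id: $0.key, values: $0.value) }
    let deletes = changes.deleted.sorted().map { RowOperation.delete(id: $0) }
    return inserts + updates + deletes
}

var changes = ChangeSet()
changes.updated[2] = ["price": "3.0"]                    // "lemon" is row 2
changes.inserted[3] = ["name": "pear", "price": "1.5"]   // made-up new row
changes.deleted.insert(1)                                // "apple" is row 1
let ops = operations(for: changes)
```

On Android, each operation would map to an INSERT, an UPDATE ... WHERE id = ? or a DELETE ... WHERE id = ? executed inside one transaction.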
If this is true (also if not, if the question is still valid) - how is
determined which row to update? I don't see how to add unique in xcode
That is because Core Data handles the unique identity of objects. There is no default 'id' column to place a unique identifier. However, you can create an attribute (i.e. column/field) to hold a unique identifier if the database will be persisted across many devices, which I personally had to do at one time since the 'objectID' is not practical to use. In Android, you will have to maintain the unique identity of each row yourself unless you opt to use auto incrementation.
when I update the moc, how does the database know that it has to
update this row?
At one point or another, you ask the NSManagedObjectContext to insert a new instance of an "Entity" (i.e. a row in its table):
NSManagedObject *managedObject = [NSEntityDescription insertNewObjectForEntityForName:@"EntityName" inManagedObjectContext:managedObjectContext];
To update an entity, you could retrieve it by using:
NSManagedObject *existingObject = [managedObjectContext objectWithID:managedObject.objectID];
Make any adjustments and then 'save' the NSManagedObjectContext. The objectID is the object's unique identifier, which was automatically assigned when it was inserted. Core Data handles the boilerplate code of inserting and updating rows, so you end up with an abstract version as seen in the examples. If you save a few NSManagedObjects and open the SQLite file, you will find that it is very similar to any other database, other than a few Core Data specific fields that it uses for management.
I would suggest creating a new Master Detail Application project, run it in the simulator, save a couple entries, and open the SQLite file. You can find it in
/Users/<username>/Library/Application Support/iPhone Simulator/<iOS Version>/Applications/<Application UDID>/Documents/
Opening the SQLite file will show you that the database Core Data maintains is very similar to any other SQLite database and may help out with understanding the processes.
I don't know the following to be true, but I think I'm not far off.
An NSManagedObjectContext has a reference to objects (NSManagedObject) that are composed using the data from the SQLite database. These objects all have the objectID property, which is a unique identifier to the row in the SQLite database allowing you to uniquely, even between contexts, identify an object/row. When you change an object's property, this doesn't actually change anything in the database. The context knows about the changes, and when you call save:, it will go to the database and update all the records.
This is always an UPDATE, as you have to call +[NSEntityDescription insertNewObjectForEntityForName:inManagedObjectContext:] to get a reference to an object. At that point, a record is already inserted and it is given an objectID.
NSManagedObjectContext is kind of a representation of the data model. It is from the framework called CoreData. By using CoreData, we do not manipulate the SQLite database directly. Which means we do not write any SQL queries, we just do all the update, insert or delete on NSManagedObjectContext. And when we call save(), NSManagedObjectContext will tell the database which row was updated, which row was deleted or which row was inserted. And here is another question which might help you to understand more about NSManagedObjectContext.