So I am a bit confused about the number of copies Core Data keeps for every managed object. First, it stores a copy in the row cache, which it uses to fulfill faults. Then for every object it also keeps a snapshot of the unmodified object, as well as the actual data for the object (assuming it's not a fault). That's three copies of one object, so I assume I am misunderstanding something. It would also mean migrations need four times the size of the objects in the original database, since a migration has to create a new destination stack as well. I assume Core Data is smart and may do things like copy-on-write under the hood, not creating a snapshot unless the object is actually modified.
Could someone please explain what is wrong with my thought process? Also, is it true that the row cache is shared by different managed object contexts created from the same persistent store coordinator, or is there a row cache for each context?
I want to know when to use the properties below. What do they do? Why should we use them?
Transient: According to the Apple docs:

Transient attributes are properties that you define as part of the model, but which are not saved to the persistent store as part of an entity instance's data. Core Data does track changes you make to transient properties, so they are recorded for undo operations. You use transient properties for a variety of purposes, including keeping calculated values and derived values.
I do not understand the part about it not being saved to the persistent store as part of an entity instance's data. Can anyone explain this?
Indexed: It increases search speed, but at the cost of more space. So basically, if you run search queries against an attribute and want faster results, mark that attribute as indexed. If searches on that attribute are very rare, indexing only costs you space (and slows down writes, since the index has to be maintained).
Is that understanding correct?
Index in Spotlight
Store in External Record File
Consider, for instance, that you have a navigation app. On your map you have your car at the center, updated a few dozen times a second, and an entity of the type "GAS STATION". The entity's 'distance' property from your car would be a transient property, as it is a function of real-time data, so there's no point in storing it.
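To make that concrete, here is a minimal sketch of such a subclass. The GasStation entity, its attribute names, and the update method are all illustrative; 'distance' is assumed to be checked as Transient in the model editor, so Core Data tracks changes to it (e.g. for undo) but never writes it to the store.

```objc
#import <CoreData/CoreData.h>
#import <CoreLocation/CoreLocation.h>

@interface GasStation : NSManagedObject
@property (nonatomic) double latitude;   // persisted
@property (nonatomic) double longitude;  // persisted
@property (nonatomic) double distance;   // transient: recomputed from live data
@end

@implementation GasStation
@dynamic latitude, longitude, distance;

// Call whenever the car's location changes.
- (void)updateDistanceFromLocation:(CLLocation *)carLocation {
    CLLocation *station = [[CLLocation alloc] initWithLatitude:self.latitude
                                                     longitude:self.longitude];
    self.distance = [carLocation distanceFromLocation:station]; // meters
}
@end
```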
An indexed attribute is stored sorted, therefore it can be searched faster. Explanation can be found on Wikipedia. If your frequent searches take noticeable time, you should probably consider indexing.
Consider indexing in Spotlight anything the user is likely to want to search for when not within your application. Documentation is here.
Large binary objects, like images, should be stored externally.
Question:
When you call save on NSManagedObjectContext while there are deletedObjects to be dealt with, it does a good deal of work tracking down relationships to determine which related objects also need to be deleted. This is done on every save (even in the case of parent/child contexts). If I know I'll be saving straight through those parent contexts, I would like to skip this check for every context after the first, but I can't find a mechanism to do so.
Background:
Here is the situation, I have 3 contexts:
C (Child context of B): A background thread context for doing work
B (Child context of A): A Main thread context, for my Fetch Result Controllers, UI access
A (Child of Persistent Store): A background thread context for saving to disk
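For reference, the stack above can be sketched as follows (the variable names are illustrative, and the coordinator is assumed to be configured elsewhere):

```objc
#import <CoreData/CoreData.h>

// A: private-queue context attached to the store, used for saving to disk.
NSManagedObjectContext *contextA =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
contextA.persistentStoreCoordinator = coordinator;

// B: main-queue context for fetched results controllers / UI access.
NSManagedObjectContext *contextB =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
contextB.parentContext = contextA;

// C: private-queue context for background work; saves push changes up to B.
NSManagedObjectContext *contextC =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
contextC.parentContext = contextB;
```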
My object graph is "Rich" and some of the relationships are against tables that have 100,000+ items. Because of this, deletions are understandably slow (a cascade delete removes a number of objects with each root object deletion)... However, when analyzing the performance I realized it's actually A LOT slower than it needs to be: during a save, the majority of the CPU time is spent resolving the relationships in the graph (to figure out which related objects also need to be deleted)... And it repeats this process on EVERY save, even though I'm saving straight through.
Example save process:
Context C has 1 deleted object
Save Context C -> B (around 1 second of work is being done tracking down the relationships)
Context C now saved (0 deleted objects), Context B has the original deleted object, and 30 new related objects that also needed to be deleted (31 Objects)
Save Context B -> A (Around 1 second of work rechecking those relationships, this is the work I'd like to skip)
Context B Saved (0 deleted objects), Context A has the same 31 objects that B had to delete
Save Context A -> Persistent Store (Light work here, seems that this step doesn't bother with a recheck)
I tried subclassing and looking into -[NSManagedObject validateForDeletion] and -[NSManagedObjectContext processPendingChanges] with no luck. Any thoughts?
It should be possible to selectively enable/disable a delete rule if you use multiple Core Data stacks and if you don't mind getting into some moderately complex code. You'll end up with multiple managed object contexts where some have the rule and some don't. The deal is:
An NSManagedObjectModel is mutable when you first load it, so you can modify the model in code as long as you don't make changes that would make it incompatible with your persistent store (which would require migration).
Model compatibility is determined only by details that affect how data is saved to the persistent store. Delete rules are not included, so they can be modified without requiring migration. (See the documentation for [NSRelationshipDescription versionHash] for what does affect compatibility.)
Instances of NSManagedObjectModel are only mutable until you use them with a persistent store, so you have to make your changes before loading any data. This means that if you want the delete rule in some cases but not in others, you need different instances of NSManagedObjectModel.
So, if you create entirely independent Core Data stacks (separate instances of NSManagedObjectModel, NSPersistentStoreCoordinator, etc, everything) you could configure one stack to not have these delete rules but leave them in place on others. Since delete rules don't affect the model hash, they'd still be compatible.
You cannot do this with parent/child managed object contexts, since they by definition are part of the same Core Data stack.
To remove or change the delete rule, you'll have to start with your instance of NSManagedObjectModel. Get the entities in the model, then get relationshipsByName for the entity you're interested in. Finally, on the NSRelationshipDescription, change deleteRule to whatever you want (probably NullifyDeleteRule in this case).
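The steps above might look something like this (the model URL, entity name "Parent", and relationship name "children" are placeholders for your own model):

```objc
#import <CoreData/CoreData.h>

// Load the model; it is mutable until it is bound to a persistent store.
NSManagedObjectModel *model =
    [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL];

// Find the relationship whose delete rule should change.
NSEntityDescription *entity = model.entitiesByName[@"Parent"];
NSRelationshipDescription *relationship = entity.relationshipsByName[@"children"];

// Swap the cascade rule for nullify; this does not affect the version hash,
// so the modified model stays compatible with the existing store.
relationship.deleteRule = NSNullifyDeleteRule;

// Only now attach the model to a coordinator -- it becomes immutable here.
NSPersistentStoreCoordinator *coordinator =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
```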
This is going to be kind of complicated but it should work.
As an aside, I'd be interested in knowing more detail about what your performance analysis showed.
I am building an app with Objective-C and I would like to persist data. I am hesitating between NSKeyedArchiver and Core Data. I am aware there are plenty of resources about this on the web (including Objective-C best choice for saving data) but I am still doubtful about which one I should use. Here are the two things that make me wonder:
(1) I am assuming I will have around 1,000-10,000 objects to handle, for a data volume of 1-10 MB. I will run standard database queries on these objects. I would like to load all of them at launch and save them from time to time; a 1-second processing time for loading or saving would be fine by me.
(2) For the moment my model is rather intricate: for instance, classA contains, among other properties, an array of classB, which is itself formed by (among others) a property of type classC and a property of type classD. And classD itself contains properties of type classE.
Am I right to assume that (1) means NSKeyedArchiver will still work fine, and that (2) means using Core Data may not be very simple? I have tried to look for cases where Core Data was used with a complex object graph structure like my case (2), but haven't found many resources. For the moment, this is what deters me the most from using it.
The two things you identify both make me lean towards using CoreData rather than NSKeyedArchiver:
CoreData is well able to cope with 10,000 objects (if not considerably more), and it supports relatively straightforward "database-like" queries of the data (sorting with NSSortDescriptor, filtering with NSPredicate). There are limitations on what can be achieved, but in the worst case you can load all the data into memory, which is what you would have to do with the NSKeyedArchiver solution anyway.
Loading in sub-second times should be achievable (I've just tested with 10,000 objects, totalling 14Mb, in 0.17 secs in the simulator), particularly if you optimise to load only essential data initially, and let CoreData's faulting process bring in the additional data when necessary. Again, this will be better than NSKeyedArchiver.
Although most demos/tutorials opt for relatively straightforward data models (enough to demonstrate attributes and relationships), CoreData can cope with much more sophisticated data models. Below is a mock-up of the relationships that you describe, which took a few minutes to put together:
If you generate subclasses for all those entities, then traversing those relationships is simple (both forwards and backwards - inverse relationships are managed automatically for you). Again, there are limitations (CoreData does the SQL work for you, but in so doing it is less flexible than using a relational database directly).
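A typical "database-like" query of the kind mentioned above might look like this (the entity name "Item" and its attributes are assumed for illustration; `context` is your NSManagedObjectContext):

```objc
#import <CoreData/CoreData.h>

NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Item"];

// Filter and sort, roughly equivalent to WHERE ... ORDER BY ... in SQL.
request.predicate = [NSPredicate predicateWithFormat:@"price < %@", @10];
request.sortDescriptors =
    @[[NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES]];

// Let Core Data's faulting fetch rows in batches instead of all at once.
request.fetchBatchSize = 50;

NSError *error = nil;
NSArray *results = [context executeFetchRequest:request error:&error];
```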
Hope that helps.
What is the best strategy to synchronize Parse objects across the application?
Take Twitter as an example: they have many Tweet objects, and the same Tweet object can be shown in multiple places, say viewController1 and viewController2, so it is not efficient for both of them to hold deep copies of the same Parse object.
When I increase the likeCount of Tweet_168 in viewController2, how should I update the likeCount of Tweet_168 in viewController1?
I created a singleton container class (TweetContainer), so every Parse request goes through it, and it checks whether each incoming objectId is already in the container:
A) If it is, it updates the previous object's fields and discards the new object (to keep a single deep copy of each Parse object).
B) If it is not, it adds the new object.
(This process is fast as I'm using hashmaps.)
This container holds deep copies of those objects and gives shallow copies to viewControllers, so editing a Tweet in one viewController will result in its update on all viewControllers!
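A minimal sketch of such a container, assuming the Parse iOS SDK's PFObject (the class and method names here are illustrative, not from the original code):

```objc
#import <Foundation/Foundation.h>
#import <Parse/Parse.h>

@interface TweetContainer : NSObject
+ (instancetype)sharedContainer;
- (PFObject *)canonicalTweetFor:(PFObject *)incoming;
@end

@implementation TweetContainer {
    NSMutableDictionary<NSString *, PFObject *> *_tweetsById; // objectId -> tweet
}

+ (instancetype)sharedContainer {
    static TweetContainer *shared;
    static dispatch_once_t once;
    dispatch_once(&once, ^{ shared = [TweetContainer new]; });
    return shared;
}

- (instancetype)init {
    if ((self = [super init])) {
        _tweetsById = [NSMutableDictionary dictionary];
    }
    return self;
}

// Case A: known objectId -> refresh the stored copy's fields, discard the new object.
// Case B: unknown objectId -> keep the incoming object as the canonical copy.
- (PFObject *)canonicalTweetFor:(PFObject *)incoming {
    PFObject *existing = _tweetsById[incoming.objectId];
    if (existing) {
        for (NSString *key in incoming.allKeys) {
            existing[key] = incoming[key];
        }
        return existing;
    }
    _tweetsById[incoming.objectId] = incoming;
    return incoming;
}
@end
```

Every view controller then holds the reference returned by canonicalTweetFor:, so an edit made anywhere is visible everywhere.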
Taking one step further, let's say Tweet objects have pointers to Author objects. When an Author object is updated, I want all of them to be updated (say image change). I can create a new AuthorContainer with the same strategy and give shallow copies to Tweet objects in TweetContainer.
I could, in an ideal world, propagate every update to the cloud and refresh every object from the cloud before showing it to the user, but that's not feasible, either bandwidth- or latency-wise.
I read the doc, of course, but I don't quite get the meaning of "setting up any sections and ordering the contents".
Doesn't this kind of information come from the database?
Does it mean NSFetchedResultsController needs some other kind of index of its own, besides the database indices?
What's really going on when NSFetchedResultsController is setting up a cache?
Is cache useful for static data only? If my data update frequently, should I use cache or not?
How can I profile the performance of the cache? I tried the cache, but couldn't see any performance improvement. I timed -performFetch: and saw the time increase from 0.018 s (without cache) to 0.023 s (with cache). I also timed -objectAtIndexPath: and saw only a decrease from 0.000030 s (without cache) to 0.000029 s (with cache).
In other words, I want to know when the cache does (or does not) improve performance, and why.
As @Marcus pointed out below, "500 Entries is tiny. Core Data can handle that without a human noticeable lag. Caching is used when you have tens of thousands of records." So I think there are few apps that would benefit from using the cache.
The cache for the NSFetchedResultsController is a kind of short cut. It is a cache of the last results from the NSFetchRequest. It is not the entire data but enough data for the NSFetchedResultsController to display its results quickly; very quickly.
It is a "copy" of the data from the database that is serialized to disk in a format that is easily consumed by the NSFetchedResultsController on its next instantiation.
To look at it another way, it is the last results flash frozen to disk.
From the documentation of NSFetchedResultsController:
Where possible, a controller uses a cache to avoid the need to repeat work performed in setting up any sections and ordering the contents
To take advantage of the cache you should use sectioning or ordering of your data.
So if in initWithFetchRequest:managedObjectContext:sectionNameKeyPath:cacheName: you set the sectionNameKeyPath to nil you probably won't notice any performance gain.
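Putting that together, a cache-friendly setup might look like this (the entity "Entry", its attributes, and the cache name are assumptions for illustration; note the first sort descriptor must match the sectionNameKeyPath):

```objc
#import <CoreData/CoreData.h>

NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Entry"];
request.sortDescriptors =
    @[[NSSortDescriptor sortDescriptorWithKey:@"sectionName" ascending:YES],
      [NSSortDescriptor sortDescriptorWithKey:@"date" ascending:NO]];

// A non-nil sectionNameKeyPath plus a cache name is what lets the controller
// reuse previously computed section/ordering information across launches.
NSFetchedResultsController *frc =
    [[NSFetchedResultsController alloc] initWithFetchRequest:request
                                        managedObjectContext:context
                                          sectionNameKeyPath:@"sectionName"
                                                   cacheName:@"EntryCache"];

// If the fetch request or model ever changes, purge the stale cache first:
// [NSFetchedResultsController deleteCacheWithName:@"EntryCache"];
```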
From the documentation
The Cache

Where possible, a controller uses a cache to avoid the need to repeat work performed in setting up any sections and ordering the contents. The cache is maintained across launches of your application.

When you initialize an instance of NSFetchedResultsController, you typically specify a cache name. (If you do not specify a cache name, the controller does not cache data.) When you create a controller, it looks for an existing cache with the given name:

If the controller can’t find an appropriate cache, it calculates the required sections and the order of objects within sections. It then writes this information to disk.

If it finds a cache with the same name, the controller tests the cache to determine whether its contents are still valid. The controller compares the current entity name, entity version hash, sort descriptors, and section key-path with those stored in the cache, as well as the modification date of the cached information file and the persistent store file.

If the cache is consistent with the current information, the controller reuses the previously-computed information.

If the cache is not consistent with the current information, then the required information is recomputed, and the cache updated.

Any time the section and ordering information change, the cache is updated.

If you have multiple fetched results controllers with different configurations (different sort descriptors and so on), you must give each a different cache name.

You can purge a cache using deleteCache(withName:).