Core Data constraint with "includesPropertyValues" - ios

I would like to make a more efficient fetch on Core Data entities and I have a query.
I want to delete a large amount of records (millions).
My logic is:
fetch all records for the entity
delete all fetched records.
To improve fetching,
I set the following constraint:
fetch.includesPropertyValues = NO;
My question is: will the relationships (which are kept as properties in the managed objects) also be deleted?

Yes, if you delete a managed object, the relationship delete rules apply regardless of this flag.
With so many records, you might also want to process the instances in batches. Use setFetchLimit: to get a subset of the instances, delete those, save changes, and repeat until no more instances are found.
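The batch loop described above could be sketched roughly as follows. The function name DeleteAllObjectsInBatches and its parameters are made up for illustration, and the sketch assumes it is called on the context's own queue (for example inside performBlock:).

```objectivec
#import <CoreData/CoreData.h>

// Hypothetical helper: deletes every object of `entityName`, `batchSize` at a time.
// Returns YES on success; on failure returns NO and sets *outError if provided.
static BOOL DeleteAllObjectsInBatches(NSManagedObjectContext *context,
                                      NSString *entityName,
                                      NSUInteger batchSize,
                                      NSError **outError) {
    NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:entityName];
    fetch.includesPropertyValues = NO; // property values are not needed just to delete
    fetch.fetchLimit = batchSize;      // fetch only one batch per round

    BOOL success = YES;
    BOOL done = NO;
    NSError *failure = nil; // kept outside the pool so it survives the drain
    while (!done) {
        @autoreleasepool {
            NSError *stepError = nil;
            NSArray *batch = [context executeFetchRequest:fetch error:&stepError];
            if (batch == nil) {
                // The fetch itself failed.
                success = NO; failure = stepError; done = YES;
            } else if (batch.count == 0) {
                // Nothing left to delete.
                done = YES;
            } else {
                for (NSManagedObject *object in batch) {
                    [context deleteObject:object]; // delete rules on relationships still apply
                }
                if (![context save:&stepError]) {
                    success = NO; failure = stepError; done = YES;
                }
            }
        }
    }
    if (!success && outError != NULL) {
        *outError = failure;
    }
    return success;
}
```

Saving after each batch keeps the number of pending changes in the context bounded. The batch size is something to tune against your own data rather than a fixed recommendation.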

Related

CoreData Selecting All Rows in Association One-by-one When Persisting to Disk

I'm using a parent-child MOC architecture as described by Marcus Zarra in his blog post and talk.
It's generally working great, but I have an ordered one-to-many relationship where the "many" accumulates a lot of records over time. The issue is, in the process of saving the private context to disk, CoreData runs a select query for what appears to be every single object in the association, one at a time, even if it hasn't been touched. As you can imagine, this is incredibly slow.
Any ideas on how to eliminate this or at least make it batch it into one query?
Ordered relationships are problematic for a number of reasons, but this is out of scope for this question.
One obvious solution attempt is to replicate the ordered property yourself by introducing your own attribute to keep track of the order. This is the path I have taken in all my projects where I had this use case. Your own ordering logic gives you much more granular control over the expensive processes, such as inserting an element into the series (rather than just appending it at the end).
Please note that in many applications the ordered property can be modeled differently, e.g. with a time stamp, which in some cases can spare you the necessity of modifying the whole chain.
As for the problem of using "one query": you could fetch all objects that need to be reordered, modify their order (e.g. by adding them one by one to the parent object), and save.

Core data force to-many faults to all be fired?

I have a collection of CoreData objects that have a to-many relationship to another object type.
On some of these objects I need to search through the related objects to find a particular one. So I loop through them looking for a match, which works fine. But on closer inspection I can see CoreData firing each fault as it gets to each item in the loop. Obviously this is no good: hundreds of faults fired individually.
Can I trigger CoreData to fire all of the faults in that relationship as a group?
I don't want to just prefetch the relationship in the first place because I am dealing with a very large number of objects and for almost all of them I won't ever need to drill down into the related objects.
You could use the inverse relationship to "manually" fetch the related objects, using a predicate to restrict the results. For example if Department has a to-many relationship to Employee and you want to fetch all the Employees for currentDepartment, the fetch might look like this:
NSFetchRequest *employeeFetch = [NSFetchRequest fetchRequestWithEntityName:@"Employee"];
employeeFetch.predicate = [NSPredicate predicateWithFormat:@"department == %@", currentDepartment];
This will fetch the required Employee objects in one go (*). You could then use either the array returned by the fetch, or the set given by the currentDepartment.employees relationship to search through. Depending on the complexity of the search you are performing, you might even be able to express it as another clause in the predicate, and avoid the need to loop at all.
(*) Technically, the objects returned by the fetch will still be faults (unless you set returnsObjectsAsFaults to false), but the data for these faults has been fetched from the store into the cache, so firing the fault will now have minimal overhead.

CoreData doesn't keep the sequence while saveContext

I am trying to insert data in CoreData. I have many records to insert, and this should be all or none. So I am creating an instance of NSManagedObject for each record and inserting it into the NSManagedObjectContext one by one.
When I call below method after inserting all records:
[_myManagedObjectContext saveContext:&error];
This method saves all inserted records to the persistent store. When I open the SQLite file generated by Core Data, I find all the records inserted by my app.
The problem is that the order is not the same. E.g. I inserted records with serial numbers 1-100 in sequence, but I see them in a random sequence in the Core Data SQLite file.
I know that I should not worry about the entries in the Core Data SQLite file, since I can always fetch records in sorted order using NSPredicate, but I need to keep the sequence because in some circumstances I need to study the database file.
Can someone tell me what to do to let saveContext method save records in the same order those are inserted to context?
The only way to reliably maintain the order of your objects is to add an additional attribute on the object, such as 'index', and then set the value once the object has been created. Use this attribute to sort the results when you retrieve them, or use the index to retrieve the objects in the required order.
The easiest way is to implement the NSManagedObject subclass's awakeFromInsert method and set a date property to NSDate(). Now you have a sequence with sub-millisecond resolution. When you need items in order, just sort by this date property.
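A minimal sketch of the awakeFromInsert approach, assuming a hypothetical entity named Record whose model has a Date attribute called createdAt (both names are made up for illustration):

```objectivec
#import <CoreData/CoreData.h>

// Hypothetical managed object subclass; the model is assumed to declare
// an entity "Record" with a Date attribute "createdAt".
@interface Record : NSManagedObject
@property (nonatomic, strong) NSDate *createdAt;
@end

@implementation Record
@dynamic createdAt;

- (void)awakeFromInsert {
    [super awakeFromInsert];
    // Runs once, when the object is first inserted into a context,
    // so the date records the moment of insertion.
    self.createdAt = [NSDate date];
}
@end

// Fetches all Record objects, oldest insertion first.
static NSArray *FetchRecordsInInsertionOrder(NSManagedObjectContext *context,
                                             NSError **error) {
    NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:@"Record"];
    fetch.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"createdAt"
                                                            ascending:YES]];
    return [context executeFetchRequest:fetch error:error];
}
```

This does not change the row order inside the SQLite file. It only gives you a stored value to sort by, both in fetches and when inspecting the file by hand. Objects inserted in very quick succession may end up with equal dates, in which case an explicit index attribute is the safer choice.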

NSFetchRequest: FetchBatchSize and Faulting Behaviour

I am new with Core Data, so sorry if this is a stupid question.
Is there a way to set the fetchBatchSize property on an automatic fetch request generated by firing a fault by accessing an NSManagedObject relationship?
For instance, let's say I have a "Companies" entity and an "Employees" entity with a one-to-many relationship from "Companies" to "Employees". I make a fetch request to retrieve all the companies, then for one company I would like to load its employees.
The obvious way would be to do something like this :
NSSet *employees = [anyCompany employees];
But then, how do I set the fetchBatchSize property to ensure not to load too much data at the same time?
Thank you in advance.
The fetchBatchSize just defines how many records are going to be retrieved in one round trip to the persistent store. For example, if you have 1000 entries for one entity and your batch size is 20, a fetch request fetching all entries will actually execute 50 SQL statements.
Depending on the context of your fetch, this may not be very efficient. You can calibrate the fetch request with the batch size if memory becomes an issue, but in most cases you really do not have to care about it too much. Unnecessary multiple round trips to the store, however, will most likely affect performance.
So just use an expression like
aCompany.employees
liberally and let Core Data deal with the memory management. It will typically only retrieve the entities and attributes it actually needs for display or calculation.

How can you avoid inserting duplicate records?

I have a web service call that returns XML which I convert into domain objects, I then want to insert these domain objects into my Core Data store.
However, I really want to make sure that I don't insert duplicates (the objects have a date stamp which makes them unique, which I would hope to use for the uniqueness check). I really don't want to loop over each object, do a fetch, then insert if nothing is found, as that would be really poor on performance...
I am wondering if there is an easier way of doing it? Perhaps a "group by" on my objects in memory???? Is that possible?
Your question already has the answer. You need to loop over them, look for them, if they exist update; otherwise insert. There is no other way.
Since you are uniquing off of a single value you can fetch all of the relevant objects at once by setting the predicate:
[myFetchRequest setPredicate:[NSPredicate predicateWithFormat:@"timestamp in %@", myArrayOfIncomingTimestamps]];
This will give you, as faults, all of the objects that already exist. You can then run an in-memory predicate against that array to retrieve the existing objects and update them.
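The fetch-once-then-update-or-insert idea could be sketched as below. The entity name Item, the key attribute uid, and the function UpsertItems are all hypothetical, and the sketch swaps the in-memory predicate for a dictionary lookup, which is a different but equivalent way to find the existing object for each incoming key.

```objectivec
#import <CoreData/CoreData.h>

// Hypothetical helper: `incoming` maps each unique key to the attribute values
// that should be stored for it. Assumes an entity "Item" with a string attribute "uid".
static BOOL UpsertItems(NSManagedObjectContext *context,
                        NSDictionary<NSString *, NSDictionary *> *incoming,
                        NSError **error) {
    // One fetch for every incoming key, instead of one fetch per object.
    NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:@"Item"];
    fetch.predicate = [NSPredicate predicateWithFormat:@"uid IN %@", incoming.allKeys];
    NSArray *existing = [context executeFetchRequest:fetch error:error];
    if (existing == nil) {
        return NO;
    }

    // Index the objects that already exist by their key.
    NSMutableDictionary *existingByID =
        [NSMutableDictionary dictionaryWithCapacity:existing.count];
    for (NSManagedObject *object in existing) {
        existingByID[[object valueForKey:@"uid"]] = object;
    }

    [incoming enumerateKeysAndObjectsUsingBlock:^(NSString *uid, NSDictionary *values, BOOL *stop) {
        NSManagedObject *object = existingByID[uid];
        if (object == nil) {
            // Not in the store yet: insert a new object with this key.
            object = [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                                   inManagedObjectContext:context];
            [object setValue:uid forKey:@"uid"];
        }
        // Update path and insert path both apply the incoming values.
        [object setValuesForKeysWithDictionary:values];
    }];

    return [context save:error];
}
```

For very large imports the list of keys passed to the IN predicate would probably need to be split into chunks. That part is left out here.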
Also, a word of advice. A timestamp is a terrible unique ID. I would highly recommend that you reconsider that.
Timestamps are not unique. However, we'll assume that you have unique IDs (e.g. a UUID/GUID/whatever).
In normal SQL-land, you'd add an index on the GUID and search, or add a uniqueness constraint and then just attempt the insert (and if the insert fails, do an update), or do an update (and if the update fails, do an insert, and if the insert fails, do another update...). Note that the default transactions in many databases won't work here — they lock rows, but you can't lock rows that don't exist yet.
How do you know a record would be a duplicate? Do you have a primary key or some other unique key? (You should.) Check for that key -- if it already exists in an Entity in the store, then update it, else, insert it.
