Efficient CoreData fetches using subqueries - ios

I've got an iOS Core Data performance problem.
Let's assume that we have 2 classes: "Class_A" and "Class_B". Each has its own id, and there is a one-to-one relationship between them.
Now let's assume that I'm downloading data from the web that allows me to create these classes (the data contains the ids of Classes A and B, and information about their relations). For example:
"There will be instance of Class_A with id=1"
"There will be instance of Class_A with id=2"
"There will be instance of Class_A with id=3"
"There will be instance of Class_B with id=10"
"There will be instance of Class_B with id=11"
"There will be instance of Class_B with id=12"
"Class_A with id=1 will be connected with Class_B with id = 12"
"Class_A with id=2 will be connected with Class_B with id = 11"
"Class_A with id=3 will be connected with Class_B with id = 10"
Because all this information can arrive in random order (for example, information about a connection between classes X and Y can be downloaded before the info about the existence of X and Y), I've used another kind of entity: Relationship.
Relationship consists of 2 fields: class_A_id and class_B_id
Every time I receive data about a relationship between ClassA and ClassB, I create an instance of the Relationship entity with the corresponding property values.
Then, at regular intervals, I iterate over all instances of the "Relationship" entity and try to create the proper relations like this:
1. Fetch all instances of the "Relationship" entity
2. For every fetched Relationship, fetch the instances of ClassA and ClassB matching the "class_A_id" and "class_B_id" ids stored in the Relationship
3. If both the ClassA and the ClassB instance exist, create the relationship between them.
4. Remove the Relationship instance from Core Data
After implementing this algorithm I found that it works well with rather low quantities of objects in Core Data.
The problem is that I'm storing thousands of "Relationship" entities (information about them is downloaded first), while information about the existence of "ClassA" and "ClassB" is downloaded less frequently.
The result is that every time I try to create relationships using the "Relationship" entities, I'm fetching thousands of objects that contain ids of classes that don't exist!
So my solution to this problem is to enhance the first step of the proposed algorithm:
Instead of fetching all relationships in Core Data, fetch only those that contain ids of classes that exist in the system.
In SQL it would probably look something like this:
SELECT r.* FROM Relationship r
WHERE (SELECT COUNT(*) FROM ClassA a WHERE a.id = r.class_A_id) = 1
AND (SELECT COUNT(*) FROM ClassB b WHERE b.id = r.class_B_id) = 1;
My question is - how do I achieve a query like this in Core Data? Use subqueries there? If yes - how? :)

If I understood correctly, and if you don't want to modify your model too much, you have two options:
First option: modify the request process to ask only for relationships between entities you already have. This is easy if you don't have many entities of type A and B. Otherwise implement some sort of 'delta'; this avoids storing thousands of relationships you don't need (unless you need them for another purpose).
Second option: I suppose downloading items is a background process. In that case I would do the check immediately there, rather than in a batch from time to time. For example, when you download a relationship, check whether you already have A or B and mark the relationship record as complete or incomplete (maybe by adding two attributes, aExists and bExists). As soon as an entity A comes in, mark all the relationship records that reference it as aExists, and do the same when a B comes in; this is a normal fetch request on class_A_id (or class_B_id). Then the batch can simply fetch only the records marked as complete and create the relationships. You have to implement a little logic, but in my opinion things would get faster.
New relationship would be:
Relationships {
class_A_id,
class_B_id,
classAexists BOOL,
classBexists BOOL,
batchRelationshipCreated BOOL,
}
Well you can implement both options anyway :-)
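The bookkeeping from the second option can be sketched in plain Swift, without Core Data, to show the order of operations. This is only a rough model under assumptions: PendingLink, LinkQueue and their method names are hypothetical, and the arrays and sets stand in for what would be managed objects and fetch requests in the real app.

```swift
// Plain-Swift model of the bookkeeping; all type and method names are hypothetical.
struct PendingLink {
    let classAId: Int
    let classBId: Int
    var classAExists = false
    var classBExists = false

    // A link is ready once both endpoints have been downloaded.
    var isComplete: Bool { classAExists && classBExists }
}

struct LinkQueue {
    private(set) var links: [PendingLink] = []
    private var knownA: Set<Int> = []
    private var knownB: Set<Int> = []

    // Called when a relationship record arrives: flag it immediately.
    mutating func addLink(a: Int, b: Int) {
        links.append(PendingLink(classAId: a, classBId: b,
                                 classAExists: knownA.contains(a),
                                 classBExists: knownB.contains(b)))
    }

    // Called when an instance of Class_A arrives: flag the links waiting for it.
    mutating func didReceiveA(id: Int) {
        knownA.insert(id)
        for i in links.indices where links[i].classAId == id {
            links[i].classAExists = true
        }
    }

    // Same for Class_B.
    mutating func didReceiveB(id: Int) {
        knownB.insert(id)
        for i in links.indices where links[i].classBId == id {
            links[i].classBExists = true
        }
    }

    // The periodic batch only touches complete links and removes them.
    mutating func drainComplete() -> [PendingLink] {
        let ready = links.filter { $0.isComplete }
        links.removeAll { $0.isComplete }
        return ready
    }
}
```

In Core Data the drain step would presumably become a fetch on the Relationship entity with a predicate along the lines of "classAexists == YES AND classBexists == YES", so incomplete records are never loaded.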

Related

Core Data model - entities and inverses

I'm new to Core Data and I'm trying to implement it into my existing project. Here is my model:
Now, there's some things that don't make sense to me, likely because I haven't modelled it correctly.
CMAJournal is my top level object with an ordered set of CMAEntry objects and an ordered set of CMAUserDefine objects.
Here's my problem:
Each CMAUserDefine object has an ordered set of objects. For example, the "Baits" CMAUserDefine will have an ordered set of CMABait objects, the "Species" CMAUserDefine will have an ordered set of CMASpecies objects, etc.
Each CMAEntry object has attributes like baitUsed, fishSpecies, etc. that point to an object in the respective CMAUserDefine object. This is so if changes are made, each CMAEntry that references that object is also changed.
Now, from what I've read I should have inverses for each of my relationships. This doesn't make sense in my model. For example, I could have 5 CMAEntry objects whose baitUsed property points to the same CMABait object. Which CMAEntry does the CMABait's entry property point to if there are 5 CMAEntry objects that reference that CMABait? I don't think it should point to anything.
What I want is for all CMAUserDefine objects (i.e. all CMABait, CMASpecies, CMALocation, etc. objects) to be stored in the CMAJournal userDefines set, and have those objects be referenced in each CMAEntry.
I originally had this working great with NSArchiving, but the archive file size was MASSIVE. I mean, 18+ MB for 16 or so entries (which included about 20 images). And from what I've read, Core Data is something I should learn anyway.
So I'm wondering, is my model wrong? Did I take the wrong approach? Is there a more efficient way of using NSArchiver that will better fit my needs?
I hope that makes sense. Please let me know if I need to explain it better.
Thanks!
Edit: What led me to this question is getting a bunch of "Dangling reference to an invalid object." = "" errors when trying to save.
A. Some Basics
Core Data needs an inverse relationship to model the relationship. To make a long story short:
In an object graph as modeled by Core Data, a reference semantically points from the source object to a destination object. Therefore you use a single reference, such as CMAEntry's fishSpecies, to model a to-one relationship and a collection such as NSSet to model a to-many relationship. You do not care about the type of the inverse relationship. In many cases you do not have one at all.
In a relational database, relationships are modeled differently: if you have a 1:N (one-to-many) relationship, the relationship is stored on the destination side. The reason is that in an rDB every entity has a fixed size and therefore cannot reference a variable number of destinations. If you have a many-to-many relationship (N:M), an additional table is needed.
As you can see, in an object graph the types of relationships are to-one and to-many only depending on the source, while in rDB the types of relationships are one-to-one, one-to-many, many-to-many depending on both source and destination.
To select the right kind of rDB modeling Core Data wants to know the type of the inverse relationship.
Type | Object graph | Inverse    | rDB
1:1  | to-one id    | to-one id  | source or destination attribute
1:N  | collection   | to-one id  | destination attribute
N:M  | collection   | collection | additional table with two attributes
B. To your Q
In your case, if a CMAEntry object refers to exactly one CMASpecies object, but a CMASpecies object can be referred to by many CMAEntry objects, this simply means that the inverse relationship is a to-many relationship.
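To make that concrete, here is a small plain-Swift sketch of a to-one reference whose inverse is a collection. Species and Entry are hypothetical stand-ins for CMASpecies and CMAEntry, not Core Data classes, and the didSet observer only imitates the way Core Data keeps a modeled inverse in sync.

```swift
// Hypothetical plain-Swift stand-ins for the CMASpecies and CMAEntry entities.
// The strong references in both directions form a cycle; acceptable for a sketch only.
final class Species {
    let name: String
    // Inverse (to-many) side: every entry that currently references this species.
    private(set) var entries: [Entry] = []

    init(name: String) { self.name = name }

    fileprivate func add(_ entry: Entry) { entries.append(entry) }
    fileprivate func remove(_ entry: Entry) { entries.removeAll { $0 === entry } }
}

final class Entry {
    // To-one side: an entry refers to at most one species.
    var fishSpecies: Species? {
        didSet {
            // Keep the inverse collection in sync whenever the reference changes.
            oldValue?.remove(self)
            fishSpecies?.add(self)
        }
    }
}
```

Five entries pointing at the same species simply end up as five members of that species' entries collection, so no single "owner" has to be chosen.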
Yes, it is strange for an OOP developer to have such inverse relationships. For an SQL developer, it is the usual case. When developing an ORM (object-relational mapper), this is one of the problems. (I know that, because I'm doing that for Objective-Cloud right now. But I did it differently, more from the OOP point of view.) Every solution is somewhat unusual for one side. Somebody called ORM the "Vietnam of software development".
To have a simpler example: modeling a sports league, you will find yourself having an entity Match with the properties homeTeam and guestTeam. On the team side you would not want homeMatches and guestMatches, but simply matches. That is obviously not an inverse. Simply add the inverse relationships if Core Data wants them, and don't worry about them.

Reason for setting relationships among entities in Core Data

After learning about relationships between entities in Core Data, I don't see the real reason for setting up a relationship between two entities. They can be connected separately if one of the entities contains a property that can hold another entity, by having a property of type NSManagedObject.
@property (nonatomic, strong) NSManagedObject *AssetType;
This is a concept you must understand: Core Data is not a database but an object graph manager that, as a second functionality, offers persistence (for example, using an SQLite store).
That said, if you have two separate entities and you need to retrieve values based on conditions that belong to the other entity, you need to run two requests and filter the results in memory. By contrast, if you set up a relationship you can just create a request with a specific predicate and let Core Data retrieve the correct results for you. In addition, through relationships you can access objects that belong to another entity as simply as accessing a property. For example, the following snippet says that from entityA I can access a property called someRelationship that retrieves one (or more) entities of type EntityB. If someRelationship has been set up as to-many, you'll receive one or more EntityB entities.
entityB = entityA.someRelationship;
The real advice is to think in terms of an object graph!
Further reference: Core Data Overview by objc.io.
Update 1
The other big advantage is that relationships allow you to take advantage of deletion rules and, through inverse relationships, you are able to maintain the integrity of your graph.
See Relationships and Fetched Properties.

Performing the equivalent of a union with Core Data for a UITableViewController

I know union is a SQL construct, but it's the best analogue for what I'm trying to do.
I have multiple groups of data that I'm receiving from an external source. I'm maintaining them as separate entities in Core Data (they only have some attributes in common (e.g. name)), but I want to present them in the same tableView.
Say I have an entity Food that has relationships with FruitGroup and VegetableGroup. The FruitGroup has a relationship with Fruit which has a relationship with FruitType. The VegetableGroup is similar.
How can I use FruitGroup.Fruit.name and VegetableGroup.Vegetable.name as sectionTitles? And FruitGroup.Fruit.FruitType.name and VegetableGroup.Vegetable.VegetableType.name for row data. (I tried coming up with a predicate that walks down from Food, but that doesn't appear to be workable)
Example modeled data (my groups are far more disparate than fruits and veggies, so re-doing my data model is not an option):
Food
    FruitGroup
        Apple
            Macintosh
            Granny Smith
        Pear
            Bartlett
            Asian
            Anjou
    VegetableGroup
        Asparagus
            white
            wild
        Peas
            baby
            split
Which I would like to appear as:
Apple [section]
    Macintosh [row]
    Granny Smith
Pear
    Bartlett
    Asian
    Anjou
Asparagus
    white
    wild
Peas
    baby
    split
I could use multiple NSFetchedResultsControllers in the UITableViewController and conditionally select the FRC within each of the UITableViewDataSource methods, but that doesn't feel clean.
I'm thinking about subclassing NSFetchedResultsController and, internal to my subclass, merging the results of multiple private NSFetchedResultsControllers that each represent one of the entities. (e.g. sections returns a concatenation of the returns from the sections calls of the internal FRCs)
Does this make sense - or is there a better way? (I saw Core Data Union Query Equivalent but since there are relationships among my entities, I wanted to seek alternatives)
While you can do this as described in the other answers (via creating an abstract Parent entity), I would not recommend it. The performance when it comes to dealing with abstract parents gets bad very quickly. The reason for this is that Core Data will put all of the children into a single table in the underlying SQLite file.
I would suggest going a different route. Have a single entity called Food with attributes describing whether it is a vegetable or a fruit. Then you have one NSFetchedResultsController which uses the type of the food item as the sectionNameKeyPath, and you will get your display the way that you want it.
I recommend creating entities in Core Data based on what the objects are at a very loose level. I would not create entities for Honda, Ford and Dodge, but create an entity for Car and perhaps a type attribute or a relationship to a manufacturer.
While Core Data can be backed by a database, at the end of the day it is not a database but an object graph and should be treated as such. Trying to normalize the database will result in poor performance of the object graph.
You should probably look into abstract entities. For example, you could create an abstract entity called Food. Then you're able to create Fruit and Vegetables, which inherit from the abstract entity. You'll have to set Food as the "Parent Entity".
Then you could fetch all the items with the entity Food, which includes both Fruit and Vegetables. Based on your post, you'll probably have a relation from Food to FoodGroup.
To answer your question:
You cannot unify different entity types (if they are not subclasses of the same entity) under a single fetch request. You can define an entity (B) to inherit from another entity (A) and then fetch by the parent entity (A) and get both kinds of entities (As and Bs).
You can try and think of it this way:
Item ("Macintosh","White Asparagus",...) has a relationship to Group ("Apple","Asparagus",...), and Group has a relationship to Area (or simply to another parent group).
In this manner you could use a single FRC with sectionNameKeyPath of "group.name" and entity Item (you can filter by "group.area" to only select food items).
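The effect of sectioning a single list by "group.name" can be sketched in plain Swift. This is not NSFetchedResultsController itself, only an illustration of the output it would be expected to give when the fetch is sorted by the section key first. Item, Section and makeSections are hypothetical names, and groupName stands in for the "group.name" key path.

```swift
// Hypothetical flattened model: every row is an Item that knows its group's name.
struct Item {
    let name: String
    let groupName: String   // stands in for the key path "group.name"
}

struct Section {
    let title: String
    let rows: [String]
}

// Sort by the section key first, then split the sorted list into runs of equal keys.
func makeSections(from items: [Item]) -> [Section] {
    let sorted = items.sorted {
        ($0.groupName, $0.name) < ($1.groupName, $1.name)
    }
    var sections: [Section] = []
    for item in sorted {
        if let last = sections.last, last.title == item.groupName {
            // Same key as the previous row: extend the current section.
            sections[sections.count - 1] =
                Section(title: last.title, rows: last.rows + [item.name])
        } else {
            // New key: start a new section.
            sections.append(Section(title: item.groupName, rows: [item.name]))
        }
    }
    return sections
}
```

Because fruits and vegetables are all Item rows here, a single sorted list is enough to produce the Apple, Pear, Asparagus and Peas sections from the question.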

Core Data cascade on Data Inserts

I'm currently learning to use Core Data on iOS. In my test application I have two entities with an inverse relation. The delete cascade is working fine, but I wonder if it is possible to have an update or insert cascade as well? For example, if I create a new instance of entity 1, I want some of its attributes to be copied onto a new object of entity 2.
Do I have to write some code for this or is there some built in solution?
Searching the internet gave me no results.
(Also, since I'm new to Core Data, I'm thinking in terms of tables, as my persistent store is SQLite, so an insert into one table must essentially copy a few attributes into another table.)
Try to think of it in a different way. If those two objects share those properties, perhaps it would be best to create another entity that contains those fields, and to which entity 1 and entity 2 both have a relationship. Having multiple copies of the same data just doesn't seem like a good idea where it can be avoided.
(You haven't mentioned multiplicity of the relationship, which could be important.)
Not sure if this directly addresses your question, but …
If you have A <--> B. (1-to-1 relationship)
Cascade rules:
A cascades: B
B nils: A
(this is an A "owns" B description)
(above A/B == entities, below A/B == instances of entities)
if A(1) -> B(2)
and then you set
A(3) -> B(2)
B(2)'s reverse relationship to A(1) is nil'd out before it's set to A(3)
A(1) is left with a nil value (if that's not valid in the data model description, you're now in trouble, otherwise, it's B-less)
A(1) -> <nil>

Traversing one-to-many relationships with NSFetchedResultsControllers

I am creating an app that navigates through multiple levels of one-to-many relationships. So for example, pretend that the CoreDataBooks code sample starts with a list of genres, you click on a genre and then get the list of books organized by author as seen in Apple's code sample.
Here is my problem: the Apple documentation tells me I should use a FetchedResultsController to help organize my list of books into sections (among other reasons). But when trying to figure out how to get from "one" genre to my "many" books, the Core Data FAQ tells me not to use a fetch. From the FAQ:
I have a to-many relationship from Entity A to Entity B. How do I fetch the instances of Entity B related to a given instance of Entity A?
You don’t. More specifically, there is no need to explicitly fetch the destination instances, you simply invoke the appropriate key-value coding or accessor method on the instance of Entity A.
The problem, of course, is I now have my books in a set, but I want to get them from a fetched results controller.
What is the best way to proceed here? Should I follow the FAQ, and if so, how do I manage dividing my books up into sections by author?
Or do I use a fetched results controller (which I suspect is better), in which case how do I traverse the one-to-many relationship (since Apple's oh-so-helpful answer is simply "don't")?
Many thanks for your help.
Sasha
You have a data model that looks roughly like this:
Genre{
name:
books<-->>Book.genre
}
Book{
name:
genre<<-->Genre.books
}
In your master table, you run a fetched results controller to get a table of Genre objects. Then the user selects one of the rows, which behind the scenes selects a particular Genre object.
Since every Genre object has a books relationship that points to the related Book objects, you have automatically got a reference to all the related book objects so you don't have to fetch anything. For your book tableview you just create a sorted array of the Book objects in the selected Genre object's books relationship.
Think of a Core Data object graph as a clump of bead strings all woven together in a web or fabric. The beads are objects and the strings are relationships. A fetch plucks one of the bead/objects from the clump but once you have that bead/object in hand, then you can just pull on its string/relationship to pull out all the beads/objects related to the bead in your hand.
So, fetches are used in most cases just to find the starting objects, then you walk relationships to find most of the other objects.
That is why the Apple docs say you don't need a second fetch.
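As a rough illustration of walking the relationship instead of fetching, the sketch below turns a genre's set of books into author sections in plain Swift. Book, Genre and sectionsByAuthor are hypothetical stand-ins; in the real app the set would come from the selected Genre managed object's books relationship.

```swift
// Hypothetical value types standing in for the Book and Genre managed objects.
struct Book: Hashable {
    let title: String
    let author: String
}

struct Genre {
    let name: String
    let books: Set<Book>   // stands in for the to-many "books" relationship
}

// Build table sections (one per author) from the relationship, with no second fetch.
func sectionsByAuthor(in genre: Genre) -> [(author: String, titles: [String])] {
    // Group the unordered set by author.
    let byAuthor = Dictionary(grouping: genre.books, by: { $0.author })
    // Sort sections by author name and rows by title.
    return byAuthor
        .sorted { $0.key < $1.key }
        .map { entry in
            (author: entry.key, titles: entry.value.map { $0.title }.sorted())
        }
}
```

The resulting array can back the table view's data source directly: the element count gives the number of sections, and each element's titles give the rows for that author.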

Resources