Core Data design principles - ios

I just started reading this guide: https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/CoreData/KeyConcepts.html#//apple_ref/doc/uid/TP40001075-CH30-SW1
And it basically has (in my opinion) two big contradictions:
I get them both, but basically, if I follow the first "implement a custom class to the entity from which classes representing subentities also inherit"-statement, then ALL my entities will be put in the same table. Which could cause performance issues, according to the NOTE.
How big of a performance hit would I run into if I create a "custom super entity"?

You can use the inheritance mechanism to get a default database structure. From your link:
If you have a number of entities that are similar, you can factor the common properties into a superentity, also known as a parent entity.
There is no contradiction. The documentation is just telling you what the database structure is going to be when you use a certain facility. (And it is the standard database table idiom for inheritance.) Using the entity inheritance mechanism automatically declares and implements default parent-child class inheritance functionality along with a parent table. Otherwise you do any parent-child class inheritance declaration and implementation by hand. Each comes with certain performance and other characteristics.
Design involves tradeoffs between costs and benefits over multiple dimensions. "Performance" itself involves multiple dimensions, and has no meaning outside of given application usage patterns. Other dimensions relevant here include complexity of both construction and maintenance.
If you query about entities as parents sufficiently frequently then it can be better to have all parent data in its own table. But if you sufficiently rarely ask for the parent data while querying about a given child type or if you sufficiently frequently need both child and parent data then it can be better to only have parent data in the child tables or table. But notice that each design performs worse at the other kind of query.

The first is talking about sub-entities. The second is talking about subclasses. These are 2 different hierarchies.
One use for sub-entities is a table where you want to show cells displaying different entities. If you make them sub-entities, you can fetch the parent entity and all sub-entities will be returned. This is actually how the Notes app shows the "All Notes" cell above the folders: that cell displays the Account entity, and both Account and Folder are sub-entities of NoteContainer, which is what is fetched. This does mean all of the rows are in the same table. Personally I have not experienced any performance problems, but it is something to keep in mind when modifying the entities in other ways, for example with indexes, relations or constraints.
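To make the "fetch the parent, get every sub-entity" idea concrete, here is a minimal sketch using plain SQLite from Python rather than Core Data itself. The table name, the `entity` discriminator column and the `email` and `parent_id` attributes are all assumptions made up for illustration; the real Notes schema is not shown in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table holds every sub-entity. `entity` is a hypothetical discriminator
# column; columns that belong to only one sub-entity are NULL for the others.
conn.execute(
    """
    CREATE TABLE note_container (
        id        INTEGER PRIMARY KEY,
        entity    TEXT NOT NULL,   -- 'Account' or 'Folder'
        name      TEXT NOT NULL,   -- attribute shared via the parent entity
        email     TEXT,            -- only meaningful for Account rows
        parent_id INTEGER          -- only meaningful for Folder rows
    )
    """
)
conn.executemany(
    "INSERT INTO note_container (entity, name, email, parent_id) VALUES (?, ?, ?, ?)",
    [
        ("Account", "All Notes", "me@example.com", None),
        ("Folder", "Recipes", None, 1),
        ("Folder", "Work", None, 1),
    ],
)


def fetch_containers(entity=None):
    """Fetch the parent entity (every row) or a single sub-entity."""
    if entity is None:
        rows = conn.execute("SELECT entity, name FROM note_container ORDER BY id")
    else:
        rows = conn.execute(
            "SELECT entity, name FROM note_container WHERE entity = ? ORDER BY id",
            (entity,),
        )
    return rows.fetchall()


all_rows = fetch_containers()         # Account and Folder rows together
folders = fetch_containers("Folder")  # only the Folder sub-entity
print(all_rows)
print(folders)
```

The sparse `email` and `parent_id` columns are the cost of this layout: every row carries the union of all sub-entity attributes, which is what the documentation's note is warning about.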

I'm not familiar with this quirk of SQLite, but modeling base class/subclass relationships is usually done with different tables. There is one table that represents the base class, which contains the attributes common to all derived classes (Vehicles), and a different table for each subclass, which contains the attributes unique to that subclass (Cars, Trains, Airplanes).
Performance is no better or worse than any entity normalized across different tables.
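The base-table-plus-subclass-tables layout described in this answer could look roughly like the sketch below. The column names (`maker`, `door_count`, `carriage_count`) are invented for the example, and this is generic SQLite, not what Core Data generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Base table: attributes common to every vehicle.
    CREATE TABLE vehicles (
        id    INTEGER PRIMARY KEY,
        maker TEXT NOT NULL
    );
    -- One table per subclass, keyed by the base table's id.
    CREATE TABLE cars (
        vehicle_id INTEGER PRIMARY KEY REFERENCES vehicles(id),
        door_count INTEGER NOT NULL
    );
    CREATE TABLE trains (
        vehicle_id     INTEGER PRIMARY KEY REFERENCES vehicles(id),
        carriage_count INTEGER NOT NULL
    );
    INSERT INTO vehicles (id, maker) VALUES (1, 'Acme Motors'), (2, 'Acme Rail');
    INSERT INTO cars (vehicle_id, door_count) VALUES (1, 4);
    INSERT INTO trains (vehicle_id, carriage_count) VALUES (2, 12);
    """
)

# Reading a complete Car needs a join between the subclass and base tables.
cars = conn.execute(
    "SELECT v.maker, c.door_count "
    "FROM cars c JOIN vehicles v ON v.id = c.vehicle_id"
).fetchall()

# Querying only the shared attributes touches the base table alone.
all_vehicles = conn.execute("SELECT maker FROM vehicles ORDER BY id").fetchall()
print(cars)
print(all_vehicles)
```

Compared with a single shared table, there are no NULL-padded columns here, but every read of a full subclass object pays for a join.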

Related

Inherit from SPManagedObject

In Simperium's iOS/OSX tutorial you say each modeled object should inherit from SPManagedObject.
I didn't try it yet, but doesn't that lead to one big table in the SQLite database that contains a union of all fields of all modeled managed objects?
Yes, under the hood Core Data will tend to create a bigger table. Generally performance will suffer more from relations though, not inheritance:
Using Parent Entity in CoreData Models
We've done integrations with fairly complex inheritance hierarchies and didn't see any immediate issues with a fair amount of data.
Having said that, should you need more control over your table structure, you can avoid having a single parent for all your objects and instead either:
Manually add the ghostData and simperiumKey attributes to the objects you want to sync, and ensure their class is SPManagedObject (or ensure their custom class inherits from SPManagedObject), or
Create more than one parent entity with ghostData and simperiumKey attributes, and inherit from those for the parts of your model where it makes sense, depending on how you'd like the underlying tables to be structured.

core data structure

I'm about to add the persistence layer to my application, and I decided to give Core Data a go. Currently I map all my models to entities, which seems to work quite well. But in my current implementation I use something I call "collections" (of models); for example, I have a collection of tile slots in a game.
This SlotsCollection class has methods like findNextInSameRow(), findAvailableSlot() etc. What I've done with Core Data is create a Game entity and add a to-many relationship to the Slot entity. Is there a way to define a class which the collection of slots should be instantiated with, so I can put my logic inside that? Or is there a better way for me to structure things? I guess I could create "managers" inside my Game entity and hand in the slots when initialized:
SlotManager* manager = [[SlotManager alloc] initWithSlots:self.slots];
Slot* slot = [manager findAvailableSlot];
Also, after I "migrated" all my models to entities, I have a lot of entities that do not have any attributes but only hold references to other entities. I'm a bit afraid I'm using the wrong mindset when structuring the Core Data model.
The class that has the collection should have the logic for that collection.
If you have a 1-to-many relationship from A to B, then you'd put the logic about this relationship into class A — and possibly some of it inside class B (depending on your needs).
Note: If you're iterating through relationships, you need to be aware of faulting behavior etc. Whenever Core Data has to do actual database work, you incur a performance hit. That's no different than plain old SQL. If you don't have to "go to disk", things are very fast. If you're using a fetch request, you will always do database work, and things will always be (relatively) expensive.
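As a language-neutral sketch of "the class that has the collection should have the logic", the example below puts the slot queries directly on the owning Game object instead of in a separate manager. It is plain Python, not Core Data; in the asker's project these would presumably be methods on the Game managed-object class. The `row`, `column` and `occupied` attributes are assumptions, since the question does not list the Slot attributes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slot:
    row: int
    column: int
    occupied: bool = False


@dataclass
class Game:
    # Stands in for the to-many relationship from Game to Slot.
    slots: List[Slot] = field(default_factory=list)

    def find_available_slot(self) -> Optional[Slot]:
        """Return the first unoccupied slot, or None if every slot is taken."""
        return next((s for s in self.slots if not s.occupied), None)

    def find_next_in_same_row(self, slot: Slot) -> Optional[Slot]:
        """Return the nearest slot to the right of `slot` in the same row."""
        candidates = [
            s for s in self.slots
            if s.row == slot.row and s.column > slot.column
        ]
        return min(candidates, key=lambda s: s.column, default=None)


game = Game([Slot(0, 0, occupied=True), Slot(0, 1), Slot(1, 0)])
available = game.find_available_slot()
neighbour = game.find_next_in_same_row(game.slots[0])
print(available, neighbour)
```

Both methods walk the in-memory collection, which corresponds to the cheap case in the note above; a version backed by a fetch request would hit the store on every call.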

Is it difficult to manage inheriting a client dataset from another?

Say I have a client dataset CDSPerson that acts as a wrapper around a Persons database table. Say I have another table, PersonBenefits, that joins 1:1 back to the Persons table.
Say I wrap a Delphi class around CDSPerson, PersonClass, and another class around CDSPersonBenefits, PersonBenefitsClass, to read and write records. PersonBenefitsClass inherits from PersonClass so it can provide data from both tables. I'd like to be able to write data back to either table through PersonBenefitsClass.
Has anyone developed a clean way to handle the SQL query, provider flags and commit logic in the inherited class so that (a) fields stay aligned with the parent class and (b) both database tables can be updated?
Is there a reference for this that I can't find? Is this just a bad idea? I'm using Delphi 2007.
If you're going to develop a business-object-to-database mapping framework, (commonly known as ORM, Object-Relational Mapper,) you're going to need to put in a bit of architecture to make relationships like this work properly. Here's one way to do it:
PersonClass and BenefitsClass both inherit from BusinessObjectClass. BusinessObjectClass is a base class that contains the general logic to interact with the dataset. It has a list object of some sort that contains a list of relation objects.
Each relation object is a special object that contains either one or a list of BusinessObjectClass descendants, plus extra data describing the foreign-key relationship between the two tables. When BusinessObjectClass does its queries and its updates, it needs to iterate through all its relation objects and have them do their own queries and updates as appropriate.
In your composite object, (PersonWithBenefitsClass,) in the constructor, call inherited and then set up a relation object that describes the related BenefitsClass. Make sure that any inserts of new objects are done in the right order to preserve referential integrity.
That's the basic idea. (One basic idea. There are probably plenty of other ways to do it.) I'll leave the details of exactly how you implement it up to you.
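A rough sketch of that structure, written in Python with SQLite rather than Delphi client datasets, might look like this. The table layout, the class names other than those mentioned above, and the way a relation is stored (a foreign-key column name paired with a child object) are all assumptions chosen to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person_benefits (
        person_id INTEGER PRIMARY KEY REFERENCES persons(id),
        plan      TEXT NOT NULL
    );
    """
)


class BusinessObject:
    """Base class: knows how to insert itself and then its related objects."""

    table = ""  # set by subclasses

    def __init__(self, **fields):
        self.fields = fields
        self.relations = []  # (foreign_key_column, child BusinessObject) pairs

    def save(self, conn):
        columns = ", ".join(self.fields)
        marks = ", ".join("?" for _ in self.fields)
        cursor = conn.execute(
            f"INSERT INTO {self.table} ({columns}) VALUES ({marks})",
            tuple(self.fields.values()),
        )
        key = cursor.lastrowid
        # The parent row is written first so that each child's foreign key
        # already points at an existing row (referential integrity).
        for fk_column, child in self.relations:
            child.fields[fk_column] = key
            child.save(conn)
        return key


class Person(BusinessObject):
    table = "persons"


class Benefits(BusinessObject):
    table = "person_benefits"


class PersonWithBenefits(Person):
    def __init__(self, name, plan):
        super().__init__(name=name)  # "call inherited"
        self.relations.append(("person_id", Benefits(plan=plan)))


with conn:  # one transaction covers both tables
    person_id = PersonWithBenefits("Ada", "Gold").save(conn)

row = conn.execute(
    "SELECT p.name, b.plan "
    "FROM persons p JOIN person_benefits b ON b.person_id = p.id"
).fetchone()
print(person_id, row)
```

Updates and queries would follow the same pattern of walking `relations`; they are left out here to keep the sketch small.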

What to consider when deciding to use Single Table Inheritance

I'm getting ready to start a small project that provides an opportunity to use single table inheritance. As I read through prior posts on STI on Stack Overflow, there seem to be strong opinions on both sides of the argument.
My application is related to my horse racing hobby. A horse's connections are defined as its current jockey, trainer and owner. The jockey, trainer and owner could be modeled using three separate tables (models/classes) or as one class with several sub-classes through single table inheritance.
When faced with a decision like this, is there a checklist of questions that one can go through to determine which approach is preferable? I'm assuming that using STI would reduce the number of potential joins. What are the other practical considerations?
There are a few things you should think about:
Are the objects, conceptually, children of a single parent?
Don't use single table inheritance just because your classes share some attributes; make sure there is actually an OO inheritance relationship between each of them and an understandable parent class.
Do you need to do database queries on all objects together?
If you want to list the objects together or run aggregate queries on all of the data, you’ll probably want everything in the same database table for speed and simplicity.
Do the objects have similar data but different behavior?
If you have a larger number of model-specific columns, you should consider polymorphic associations instead.
The article linked goes in depth a bit more.
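For the "similar data but different behavior" case, the mechanics of single table inheritance can be sketched as follows. This is plain Python with SQLite, not Rails; the `type` discriminator column mirrors the column name Rails uses by default, while the `role` method and the sample names are invented for the example.

```python
import sqlite3


class Connection:
    """Parent class: all three kinds of connection share the same data."""

    def __init__(self, name):
        self.name = name

    def role(self):
        return "is connected to the horse"


class Jockey(Connection):
    def role(self):
        return "rides the horse"


class Trainer(Connection):
    def role(self):
        return "trains the horse"


class Owner(Connection):
    def role(self):
        return "owns the horse"


TYPES = {"Jockey": Jockey, "Trainer": Trainer, "Owner": Owner}

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE connections ("
    "id INTEGER PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL)"
)
db.executemany(
    "INSERT INTO connections (type, name) VALUES (?, ?)",
    [("Jockey", "Pat"), ("Trainer", "Sam"), ("Owner", "Lee")],
)


def load_connections():
    """Instantiate the subclass named in each row's `type` column."""
    rows = db.execute("SELECT type, name FROM connections ORDER BY id")
    return [TYPES[type_name](name) for type_name, name in rows]


people = load_connections()
# Aggregate queries across all kinds need no joins or unions.
counts = dict(db.execute("SELECT type, COUNT(*) FROM connections GROUP BY type"))
print([(p.name, p.role()) for p in people])
print(counts)
```

If each subclass started to need several columns of its own, this single table would fill with NULLs, which is the point at which the checklist above suggests looking at separate tables or polymorphic associations.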

Modeling a database (ERD) that has quirky behavior

One of the databases that I'm working on has some quirky behavior that I want to account for in the entity-relationship diagram.
One of the behaviors is that there is a 'booking' table and an 'invoice' table. When a 'booking' is invoiced, the record is inserted into the 'invoice' table and then deleted from the 'booking' table.
However, a reference is still kept of the booking number.
How do we model this? Big arrow between the tables and some text beside it describing what happens?
No, changing the database schema is not possible at this point in time
Edit: This is the type of diagram that I want to use:
http://img813.imageshack.us/img813/5601/erdartistperformssong.png
If, by ERD, you mean the original "Chen" diagrams where the relationship was words written in a diamond, then you have a relationship between Booking and Invoice. It's a special kind of relationship that's NOT implemented with a simple foreign key; it's implemented via a complicated move and a constraint.
If, by ERD, you mean the diagrams that ERwin draws, then you don't have an easy way to do this. It tends to focus you on drawing PK-FK relationships. You have a non-PK-FK relationship between these things. Some kind of line with text is about all you can do.
Arrows, BTW, aren't appropriate because the ERD shows the "state" of the database. Data flowing around isn't part of an ERD. You do have a relationship, it's just not a typical PK-FK relationship. It's an atypical relationship based on rows existing in some places and not existing in others.
In the UML you can easily draw this as a "constraint" among the relationships.
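To pin down what the "complicated move and a constraint" amounts to, here is a minimal SQLite sketch in Python. The column names (`booking_no`, `invoice_no`, `customer`) are assumptions, since the real schema is not given; the point is only that the move happens in one transaction and that a booking number never exists in both tables at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE booking (
        booking_no INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_no INTEGER PRIMARY KEY,
        booking_no INTEGER NOT NULL UNIQUE,  -- kept as a reference, not an FK
        customer   TEXT NOT NULL
    );
    INSERT INTO booking (booking_no, customer) VALUES (101, 'Smith'), (102, 'Jones');
    """
)


def invoice_booking(conn, booking_no):
    """Move one booking into the invoice table as a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT customer FROM booking WHERE booking_no = ?", (booking_no,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no booking {booking_no}")
        conn.execute(
            "INSERT INTO invoice (booking_no, customer) VALUES (?, ?)",
            (booking_no, row[0]),
        )
        conn.execute("DELETE FROM booking WHERE booking_no = ?", (booking_no,))


def in_both_tables(conn):
    """The constraint: this should always return an empty list."""
    return conn.execute(
        "SELECT b.booking_no FROM booking b "
        "JOIN invoice i ON i.booking_no = b.booking_no"
    ).fetchall()


invoice_booking(conn, 101)
remaining = [r[0] for r in conn.execute("SELECT booking_no FROM booking")]
invoiced = [r[0] for r in conn.execute("SELECT booking_no FROM invoice")]
print(remaining, invoiced, in_both_tables(conn))
```

Because `invoice.booking_no` can never join to a live `booking` row, a foreign key cannot express the link, which is why the diagram needs a labelled line or constraint instead.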
I don't know what these people are talking about.
The Entity Relation Diagram doesn't describe the data fully; yes of course, it only shows Entities and Relations, it doesn't show Attributes. That's why it is called an ERD and not a Data Model. Evidently many people here can't tell the difference.
The Data Model is supposed to show as much as possible. But it depends on (a) the standard [if any] that you use and (b) the Notation. Some show more than others. IDEF1X is the only Relational modelling Standard (NIST 184 of 1993). It is the most complete, and shows intricacies and complexities that other notations do not show. Recently MS and others have come out with "simplified" notations; of course, much is lost in the "ERDs".
It is not "process flow", it is a relation in a database.
UML is completely inappropriate for modelling data, especially when there is at least one Standard plus several non-standard but commonly used data modelling notations. There is nothing that can be shown in UML that can't be shown in IDEF1X. But most developers here have never heard of it (developers should not be modelling unless they have acquired modelling skills, but that is another story).
This is perfectly legal; it may not be commonly known, but it is legal and named. It is a Supertype-Subtype relation, except that the Cardinality is 1::0-n instead of 1::0-1. The IDEF1X Notation (right) has a Subtype symbol. Note there is only one relation at the parent end, and one each at the child end. And of course the crows feet show the cardinality. These relations can be Exclusive or Non-exclusive; yours is Exclusive; that is what the X through the half-circle means.
ERwin is the only modelling (not diagramming) tool that implements IDEF1X, and thus has the full complement of the IDEF1X Notation.
Of course, the Standard, the modelling capability, are all in the mind, not in the tool. I draw Data Models that are IDEF1X-compliant using a simple drawing tool.
I find that some developers baulk at the Subtype symbol, so I show a simplified version (left) in my IDEF1X models; it is intended to convey the sense of exclusivity, while the retention of the single line at the parent end indicates it is a subtype.
Lott: Click here: Link to Data Model
Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.
Sounds like a process flow, not an entity relationship. If at the time the entry is added to invoice, and the entry is deleted from booking, then there is never a relationship between the two. There is never a situation where you can traverse that relationship because there is never a record in both places that can be related together.
ERDs don't describe the database fully. There are other things like process flow and use cases that detail other facets of the system.
This is kind of an analogy to UML for software. A class diagram doesn't show you all the different ways classes interact. One class might initialize locally and call functions of another class, but because there is not composition or inheritance that relates those two classes, then the class diagram doesn't show this relationship. Only when you fully document the system with all the various types of diagrams can you see all the facets of how it operates.

Resources