What are Indexes in the Xcode Core-Data data model inspector - ios

In Xcode you can add "Indexes" for an entity in the data model inspector.
For the screenshot I hit "add" twice, so "comma,separated,properties" is just the default value.
What exactly are those indexes?
Do they have anything to do with indexed attributes? And if so, what is the difference between specifying the Indexes in this inspector and selecting "Indexed" for the individual attribute?

Optimizing Core Data searches and sorts
As the title says, indexing speeds up searching and sorting your database. However, it slows down saving changes to the persistent store. It matters when you use NSPredicate and NSSortDescriptor objects in your fetch requests.
Let's say you have two entities: PBOUser and PBOLocation (many to many). You can see their properties in the image below:
Suppose the database holds 10,000 users and 50,000 locations. Now we need to find every user whose email starts with a. If we run such a query without indexing, Core Data must check every record (all 10,000 users).
But what if the email attribute is indexed (in other words, kept sorted by email in ascending order)? Then Core Data checks only those records that start with a. Once it reaches b it can stop searching, because the sorted order guarantees that no further records start with a.
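To make the "stop at b" idea concrete, here is a minimal Python sketch. It is not how Core Data or SQLite implement indexes internally (they use B-trees); it only models an index as an ascending-sorted list. The function name emails_with_prefix and the sample addresses are made up for illustration.

```python
import bisect

def emails_with_prefix(sorted_emails, prefix):
    """Return all entries starting with `prefix` from an ascending-sorted list."""
    # Binary search jumps straight to the first entry >= prefix,
    # so nothing that sorts before the prefix is ever examined.
    start = bisect.bisect_left(sorted_emails, prefix)
    matches = []
    for i in range(start, len(sorted_emails)):
        # Because the list is sorted, the first non-matching entry
        # means there can be no further matches: stop early.
        if not sorted_emails[i].startswith(prefix):
            break
        matches.append(sorted_emails[i])
    return matches

# Hypothetical sample data, sorted once up front (the "index").
emails = sorted([
    "bob@example.com",
    "alice@example.com",
    "adam@example.com",
    "carol@example.com",
])

indexed_result = emails_with_prefix(emails, "a")

# The unindexed equivalent has to look at every single record.
full_scan_result = [e for e in emails if e.startswith("a")]
```

Both approaches return the same rows; the difference is only in how many records have to be inspected, which is also why keeping the sorted structure up to date costs time on every save.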
How to enable indexing on a Core Data model from within Xcode:
or:
Hopefully they are equivalent:-)
But what if you want emails that start with a and names that start with b? You can do this by checking Indexed for the name property of the PBOUser entity, or:
This is how you can optimise your database:-)

Use the Indexes list to add compound indexes to the entity. A compound index is an index that spans multiple attributes or relationships. A compound index can make searching faster. The names of attributes and relationships in your data model are the most common indexes. You must use the SQLite store to use compound indexes.

Adding a row with a single attribute to the Indexes list is equivalent to selecting Indexed for that attribute: It creates an index for the attribute to speed up searches in query statements.
The Indexes list is meant for compound indexes. Compound indexes are useful when you know that you will be searching for values of these attributes combined in the WHERE clause of a query:
SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe';
This statement could make use of a compound index surname, firstname. That index would also be useful if you just search for surname, but not if you only search for firstname. Think of the index as if it were a phone book: It is sorted by surname first, then by first name. So the order of attributes is important.
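The phone-book rule can be checked directly against SQLite, which is the store Core Data uses here. The following is a rough sketch using Python's built-in sqlite3 module, not Core Data itself: the table layout, the extra city column and the index name idx_customer_name are assumptions for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The extra `city` column keeps the index from covering SELECT *,
# so the planner cannot answer queries from the index alone.
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, surname TEXT, firstname TEXT, city TEXT)"
)
# Compound index: sorted by surname first, then by firstname.
conn.execute("CREATE INDEX idx_customer_name ON customer (surname, firstname)")

def plan(sql):
    """Return SQLite's query plan for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

both = plan("SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe'")
surname_only = plan("SELECT * FROM customer WHERE surname = 'Doe'")
firstname_only = plan("SELECT * FROM customer WHERE firstname = 'Joe'")

for label, text in [("both", both),
                    ("surname only", surname_only),
                    ("firstname only", firstname_only)]:
    print(label, "->", text)
```

The first two plans should mention idx_customer_name, while the firstname-only query falls back to scanning the table.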

Related

Does normal indexing also work by creating unique index?

Hi guys, I would like to know: if I create a unique index on two columns in PostgreSQL, does it also work as a normal index for each of the two columns, or do I have to create one unique index plus two more indexes, one per column, as shown in the code below? I want a unique index on talent_id, job_id, and both columns should also be indexed separately. I have read many resources but did not find a clear answer.
add_index :talent_actions, [:talent_id, :job_id], unique: true
Does the code above also cover the indexes below, or do I have to add them separately?
add_index :talent_actions, :talent_id
add_index :talent_actions, :job_id
Thank you.
An index is an object in the database, which can be used to look up data faster, if the query planner decides it will be appropriate. So the trivial answer to your question is "no", creating one index will not result in the same structures in the database as creating three different indexes.
I think what you actually want to know is this:
Do I need all three indexes, or will the unique index already optimise all queries?
This, as with any database optimisation, depends on the queries you run, and the data you have.
Here are some considerations:
The order of columns in a multi-column index matters. If you have an index of people sorted by surname then first name, then you can use it to search for everybody with the same surname; but you probably can't use it to search for somebody when you only know their first name.
Data distribution matters. If everyone in your list has the surname "Smith" or "Jones", then you can use a surname-first index to search for a first name fairly easily (just look up under Jones, then under Smith).
Index size matters. The fewer columns an index has, the more of it fits in memory at once, so the faster it will be to use.
Often, there are multiple indexes the query planner could use, and its job is to estimate the cost of the above factors for the query you've written.
Usually, it doesn't hurt to create multiple indexes which you think might help, but it does use up disk space, and occasionally can cause the query planner to pick a worse plan. So the best approach is always to populate a database with some real data, and look at the query plans for some real queries.
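As a sketch of what "look at the query plans" can mean in practice, the snippet below uses SQLite (through Python's sqlite3 module) as a stand-in. That is an assumption made purely so the example is self-contained: PostgreSQL has its own planner and its own EXPLAIN output, and it can decide differently, so treat this as an illustration of the workflow rather than a statement about PostgreSQL. The note column and the index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE talent_actions ("
    "id INTEGER PRIMARY KEY, talent_id INTEGER, job_id INTEGER, note TEXT)"
)
# Counterpart of: add_index :talent_actions, [:talent_id, :job_id], unique: true
conn.execute(
    "CREATE UNIQUE INDEX idx_talent_job ON talent_actions (talent_id, job_id)"
)

def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Leading column of the unique index: the index can be used.
by_talent = plan("SELECT * FROM talent_actions WHERE talent_id = 1")

# Second column only: the unique index does not help here.
by_job_before = plan("SELECT * FROM talent_actions WHERE job_id = 1")

# Counterpart of: add_index :talent_actions, :job_id
conn.execute("CREATE INDEX idx_job ON talent_actions (job_id)")
by_job_after = plan("SELECT * FROM talent_actions WHERE job_id = 1")

print(by_talent)
print(by_job_before)
print(by_job_after)
```

In this stand-in, a separate index on talent_id would be redundant with the unique index, while the one on job_id changes the plan. Whether the same holds for your PostgreSQL database is exactly what EXPLAIN on real data would tell you.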

CoreData. What's the difference between indexes and indexed?

I'm looking to speed up queries to my SQL backed CoreData instance (displaying records sorted by date). I know that indexing can help decrease query time, but what's the difference between:
Highlighting the entity that an attribute belongs to, then adding a comma separated list of attributes into the indexes field as seen here:
Or highlighting the attribute, then checking the indexed box as seen here:
Adding a row with a single attribute to the Indexes list is equivalent to selecting Indexed for that attribute: It creates an index for the attribute to speed up searches in query statements.
The Indexes list is meant for compound indexes. Compound indexes are useful when you know that you will be searching for values of these attributes combined in the WHERE clause of a query:
SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe';
This statement could make use of a compound index surname, firstname. That index would also be useful if you just search for surname, but not if you only search for firstname. Think of the index as if it were a phone book: It is sorted by surname first, then by first name. So the order of attributes is important.
In your case you should go for the single indexes first (that is, select Indexed for the attributes you like to search for). The compound index you showed could never be used if you just search for babyId, for example.
At WWDC 2017, Apple updated this to instead be done by using a Fetch Index (see: https://developer.apple.com/videos/play/wwdc2017/210/?time=997).
To add it, select the entity and then go to Editor -> Add Fetch Index

Is it possible to restrict CoreData entries to a single entity attribute?

Is it possible to restrict a Core Data entry to a single attribute? For example, let's say I have this entity:
Entity
Attribute: name
and there are multiple Entity objects that can be added to the database via a one-to-many relationship. Can I restrict the data entries so that only Entity objects with different name attributes can be added? I don't want to query the database every time something is added, because that would cause a performance impact when the database gets larger.
Thanks!
No, you can't.
For now I would have the following ideas.
1 - If the attribute is a string, you should store it in a canonical form (plain text without accents, etc.). Then you can search with predicates like BEGINSWITH or ENDSWITH.
2 - You could add another attribute to the entity that you use as a hash value. The hash is generated when you insert a new object, and before each insert you check for an existing object with the same hash.
3 - Indexing the attribute to improve performances.
Core Data cannot check automatically for duplicate values, so you have to check first if an object with a given value exists before inserting a new one.
If you have to insert many objects, then it is more efficient to fetch all objects having values from the new list first, in a single request, instead of issuing many fetch requests.
This is described in "Implementing Find-or-Create Efficiently" in the "Core Data Programming Guide".
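The batching idea can be sketched outside Core Data as well. The following Python example works against SQLite directly and is only an illustration of the pattern, not the code from the guide: the entity table, the idx_entity_name index and the insert_missing helper are all hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT)")
# An index on name keeps the existence check cheap as the table grows.
conn.execute("CREATE INDEX idx_entity_name ON entity (name)")
conn.executemany("INSERT INTO entity (name) VALUES (?)", [("alpha",), ("beta",)])

def insert_missing(conn, names):
    """Insert only names that are not stored yet; return the inserted names."""
    unique_names = sorted(set(names))  # drop duplicates within the batch
    if not unique_names:
        return []
    # One query for the whole batch instead of one query per name.
    # Note: SQLite limits the number of bound parameters per statement,
    # so very large batches would need to be split into chunks.
    placeholders = ",".join("?" for _ in unique_names)
    rows = conn.execute(
        f"SELECT name FROM entity WHERE name IN ({placeholders})", unique_names
    )
    existing = {row[0] for row in rows}
    missing = [n for n in unique_names if n not in existing]
    conn.executemany("INSERT INTO entity (name) VALUES (?)", [(n,) for n in missing])
    return missing

inserted = insert_missing(conn, ["beta", "gamma", "delta", "gamma"])
all_names = [row[0] for row in conn.execute("SELECT name FROM entity ORDER BY name")]
```

In Core Data terms, the single SELECT corresponds to one fetch request with an IN predicate over the new values.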

Entity Framework and foreign key relationships producing slow sql performance

I have normalized a Country/region/city database into multiple tables. City has a foreign key to region which has a foreign key to country.
The CITY table includes 2 additional columns for finding the associated numerical IPAddress. As you can imagine the city table has over 4 million records (representing the cities in the world which maps back to a region and then a country).
CITY, REGION, COUNTRY are entities that I have mapped with Entity Framework power tools, that all have a name column (that represents a cityname, regionname, countryname, respectively), and a primary key IDENTITY column that is indexed.
Let's say I have a table / entity called VisitorHit that has the following columns:
id as int (primary key, identity)
dateVisited as datetime
FK_City as int (which has a many to one relationship to the CITY entity)
In code I use the VisitorHit entity like:
var specialVisitors = VisitorRepository.GetAllSpecialVisitors();
var distinctCountries = specialVisitors.Select(i => i.City.CityName).Distinct().ToArray();
Now, GetAllSpecialVisitors returns a subset of the actual visitors (and it works pretty fast). The typical subset contains approximately 10,000 rows. The Select Distinct statement takes minutes to return. Ultimately I need to further limit the distinctCountries by a date range (using the visitorhit.datevisited field) and return the count for each distinctCountry.
Any ideas on how I could speed up this operation?
Have you looked at SQL Profiler to see what SQL is being generated for this? My first guess (since you don't post the code for GetAllSpecialVisitors) would be that you are lazy loading the City rows, in which case you are going to be producing multiple calls to the database (one for each instance in specialVisitors) to get the city. You can eager load the city in the call to GetAllSpecialVisitors().
Use .Include("City") or .Include(v=>v.City)
e.g. Something like this:
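// The lambda overload of Include needs "using System.Data.Entity;".
var result = context.VisitorHits
    .Include(h => h.City)   // eager-load City in the same query
    .Where(h => /* predicates */);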
Like I said, you need to look at what the SQL Profiler is showing you to see what SQL is actually being sent to SQL Server. But when I have issues like this, lazy loading turns out to be the most common cause.
If you try writing the query yourself in the SSMS and it works well then another solution may be to write a view and query on the view. That is something else I've done on occasion when Entity Framework produces unwieldy queries that don't work efficiently.
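To illustrate why lazy loading hurts and what the hand-written query (or view) might look like, here is a small sketch in Python against SQLite rather than Entity Framework and SQL Server. The snake_case table and column names, the sample rows and the date range are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE visitor_hit (
    id INTEGER PRIMARY KEY,
    date_visited TEXT,
    fk_city INTEGER REFERENCES city(id)
);
INSERT INTO city VALUES (1, 'Oslo'), (2, 'Lima');
INSERT INTO visitor_hit VALUES
    (1, '2013-01-01', 1), (2, '2013-01-02', 1), (3, '2013-01-03', 2);
""")

# Lazy-loading pattern: one query for the hits, then one more per hit.
hits = conn.execute("SELECT fk_city FROM visitor_hit").fetchall()
lazy_queries = 1
lazy_names = set()
for (city_id,) in hits:
    row = conn.execute(
        "SELECT city_name FROM city WHERE id = ?", (city_id,)
    ).fetchone()
    lazy_queries += 1
    lazy_names.add(row[0])

# Same result in a single round trip by letting the database join.
joined_names = {
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT c.city_name "
        "FROM visitor_hit h JOIN city c ON c.id = h.fk_city"
    )
}

# The date-range count per name can also be pushed into one query.
date_counts = conn.execute(
    "SELECT c.city_name, COUNT(*) "
    "FROM visitor_hit h JOIN city c ON c.id = h.fk_city "
    "WHERE h.date_visited BETWEEN ? AND ? "
    "GROUP BY c.city_name ORDER BY c.city_name",
    ("2013-01-01", "2013-01-02"),
).fetchall()
```

With three hits the lazy pattern already issues four queries; with 10,000 rows it would issue 10,001, which is the kind of traffic SQL Profiler makes visible.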

rails query serializable object

How do I query the database to find objects that contain one or more attributes that are stored as serializable?
For example, I have a concert which occurs only in certain cities. I want to make a Concert object with a column called cities and store an array of cities.
If I want to query my database to find all concerts that occur in 1 city (or all concerts that occur in an array of n cities), how do I do this?
The best way to do this isn't to store it in a serialized column, but a separate table called Cities. Then you can do this:
City.find_by_name('Cityname').concerts
One possible way to query would be to use SQL's LIKE condition. This would work for boolean options stored in serialized columns.
For example to find those Users with the 'notification' option on,
users=User.arel_table
User.where(users[:options].matches("%notification: true%"))
For other types of values this would not be as feasible.
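For comparison, here is roughly what the separate-table approach from the first answer looks like at the SQL level, sketched in Python with the built-in sqlite3 module instead of Rails. The table names, the concert_cities join table and the concerts_in helper are assumptions for illustration; in Rails the same shape would come from associations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concerts (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE concert_cities (
    concert_id INTEGER REFERENCES concerts(id),
    city_id INTEGER REFERENCES cities(id),
    PRIMARY KEY (concert_id, city_id)
);
INSERT INTO concerts VALUES (1, 'Spring Tour'), (2, 'Winter Gala');
INSERT INTO cities VALUES (1, 'Berlin'), (2, 'Paris');
INSERT INTO concert_cities VALUES (1, 1), (1, 2), (2, 2);
""")

def concerts_in(city_names):
    """Return titles of concerts that take place in any of the given cities."""
    placeholders = ",".join("?" for _ in city_names)
    sql = (
        "SELECT DISTINCT co.title FROM concerts co "
        "JOIN concert_cities cc ON cc.concert_id = co.id "
        "JOIN cities ci ON ci.id = cc.city_id "
        f"WHERE ci.name IN ({placeholders}) "
        "ORDER BY co.title"
    )
    return [row[0] for row in conn.execute(sql, list(city_names))]

berlin = concerts_in(["Berlin"])
either = concerts_in(["Berlin", "Paris"])
```

Because the city names live in their own indexed column, the lookup is an exact match rather than a LIKE pattern over serialized text, and it works for one city or for a list of n cities alike.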
