CoreData. What's the difference between indexes and indexed? - ios

I'm looking to speed up queries to my SQL-backed Core Data store (displaying records sorted by date). I know that indexing can help decrease query time, but what's the difference between:
Highlighting the entity that an attribute belongs to, then adding a comma-separated list of attributes to the Indexes field, as seen here:
Or highlighting the attribute, then checking the Indexed box, as seen here:

Adding a row with a single attribute to the Indexes list is equivalent to selecting Indexed for that attribute: It creates an index for the attribute to speed up searches in query statements.
The Indexes list is meant for compound indexes. Compound indexes are useful when you know that you will be searching for values of these attributes combined in the WHERE clause of a query:
SELECT * FROM customer WHERE surname = "Doe" AND firstname = "Joe";
This statement could make use of a compound index surname, firstname. That index would also be useful if you just search for surname, but not if you only search for firstname. Think of the index as if it were a phone book: It is sorted by surname first, then by first name. So the order of attributes is important.
In your case you should start with single indexes (that is, select Indexed for each attribute you want to search on). The compound index you showed could never be used if you just search for babyId, for example.
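The phone-book behaviour described above can be checked outside Core Data. The following is a minimal sketch using Python's built-in sqlite3 module, not Core Data itself; the table layout, the phone column, and the index name idx_surname_firstname are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the extra "phone" column keeps the index from covering every column.
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, surname TEXT, firstname TEXT, phone TEXT)"
)
# Compound index: sorted by surname first, then by firstname.
conn.execute("CREATE INDEX idx_surname_firstname ON customer (surname, firstname)")


def query_plan(sql):
    """Return the human-readable part of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_both = query_plan("SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe'")
plan_surname = query_plan("SELECT * FROM customer WHERE surname = 'Doe'")
plan_firstname = query_plan("SELECT * FROM customer WHERE firstname = 'Joe'")

print(plan_both)       # an index search on the compound index
print(plan_surname)    # still an index search, using the leading column only
print(plan_firstname)  # a full table scan: the index cannot help here
```

The exact wording of the plan text varies between SQLite versions, so it is worth reading the printed output rather than relying on a fixed string.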

At WWDC 2017, Apple changed this: indexes are now defined with a Fetch Index (see: https://developer.apple.com/videos/play/wwdc2017/210/?time=997).
To add one, select the entity and then go to Editor -> Add Fetch Index.

Related

Should I use one-column index if the column is already used in a multiple-column index?

I have a query that I optimize with a PostgreSQL index (called index1):
add_index(:deal, [:partner_id, :partner_status, :deal_user_connect_datetime])
I have another query inside the Admin panel that queries all the partners inside the Deal model.
Should I create another index, add_index(:deal, [:partner_id]), with only partner_id as the filter criterion?
Or should I assume that, since partner_id is already one of the columns used in index1, a second index would be redundant because the database already has an index on partner_id?
No, you shouldn't create another index. The reason is not that partner_id is already one of the columns used in index1, but that it is the leading column of that index, which is therefore just as capable of optimising predicates on partner_id as an index on partner_id alone.
In some circumstances it may be better, in fact. For example, if partner_id was nullable and you wanted to use a predicate such as "partner_id is null", then on many RDBMSs an index on partner_id would not be usable. However, if any one of the other columns was non-nullable then the optimiser would know that there is a value of partner_id in the index for every row in the table, and hence it could be used.
A multicolumn index is going to be larger than a single-column index, but in most cases that costs less than keeping both indexes, which would use even more disk space and need more memory to operate efficiently.
As usual, the answer is "it depends." The multicolumn index can be used since the first value is partner_id. However, if the other columns have many different values it might be more efficient to create a second index only for that single column.
This requires benchmarking and/or more information on the data in the database.
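One way to start that investigation is to ask the database which index it would pick. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it only illustrates the leading-column rule; PostgreSQL's planner also weighs table statistics, and there the equivalent check is EXPLAIN on the real data. The title column is a hypothetical addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE deal (
           id INTEGER PRIMARY KEY,
           partner_id INTEGER,
           partner_status TEXT,
           deal_user_connect_datetime TEXT,
           title TEXT  -- hypothetical extra column, not from the question
       )"""
)
# Same column order as index1 in the question.
conn.execute(
    "CREATE INDEX index1 ON deal (partner_id, partner_status, deal_user_connect_datetime)"
)


def plan_for(sql, params=()):
    """Return the detail strings of EXPLAIN QUERY PLAN for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Filtering on the leading column alone can still use the three-column index.
leading_only = plan_for("SELECT * FROM deal WHERE partner_id = ?", (42,))
print(leading_only)
```

If the plan names index1 for the partner_id query, a separate single-column index would mostly add write and storage cost.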

ActiveRecord Query method: group

I have read the Ruby docs on the query method "group", but I am having a hard time understanding how to use it.
Let's say I have a table called users, with the fields name, email, and gender.
I am able to type User.group(:name).count, which returns a hash of {name: count} key-value pairs.
Why does User.group(:name) not work?
Is there a way of grouping similar names, and accessing those records?
ex. User.group(:name).first or User.group(:name).each
It seems to me that I am thinking of using "group" incorrectly.
Why does User.group(:name) not work?
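When you use GROUP BY in SQL, every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function. User.group(:name) on its own selects every column of users while grouping only by name, and databases that enforce this rule (PostgreSQL, for example) raise an error.
In your first case the query was essentially SELECT COUNT(*), name FROM users GROUP BY name, which selects only an aggregate and the grouped column, and this is the reason it worked.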
As per your last sentence you need:
User.group(:name).select(:name).each do |record|
  # work with record
end
I don't know which database you are using, but here is the idea, from the PostgreSQL GROUP BY documentation:
GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.
Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.
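To make the distinction concrete, here is a minimal sketch in Python with the built-in sqlite3 module rather than ActiveRecord; the sample rows are invented. It shows the aggregate query behind group(:name).count and, separately, one way to get at the records behind each name by grouping them in application code. In Ruby, the in-memory counterpart would be Enumerable#group_by.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO users (name, email, gender) VALUES (?, ?, ?)",
    [
        ("Ann", "ann1@example.com", "f"),  # invented sample data
        ("Bob", "bob@example.com", "m"),
        ("Ann", "ann2@example.com", "f"),
    ],
)

# GROUP BY collapses each name into one row, so only the grouped column
# and aggregates are well defined in the SELECT list.
counts = dict(conn.execute("SELECT name, COUNT(*) FROM users GROUP BY name"))

# To reach the individual records, fetch them ungrouped and bucket them in code.
emails_by_name = defaultdict(list)
for name, email in conn.execute("SELECT name, email FROM users ORDER BY name, id"):
    emails_by_name[name].append(email)

print(counts)
print(dict(emails_by_name))
```

GROUP BY answers "how many per name"; it does not return the rows that make up each group.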

ActiveRecord grouping

I have this query in a project model:
report = self.reports.group(:key_id)
report.select('key_id, count(*) as count')
What do I need to add in order to get another column (level) from the reports table?
I tried adding my column to select, but that means I have to group by it as well, and I only want the unique records by key_id.
Thank you
If you want to include information about another field, then you have to include that field in the group expression or as part of an aggregate field. That's a fundamental aspect of SQL.
For example, if you want to count the number of occurrences of various values of level associated with each key_id then you can add a count(level) column. The aggregation field can get arbitrarily "fancy", such as counting up the number of occurrences of level within various bands as you've mentioned in your comment.
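As a sketch of what such aggregate columns might look like, the following runs plain SQL through Python's built-in sqlite3 module. The sample rows and the band threshold of 5 are assumptions for illustration; the same SELECT list could be passed as a string to select in ActiveRecord.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, key_id INTEGER, level INTEGER)")
conn.executemany(
    "INSERT INTO reports (key_id, level) VALUES (?, ?)",
    [(1, 2), (1, 7), (1, None), (2, 9)],  # invented sample data
)

rows = conn.execute(
    """
    SELECT key_id,
           COUNT(*)     AS total,        -- all rows for this key_id
           COUNT(level) AS with_level,   -- rows whose level is not NULL
           MAX(level)   AS max_level,    -- one representative value per group
           SUM(CASE WHEN level >= 5 THEN 1 ELSE 0 END) AS high_level  -- a "band"
    FROM reports
    GROUP BY key_id
    ORDER BY key_id
    """
).fetchall()

for row in rows:
    print(row)
```

Each extra column is either the grouped column or an aggregate, so the result stays at one row per key_id.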

Delete attribute / column from simpledb

I have a SimpleDB column 'Status' that I want to get rid of. How can I delete it? I don't see any intuitive way to do so.
Thanks
Since SimpleDB is a schema-less database each item may have different sets of attributes. In order to remove a particular attribute from all items you're going to need the itemNames for all items containing the attribute.
If you've decided to emulate a relational table in SimpleDB (by having one domain per 'table' and uniform attributes per item), you can retrieve all itemNames with a simple select query: select itemName() from domainX.
Once you've got the itemNames for the items which contain your unwanted attribute you'll need to call DeleteAttributes once for each item.
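The two steps above could be sketched roughly as follows. The function name is hypothetical, and the code assumes a client object shaped like boto3's SimpleDB ("sdb") client, with select and delete_attributes taking the keyword arguments shown; treat those names as assumptions to verify against your SDK's documentation.

```python
def delete_attribute_everywhere(client, domain, attribute):
    """Remove `attribute` from every item in `domain` that has it.

    `client` is assumed to expose select(...) and delete_attributes(...)
    in the style of boto3's SimpleDB client. Returns the affected item names.
    """
    # Step 1: collect the names of all items that carry the attribute.
    expression = "select itemName() from `{}` where `{}` is not null".format(domain, attribute)
    item_names = []
    kwargs = {"SelectExpression": expression}
    while True:
        page = client.select(**kwargs)
        item_names.extend(item["Name"] for item in page.get("Items", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # results are paginated

    # Step 2: one DeleteAttributes call per item, naming only the unwanted attribute.
    for name in item_names:
        client.delete_attributes(
            DomainName=domain,
            ItemName=name,
            Attributes=[{"Name": attribute}],
        )
    return item_names
```

Collecting the names before deleting keeps pagination independent of the deletions that follow.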

What are Indexes in the Xcode Core-Data data model inspector

In Xcode you can add "Indexes" for an entity in the data model inspector.
For the screenshot I hit "add" twice, so "comma,separated,properties" is just the default value.
What exactly are those indexes?
Do they have anything to do with indexed attributes? And if so, what is the difference between specifying the Indexes in this inspector and selecting "Indexed" for the individual attribute?
Optimizing Core Data searches and sorts
As the title says, indexing speeds up searching and sorting in your database. However, it slows down saving changes to the persistent store. It matters when you use NSPredicate and NSSortDescriptor objects in your fetch requests.
Let's say you have two entities: PBOUser and PBOLocation (many-to-many). You can see their properties in the image below:
Suppose the database holds 10,000 users and 50,000 locations. Now we need to find every user whose email starts with a. If we run such a query without an index, Core Data must check every user record (all 10,000).
But what if the attribute is indexed (in other words, kept sorted by email)? Then Core Data only checks the records starting with a. Once it reaches b it can stop searching, because in sorted order no later record can have an email starting with a.
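The idea can be illustrated with a sorted list and binary search. This is only a sketch of the principle in Python, not how Core Data or SQLite implement indexes internally; the sample emails and the helper name with_prefix are invented.

```python
from bisect import bisect_left

# A sorted list stands in for the index on the email attribute.
emails = sorted([
    "bob@example.com",
    "alice@example.com",
    "carol@example.com",
    "adam@example.com",
    "dave@example.com",
])


def with_prefix(sorted_values, prefix):
    """Return all values starting with a non-empty `prefix` via two binary searches."""
    start = bisect_left(sorted_values, prefix)
    # Smallest string that sorts after every string with this prefix,
    # e.g. "a" -> "b" (assumes the last character is not the maximum code point).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    end = bisect_left(sorted_values, upper)
    return sorted_values[start:end]


matches = with_prefix(emails, "a")
print(matches)
```

Only the slice between the two positions is touched, which is why the cost grows with the number of matches plus a logarithmic search rather than with the size of the table.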
How to enable indexing on a Core Data model from within Xcode:
or:
Hopefully they are equivalent:-)
But what if you wanted emails starting with a and names starting with b? You can do this by checking Indexed for the name property of the PBOUser entity, or:
This is how you can optimise your database:-)
Use the Indexes list to add compound indexes to the entity. A compound index is an index that spans multiple attributes or relationships. A compound index can make searching faster. The names of attributes and relationships in your data model are the most common indexes. You must use the SQLite store to use compound indexes.
