Can I add a compound index that includes the table's primary key when using Core Data with a SQLite backing store? - ios

I am working on an iOS 7 app that is using Core Data persisted with a SQLite store. I am trying to optimize for reading.
I've been following the advice in this blog post, which suggests using compound indices that will include columns used in both the where clause and the order by clause of slow queries. However, one of my queries is selecting with a where clause based on the Core Data managed primary key being in a subquery:
SELECT 0, t0.Z_PK, t0.Z_OPT, ... FROM ZMODEL t0
WHERE t0.Z_PK IN (SELECT * FROM _Z_intarray0)
ORDER BY t0.ZTITLE LIMIT 99999
I can't get SQLite to explain the query for me exactly since _Z_intarray0 is a temporary table (or is it?), but I can substitute some values in to (potentially) get some guidance from the database:
explain query plan SELECT 0, t0.Z_PK, t0.Z_OPT, ... FROM ZMODEL t0
WHERE t0.Z_PK IN (1,2,3)
ORDER BY t0.ZTITLE LIMIT 99999;
0|0|0|SEARCH TABLE ZMODEL AS t0 USING INTEGER PRIMARY KEY (rowid=?) (~3 rows)
0|0|0|EXECUTE LIST SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR ORDER BY
I would think that at the database level I could add an index on the primary key and the title column in order to not generate a temporary index for the order by clause; however, Core Data is managing the primary key and I cannot add a multicolumn index with it since it is not an object-level attribute.
Is there a way to speed up this query with an index or another solution?
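One way to experiment with this outside Core Data is to rebuild a small stand-in table and compare query plans. The sketch below uses Python's sqlite3 module. The table is cut down to the three columns visible in the question, the row data is invented, and the index name ZMODEL_ZTITLE_IDX is hypothetical. Adding an index by hand to a Core Data store is an assumption here, not a documented workflow. Because Z_PK is the rowid, the lookup itself is already direct, so the experiment only shows whether the planner still needs a temporary B-tree for the sort.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Stand-in for the Core Data table; only the columns named in the question.
conn.execute(
    "CREATE TABLE ZMODEL (Z_PK INTEGER PRIMARY KEY, Z_OPT INTEGER, ZTITLE VARCHAR)"
)
conn.executemany(
    "INSERT INTO ZMODEL (Z_OPT, ZTITLE) VALUES (1, ?)",
    [("title %05d" % i,) for i in range(1000)],  # invented sample data
)

sql = (
    "SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZTITLE FROM ZMODEL t0 "
    "WHERE t0.Z_PK IN (1,2,3) ORDER BY t0.ZTITLE LIMIT 99999"
)

plan_without_index = query_plan(conn, sql)

# Hypothetical index on the sort column; the planner is free to ignore it.
conn.execute("CREATE INDEX ZMODEL_ZTITLE_IDX ON ZMODEL (ZTITLE)")
plan_with_index = query_plan(conn, sql)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Which plan SQLite picks once the index exists depends on the SQLite version and on table statistics, so compare the two printed plans on the device's SQLite build instead of assuming a result.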

Related

Event-Time Temporal Table Join requires both primary key and row time attribute in versioned table, but no row time attribute can be found

I tried to use a lookup join, but I ran into this problem:
SELECT
e.isFired,
e.eventMrid,
e.createDateTime,
r.id AS eventReference_id,
r.type
FROM Event e
JOIN EventReference FOR SYSTEM_TIME AS OF e.createDateTime AS r
ON r.id = e.eventReference_id;
[ERROR] Could not execute SQL statement. Reason: org.apache.flink.table.api.ValidationException: Event-Time Temporal Table Join requires both primary key and row time attribute in versioned table, but no row time attribute can be found.
Whether that query will be interpreted by the Flink SQL planner as a temporal join or a lookup join depends on the type of the table on the right-hand side. In this case I guess you haven't used a lookup source. And your time attribute might not be defined correctly.
Temporal (time-versioned) joins require:
- an equality predicate on the primary key of the versioned table
- a time attribute
Lookup joins require:
- a lookup source connector (e.g., JDBC, HBase, Hive, or something custom)
- an equality join predicate
- a processing time attribute used in combination with FOR SYSTEM_TIME AS OF (to prevent needing to update the join results)

Rails add multiple columns after a specific column using sqlite3 [duplicate]

It seems that reordering columns in a SQLite3 table is not straightforward. At least, the SQLite Manager in Firefox does not support this feature. For example, I want to move column2 to the position of column3 and move column5 to the position of column2. Is there a way to reorder columns in a SQLite table, either with SQLite management software or with a script?
This isn't a trivial task in any DBMS. You would almost certainly have to create a new table with the column order that you want and move your data from one table to the other. There is no ALTER TABLE statement to reorder columns, so neither SQLite Manager nor any other tool will give you a way of doing this within the same table.
If you really want to change the order, you could do the following.
Assuming you have tableA:
create table tableA(
col1 int,
col3 int,
col2 int);
You could create a tableB with the columns sorted the way you want:
create table tableB(
col1 int,
col2 int,
col3 int);
Then move the data to tableB from tableA:
insert into tableB
SELECT col1,col2,col3
FROM tableA;
Then remove the original tableA and rename tableB to tableA:
DROP table tableA;
ALTER TABLE tableB RENAME TO tableA;
sqlfiddle demo
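The same steps can be tried end to end with Python's sqlite3 module. This is a minimal sketch under assumptions: the sample rows are invented, and wrapping the copy in a single transaction is a choice made here, not something the answer above prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Original table with col3 and col2 in the "wrong" order, plus sample rows.
conn.executescript("""
    CREATE TABLE tableA (col1 int, col3 int, col2 int);
    INSERT INTO tableA (col1, col3, col2) VALUES (1, 3, 2), (10, 30, 20);
""")

# Create the reordered table, copy the data, then swap the names.
# Naming the columns explicitly keeps each value in the right column.
conn.executescript("""
    BEGIN;
    CREATE TABLE tableB (col1 int, col2 int, col3 int);
    INSERT INTO tableB (col1, col2, col3)
        SELECT col1, col2, col3 FROM tableA;
    DROP TABLE tableA;
    ALTER TABLE tableB RENAME TO tableA;
    COMMIT;
""")

# PRAGMA table_info reports one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(tableA)")]
rows = conn.execute("SELECT * FROM tableA ORDER BY col1").fetchall()
print(columns)
print(rows)
```

Note that indexes and triggers defined on the original table are dropped with it and would have to be recreated on the new one.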
You can always order the columns however you want to in your SELECT statement, like this:
SELECT column1,column5,column2,column3,column4
FROM mytable
WHERE ...
You shouldn't need to "order" them in the table itself.
The order in sqlite3 does matter. Conceptually, it shouldn't, but try this experiment to prove that it does:
CREATE TABLE SomeItems (
identifier INTEGER PRIMARY KEY NOT NULL,
filename TEXT NOT NULL, path TEXT NOT NULL,
filesize INTEGER NOT NULL, thumbnail BLOB,
pickedStatus INTEGER NOT NULL,
deepScanStatus INTEGER NOT NULL,
basicScanStatus INTEGER NOT NULL,
frameQuanta INTEGER,
tcFlag INTEGER,
frameStart INTEGER,
creationTime INTEGER
);
Populate the table with about 20,000 records where thumbnail is a small jpeg. Then do a couple of queries like this:
time sqlite3 Catalog.db 'select count(*) from SomeItems where filesize = 2;'
time sqlite3 Catalog.db 'select count(*) from SomeItems where basicScanStatus = 2;'
It does not matter how many records are returned: on my machine, the first query takes about 0m0.008s and the second takes about 0m0.942s. That is a massive difference, and the cause is the Blob: filesize comes before the Blob and basicScanStatus comes after it.
We've now moved the Blob into its own table, and our app is happy.
You can reorder them using the SQLite Browser.

How does sqlite select the index when querying records?

Background 
I am an iOS developer, and our project uses Core Data, which stores its data on disk in a SQLite database. Several days ago one of our users reported that the interface was not smooth in some cases when using version 2.9.9 of our app. After some effort we found that the cause was poor efficiency when querying records from SQLite. After updating to the latest version, 3.0.6, the issue disappeared.
Analysis
(1) When querying records from SQLite, the SQL query is
SELECT * FROM ZAPIOBJECT WHERE ZAPIOBJECTID = "xxx" AND Z_ENT == 34
In version 2.9.9 of our app, the schema of table ZAPIOBJECT contains
CREATE INDEX ZAPIOBJECT_Z_ENT_INDEX ON ZAPIOBJECT (Z_ENT);
CREATE INDEX ZAPIOBJECT_ZAPIOBJECTID_INDEX ON ZAPIOBJECT (ZAPIOBJECTID);
and the query plan shows
0 0 0 SEARCH TABLE ZAPIOBJECT AS t0 USING INDEX ZAPIOBJECT_Z_ENT_INDEX (Z_ENT=?)
which uses the less efficient index on Z_ENT (about 4 s for 1 row).
(2) In version 3.0.6 of our app, the SQL query is the same:
SELECT * FROM ZAPIOBJECT WHERE ZAPIOBJECTID = "xxx" AND Z_ENT == 34
but the schema of table ZAPIOBJECT contains
CREATE INDEX ZAPIOBJECT_Z_ENT_INDEX ON ZAPIOBJECT (Z_ENT);
CREATE INDEX Z_APIObject_apiObjectID ON ZAPIOBJECT (ZAPIOBJECTID COLLATE BINARY ASC);
and the query plan shows
0 0 0 SEARCH TABLE ZAPIOBJECT AS t0 USING INDEX Z_APIObject_apiObjectID (ZAPIOBJECTID=?)
which uses the more efficient index on ZAPIOBJECTID (about 0.03 s for 1 row).
(3) The table ZAPIOBJECT holds about 130,000 records. The index on ZAPIOBJECTID, which has more than 90,000 distinct values, was created by us, while the index on Z_ENT, which has only 20 distinct values, was created by Core Data.
(4) Both versions of our app use the same SQLite version, 3.8.8.3.
Questions
(1) How does SQLite select an index when querying records? From the Query Planning document I learned that SQLite chooses the best algorithm by itself; however, in our case the choice of index makes an obvious difference in efficiency. Does the difference in how the ZAPIOBJECTID index is created in the two versions of our app cause SQLite to adopt a different index?
(2) It seems that users whose system version is lower than iOS 11 have this issue, so how can we solve the problem for them? Can we designate the ZAPIOBJECTID index through the Core Data API?
SQLite uses the index that results in the lowest number of estimated I/O operations.
The details of that estimation change in every version.
See the Checklist For Avoiding Or Fixing Query Planner Problems.
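The effect of table statistics on that estimate can be observed with a small experiment. The sketch below uses Python's sqlite3 module and makes several assumptions: the table is reduced to three columns, the data is invented (20 distinct Z_ENT values, unique ZAPIOBJECTID values), and the literals in the query are chosen to exist in that data. Whether and how ANALYZE can be run on a Core Data store is outside what this sketch shows.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Same two single-column indexes as in the 2.9.9 schema from the question.
conn.executescript("""
    CREATE TABLE ZAPIOBJECT (
        Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER, ZAPIOBJECTID VARCHAR);
    CREATE INDEX ZAPIOBJECT_Z_ENT_INDEX ON ZAPIOBJECT (Z_ENT);
    CREATE INDEX ZAPIOBJECT_ZAPIOBJECTID_INDEX ON ZAPIOBJECT (ZAPIOBJECTID);
""")
# Invented data: Z_ENT has 20 distinct values, ZAPIOBJECTID is unique per row.
conn.executemany(
    "INSERT INTO ZAPIOBJECT (Z_ENT, ZAPIOBJECTID) VALUES (?, ?)",
    [(i % 20, "id-%d" % i) for i in range(20000)],
)
conn.commit()

sql = "SELECT * FROM ZAPIOBJECT WHERE ZAPIOBJECTID = 'id-123' AND Z_ENT = 3"

# Without statistics the planner has to guess how selective each index is.
plan_before = query_plan(conn, sql)

# ANALYZE records per-index selectivity in sqlite_stat1 for the planner.
conn.execute("ANALYZE")
plan_after = query_plan(conn, sql)

print("before ANALYZE:", plan_before)
print("after ANALYZE: ", plan_after)
```

Running ANALYZE is the first item on the checklist mentioned above. The plan printed before ANALYZE may differ between SQLite versions, which is consistent with the remark that the estimation details change in every version.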

Hive join query returning Cartesian product on inner join

I am doing an inner join on two tables created with Hive. One is a big table, "trades_bucket", and the other is a small table, "counterparty_bucket". They are created as follows:
DROP TABLE IF EXISTS trades_bucket;
CREATE EXTERNAL TABLE trades_bucket(
parentId STRING,
BookId STRING) CLUSTERED BY(parentId) SORTED BY(parentId) INTO 32 BUCKETS;
DROP TABLE IF EXISTS counterparty_bucket;
CREATE EXTERNAL TABLE counterparty_bucket(
Version STRING,AccountId STRING,childId STRING)
CLUSTERED BY(childId ) SORTED BY(childId) INTO 32 BUCKETS;
The Join between the tables
SELECT /*+ MAPJOIN(counterparty_bucket) */ BookId , t.counterpartysdsid, c.sds
FROM counterparty_bucket c join trades_bucket t
on c.childId = t.parentId
where c.childId ='10001684'
The problem is that the join produces a Cartesian product of the two tables. That is, if the big table has 100 rows and the small table has 4 rows for a given id, I expect the join to return 100 rows, but I get back 400. Does anyone have a clue, or has anyone seen a similar situation?
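The row counts described here can be reproduced outside Hive, which suggests this may be ordinary inner-join behavior and not a Hive defect: every row on one side is paired with every matching row on the other. The sketch below uses Python's sqlite3 module as a stand-in for Hive, with invented data (100 trades and 4 counterparty rows sharing one id). The DISTINCT subquery is only one possible way to collapse the small side and is an assumption about what the intended result is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trades_bucket (parentId TEXT, BookId TEXT);
    CREATE TABLE counterparty_bucket (Version TEXT, AccountId TEXT, childId TEXT);
""")
# Invented data: 100 trades and 4 counterparty versions for the same id.
conn.executemany(
    "INSERT INTO trades_bucket VALUES ('10001684', ?)",
    [("book-%d" % i,) for i in range(100)],
)
conn.executemany(
    "INSERT INTO counterparty_bucket VALUES (?, 'acct', '10001684')",
    [(str(v),) for v in range(4)],
)

# Plain inner join: each trade pairs with each of the 4 matching rows.
joined = conn.execute("""
    SELECT count(*)
    FROM counterparty_bucket c JOIN trades_bucket t
      ON c.childId = t.parentId
    WHERE c.childId = '10001684'
""").fetchone()[0]

# Collapsing the small side to one row per id before joining.
deduped = conn.execute("""
    SELECT count(*)
    FROM (SELECT DISTINCT childId FROM counterparty_bucket) c
    JOIN trades_bucket t ON c.childId = t.parentId
    WHERE c.childId = '10001684'
""").fetchone()[0]

print(joined, deduped)
```

If columns such as Version are needed in the output, the small table would have to be reduced to one row per childId by some other rule, for example by picking a single version.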

How can I speed up or optimize this SQLite query for iOS?

I have a pretty simple DB structure. I have 12 columns in a single table, most are varchar(<50), with about 8500 rows.
When I perform the following query on an iPhone 4, I've been averaging 2.5-3 seconds for results:
SELECT * FROM names ORDER BY name COLLATE NOCASE ASC LIMIT 20
Doesn't seem like this sort of thing should be so slow. Interestingly, the same query from the same app running on a 2nd gen iPod is faster by about 1.5 seconds. That part is beyond me.
I have other queries that have the same issue:
SELECT * FROM names WHERE SEX = ?1 AND ORIGIN = ?2 ORDER BY name COLLATE NOCASE ASC LIMIT 20
and
SELECT * FROM names WHERE name LIKE ?3 AND SEX = ?1 AND ORIGIN = ?2 ORDER BY name COLLATE NOCASE ASC LIMIT 20
etc.
I've added an index on the SQLite db: CREATE INDEX names_idx ON names (name, origin, sex, meaning) where name, origin, sex and meaning are the columns I tend to query against with WHERE and LIKE operators.
Any thoughts on improving the performance of these searches or is this about as atomic as it gets?
The index CREATE INDEX names_idx ON names (name, origin, sex, meaning) can only help, I believe, when the query constrains its leading column (name), optionally followed by the next columns in index order. A query that filters only on origin and sex, without name, can't use it.
Going on your first query: SELECT * FROM names ORDER BY name COLLATE NOCASE ASC LIMIT 20 - I would suggest adding an index on name, just by itself, i.e. CREATE INDEX names_idx1 ON names (name). That should in theory speed up that query.
If you want other indexes with combined columns for other common queries, fair enough, and it may improve query speed, but remember it'll increase your database size.
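One detail worth checking when trying this is collation: SQLite can only use an index to satisfy ORDER BY name COLLATE NOCASE if the index orders name with that same collation. The sketch below uses Python's sqlite3 module to compare query plans before and after adding such an index. The table is reduced to four columns, the rows are invented, and the index name names_nocase_idx is hypothetical.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names ("
    "name VARCHAR(50), origin VARCHAR(50), sex VARCHAR(10), meaning VARCHAR(50))"
)
# Invented data, roughly the row count mentioned in the question.
conn.executemany(
    "INSERT INTO names VALUES (?, 'origin', 'F', 'meaning')",
    [("Name%04d" % i,) for i in range(8500)],
)

sql = "SELECT * FROM names ORDER BY name COLLATE NOCASE ASC LIMIT 20"

# No suitable index yet, so the rows have to be sorted at query time.
plan_before = query_plan(conn, sql)

# Index whose collation matches the ORDER BY clause.
conn.execute("CREATE INDEX names_nocase_idx ON names (name COLLATE NOCASE)")
plan_after = query_plan(conn, sql)

print("before:", plan_before)
print("after: ", plan_after)
```

Declaring the column itself as name VARCHAR(50) COLLATE NOCASE would be an alternative, since indexes on a column use its declared collation by default.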
What is your most-used search criterion? If you search by name, for example, you could create separate tables according to name initials: a table for names that start with "A", and so on. The same for genre. This would improve your search performance in some cases.

Resources