Optimize Postgres Query with Indexes for large amounts of data - ruby-on-rails

I've got a posts table with about 20 million rows in it. I'm trying to narrow down the posts for a paginated list using the following query:
SELECT "posts".* FROM "posts"
WHERE "posts"."source_id" IN (14790, 14787, 32928, 14796, 14791, 15503, 14789, 14772, 15506, 14794, 15543, 31615, 15507, 15508, 14800)
AND "posts"."deleted_at" IS NULL
ORDER BY external_created_at desc LIMIT 100 OFFSET 0;
(There are about 3.3 million rows that match the source_id in the query)
When I do so, it takes about 60s, and I get the following EXPLAIN ANALYZE (see on depesz):
EXPLAIN ANALYZE SELECT "posts".* FROM "posts" WHERE "posts"."source_id" IN (14790, 14787, 32928, 14796, 14791, 15503, 14789, 14772, 15506, 14794, 15543, 31615, 15507, 15508, 14800) AND "posts"."deleted_at" IS NULL ORDER BY external_created_at desc LIMIT 100 OFFSET 0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2530223.38..2530223.63 rows=100 width=1040) (actual time=66564.583..66564.616 rows=100 loops=1)
-> Sort (cost=2530223.38..2534981.19 rows=1903125 width=1040) (actual time=66564.571..66564.594 rows=100 loops=1)
Sort Key: external_created_at
Sort Method: top-N heapsort Memory: 89kB
-> Bitmap Heap Scan on posts (cost=35499.76..2457487.31 rows=1903125 width=1040) (actual time=279.640..64496.330 rows=1674072 loops=1)
Recheck Cond: ((source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[])) AND (deleted_at IS NULL))
Rows Removed by Index Recheck: 4640188
-> Bitmap Index Scan on index_on_posts_partial_source_id_with_order (cost=0.00..35023.98 rows=1903125 width=0) (actual time=275.922..275.922 rows=1674072 loops=1)
Index Cond: (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[]))
Total runtime: 66564.962 ms
(10 rows)
This is the index that it is using:
CREATE INDEX index_on_posts_partial_source_id_with_order ON posts USING btree (source_id) WHERE (deleted_at IS NULL);
It seems that the Recheck Cond is the slowest part of this query. Everything I've seen about recheck conditions involves upping the memory that Postgres uses because the data is "lossy", but I'm not seeing anything like that in my query plan.
Any recommendations as to how I can speed this up?
It seems like somehow getting rid of the Recheck, or somehow ordering by external_created_at will be my best bets.
Edit: I am using postgres version 9.3.4. Here is the posts table:
CREATE TABLE posts (
id integer NOT NULL,
source_id integer,
message text,
image text,
external_id text,
created_at timestamp without time zone,
updated_at timestamp without time zone,
external text,
like_count integer DEFAULT 0 NOT NULL,
comment_count integer DEFAULT 0 NOT NULL,
external_created_at timestamp without time zone,
deleted_at timestamp without time zone,
poster_name character varying(255),
poster_image text,
poster_url character varying(255),
poster_id text,
"position" integer,
location character varying(255),
description text,
video text,
rejected_at timestamp without time zone,
deleted_by character varying(255),
height integer,
width integer
);

Your query is returning a couple million rows for a paginated list. Think hard about the wisdom of returning data for that many pages. Also, think hard about whether you need all the columns. I doubt that you do.
I built a rough table and inserted about 10 million random(ish) rows into it. My query plan using PostgreSQL 9.4 is roughly similar to yours.
"Limit (cost=138609.10..138609.35 rows=100 width=24) (actual time=1410.012..1410.038 rows=100 loops=1)"
" -> Sort (cost=138609.10..140344.25 rows=694059 width=24) (actual time=1410.010..1410.026 rows=100 loops=1)"
" Sort Key: external_created_at"
" Sort Method: top-N heapsort Memory: 29kB"
" -> Bitmap Heap Scan on posts (cost=12217.47..112082.66 rows=694059 width=24) (actual time=374.393..919.687 rows=3000000 loops=1)"
" Recheck Cond: ((source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[])) AND (deleted_at IS NULL))"
" Heap Blocks: exact=16217"
" -> Bitmap Index Scan on index_on_posts_partial_source_id_with_order (cost=0.00..12043.95 rows=694059 width=0) (actual time=370.593..370.593 rows=3000000 loops=1)"
" Index Cond: (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[]))"
"Planning time: 0.264 ms"
"Execution time: 1410.097 ms"
Adding an index to external_created_at dropped the execution time by a factor of about 470. But I don't have the same distribution of values that you have.
create index on test.posts (external_created_at);
analyze test.posts;
explain analyze
select * from test.posts
where source_id in (14790, 14787, 32928, 14796, 14791, 15503, 14789, 14772, 15506, 14794, 15543, 31615, 15507, 15508, 14800)
and deleted_at is null
order by external_created_at desc limit 100 offset 0;
"Limit (cost=0.43..131.43 rows=100 width=24) (actual time=0.219..2.992 rows=100 loops=1)"
" -> Index Scan Backward using posts_external_created_at_idx on posts (cost=0.43..900991.48 rows=687808 width=24) (actual time=0.216..2.976 rows=100 loops=1)"
" Filter: ((deleted_at IS NULL) AND (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[])))"
" Rows Removed by Filter: 350"
"Planning time: 0.302 ms"
"Execution time: 3.024 ms"

Related

GreenPlum choosing a bad query plan in join query

Please forgive my poor English.
I have two tables in Greenplum (version: PostgreSQL 9.4.20, Greenplum Database 6.0.0-beta.3).
One table is cookie_session:
CREATE TABLE "ods_overall_cookie"."cookie_session" (
"site_cookie" varchar(80) COLLATE "pg_catalog"."default",
"createtime" timestamp(6),
"analyse_domain_cookie" varchar(30) COLLATE "pg_catalog"."default",
"id" int4 NOT NULL,
.... other fields....
)
DISTRIBUTED by(analyse_domain_cookie)
;
CREATE INDEX "index_cookie_session_id" ON "ods_overall_cookie"."cookie_session" USING btree (
"id" "pg_catalog"."int4_ops" ASC NULLS LAST
);
CREATE INDEX "index_analysis_domain_cookie_btree" ON "ods_overall_cookie"."cookie_session" USING btree (
"analyse_domain_cookie" COLLATE "pg_catalog"."default" "pg_catalog"."text_ops" ASC NULLS LAST
);
The other table is ta202202:
CREATE TABLE "ods_log"."ta202202" (
"id" serial8,
"uvcookie" varchar(50) COLLATE "pg_catalog"."default",
.... other fields ...
) distributed by (uvcookie)
;
CREATE INDEX "index_ta202202_id" ON "ods_log"."ta202202" USING btree (
"id" "pg_catalog"."int8_ops" ASC NULLS LAST
);
CREATE INDEX "indev_ta202202_uvcookie" ON "ods_log"."ta202202" USING btree (
"uvcookie" COLLATE "pg_catalog"."default" "pg_catalog"."text_ops" ASC NULLS LAST
);
Each of the two tables has about 100 million rows.
My query SQL is:
select o.id,g.site_cookie
from ods_log.ta202201 o
join ods_overall_cookie.cookie_session as g
on g.analyse_domain_cookie = o.uvcookie
WHERE o.ID BETWEEN 20000000 and 20000077;
This query returns in 0.14 seconds; the EXPLAIN ANALYZE result is:
Gather Motion 24:1 (slice1; segments: 24) (cost=0.00..434.40 rows=1 width=41) (actual time=1.785..4.098 rows=552 loops=1)
-> Nested Loop (cost=0.00..434.40 rows=1 width=41) (actual time=0.225..1.948 rows=276 loops=1)
Join Filter: true
-> Index Scan using index_ta202201_id on ta202201 (cost=0.00..6.02 rows=3 width=25) (actual time=0.100..0.142 rows=8 loops=1)
Index Cond: ((id >= 20000000) AND (id <= 20000077))
-> Index Scan using index_analysis_domain_cookie_btree on cookie_session (cost=0.00..428.38 rows=1 width=33) (actual time=0.013..0.213 rows=34 loops=8)
Index Cond: ((analyse_domain_cookie)::text = (ta202201.uvcookie)::text)
Planning time: 59.930 ms
(slice0) Executor memory: 216K bytes.
(slice1) Executor memory: 156K bytes avg x 24 workers, 156K bytes max (seg0).
(slice2)
Memory used: 128000kB
Optimizer: Pivotal Optimizer (GPORCA) version 3.39.0
Execution time: 26.725 ms
It uses a Nested Loop.
But when I widen the ID range in the WHERE condition, even by just 1, e.g. o.ID BETWEEN 20000000 and 20000078,
the time jumps to about 25 seconds, roughly 200 times slower:
Gather Motion 24:1 (slice1; segments: 24) (cost=0.00..437.02 rows=1 width=41) (actual time=10266.694..23884.316 rows=557 loops=1)
-> Hash Join (cost=0.00..437.02 rows=1 width=41) (actual time=12256.944..23881.566 rows=276 loops=1)
Hash Cond: ((ta202201.uvcookie)::text = (cookie_session.analyse_domain_cookie)::text)
Extra Text: (seg0) Initial batch 0:
(seg0) Wrote 874907K bytes to inner workfile.
(seg0) Wrote 1K bytes to outer workfile.
(seg0) Overflow batches 1..7:
(seg0) Read 1200209K bytes from inner workfile: 171459K avg x 7 nonempty batches, 335761K max.
(seg0) Wrote 766456K bytes to inner workfile: 127743K avg x 6 overflowing batches, 304587K max.
(seg0) Read 1K bytes from outer workfile: 1K avg x 4 nonempty batches, 1K max.
(seg0) Wrote 1K bytes to outer workfile.
(seg0) Secondary Overflow batches 8..32767:
(seg0) Read 2014970K bytes from inner workfile: 9201K avg x 219 nonempty batches, 258871K max.
(seg0) Wrote 1573816K bytes to inner workfile: 12107K avg x 130 overflowing batches, 247277K max.
(seg0) Read 1K bytes from outer workfile.
(seg0) Hash chain length 4.2 avg, 4645100 max, using 3735148 of 59506688 buckets. Skipped 32541 empty batches.
-> Index Scan using index_ta202201_id on ta202201 (cost=0.00..6.02 rows=4 width=25) (actual time=0.380..0.428 rows=8 loops=1)
Index Cond: ((id >= 20000000) AND (id <= 20000078))
-> Hash (cost=431.00..431.00 rows=1 width=51) (actual time=12253.540..12253.540 rows=15647864 loops=1)
-> Seq Scan on cookie_session (cost=0.00..431.00 rows=1 width=51) (actual time=0.058..5175.550 rows=15647865 loops=1)
Planning time: 62.416 ms
(slice0) Executor memory: 184K bytes.
* (slice1) Executor memory: 245659K bytes avg x 24 workers, 375566K bytes max (seg0). Work_mem: 290371K bytes max, 1149907K bytes wanted.
Memory used: 128000kB
Memory wanted: 1150306kB
Optimizer: Pivotal Optimizer (GPORCA) version 3.39.0
Execution time: 23927.425 ms
My query plan changed from Nested Loop to Hash Join; it seems Greenplum is choosing a bad query plan.
I continued adjusting the BETWEEN bounds, with these results:
from        to          plan          speed
20000000    20000077    Nested Loop   Fast
20000000    20000078    Hash Join     Slow
20000001    20000078    Nested Loop   Fast
20000001    20000079    Hash Join     Slow
20000002    20000079    Nested Loop   Fast
30000000    30000068    Nested Loop   Fast
30000000    30000069    Hash Join     Slow
30000001    30000069    Nested Loop   Fast
I tried:
set enable_nestloop = on;
set enable_hashjoin = off;
set enable_mergejoin = off;
and rewriting the query like:
select xxx from a, b where a.id between xxx and xxx and a.uvcookie = b.analyse_domain_cookie
and switching between left join / inner join / full join,
but nothing changed.
So: please tell me how I should adjust this. I want to page through the table in spans of 1000 on the ID field, and for that the nested loop plan is obviously much faster than the Hash Join, so I'd like Greenplum to keep using it.
Following kaiwen's answer, I executed:
SET optimizer_enable_hashjoin = OFF;
SET optimizer_enable_tablescan = OFF;
It works; the problem is finally solved.
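For reference, those settings can be scoped to a single session around the paged query, so the nested loop plan is only forced where it helps. A sketch assuming the tables above; the page bounds are purely illustrative:

-- Disable the GPORCA hash-join and table-scan alternatives for this session only.
SET optimizer_enable_hashjoin = off;
SET optimizer_enable_tablescan = off;

-- One 1000-ID page (example bounds).
SELECT o.id, g.site_cookie
FROM ods_log.ta202201 o
JOIN ods_overall_cookie.cookie_session AS g
  ON g.analyse_domain_cookie = o.uvcookie
WHERE o.id BETWEEN 20000000 AND 20000999;

-- Restore the defaults afterwards.
RESET optimizer_enable_hashjoin;
RESET optimizer_enable_tablescan;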

Multi-column indices ordering by date and created_at exhibit strange behavior for different queries

On postgres 10, I have a query like so, for a table with millions of rows, to grab the latest posts belonging to classrooms:
SELECT "posts".*
FROM "posts"
WHERE "posts"."school_id" = 1
AND "posts"."classroom_id" IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
ORDER BY date desc, created_at desc
LIMIT 30 OFFSET 30;
Assume that classrooms only belong to one school.
I have an index like so:
t.index ["date", "created_at", "school_id", "classroom_id"], name: "optimize_post_pagination"
When I run the query, it does an index scan backward like I'd hope and returns in 0.7 ms.
Limit (cost=127336.95..254673.34 rows=30 width=494) (actual time=0.189..0.242 rows=30 loops=1)
-> Index Scan Backward using optimize_post_pagination on posts (cost=0.56..1018691.68 rows=240 width=494) (actual time=0.103..0.236 rows=60 loops=1)
Index Cond: (school_id = 1)
" Filter: (classroom_id = ANY ('{10,11,...}'::integer[]))"
Planning time: 0.112 ms
Execution time: 0.260 ms
However, when I change the query to only include a couple classrooms:
SELECT "posts".*
FROM "posts"
WHERE "posts"."school_id" = 1
AND "posts"."classroom_id" IN (10, 11)
ORDER BY date desc, created_at desc
LIMIT 30 OFFSET 30;
It freaks out and does a lot of extra work, taking nearly 4 sec:
-> Sort (cost=933989.58..933989.68 rows=40 width=494) (actual time=3857.216..3857.219 rows=60 loops=1)
" Sort Key: date DESC, created_at DESC"
Sort Method: top-N heapsort Memory: 61kB
-> Bitmap Heap Scan on posts (cost=615054.27..933988.51 rows=40 width=494) (actual time=2700.871..3851.518 rows=18826 loops=1)
Recheck Cond: (school_id = 1)
" Filter: (classroom_id = ANY ('{10,11}'::integer[]))"
Rows Removed by Filter: 86099
Heap Blocks: exact=29256
-> Bitmap Index Scan on optimize_post_pagination (cost=0.00..615054.26 rows=105020 width=0) (actual time=2696.385..2696.385 rows=104925 loops=1)
Index Cond: (school_id = 485)
What's even stranger is that if I drop the WHERE clause for school_id, both cases for classrooms (with a few or with many) runs fast with the backwards index scan.
This index cookbook suggests putting the ORDER BY index columns last, like so:
t.index ["school_id", "classroom_id", "date", "created_at"], name: "activity_page_index"
But that makes my queries slower, even though the cost is much lower.
Limit (cost=993.93..994.00 rows=30 width=494) (actual time=208.443..208.452 rows=30 loops=1)
-> Sort (cost=993.85..994.45 rows=240 width=494) (actual time=208.436..208.443 rows=60 loops=1)
" Sort Key: date DESC, created_at DESC"
Sort Method: top-N heapsort Memory: 118kB
-> Index Scan using activity_page_index on posts (cost=0.56..985.56 rows=240 width=494) (actual time=0.032..178.147 rows=102403 loops=1)
" Index Cond: ((school_id = 1) AND (classroom_id = ANY ('{10,11,...}'::integer[])))"
Planning time: 0.132 ms
Execution time: 208.482 ms
Interestingly, with the activity_page_index query, it does not change its behavior when querying with fewer classrooms.
So, a few questions:
With the original query, why would the number of classrooms make such a massive difference?
Why does dropping the school_id WHERE clause make both cases run fast?
Why does dropping the school_id WHERE clause make both cases run fast, even though the index still includes school_id?
How can a high cost query finish quickly (65883 -> 0.7ms) and a lower cost query finish slower (994 -> 208ms)?
Other notes
It is necessary to order by both date and created_at, even though they may seem redundant.
Your first plan as shown seems impossible for your query as shown. The school_id = 1 criterion should show up either as an index condition, or as a filter condition, but you don't show it in either one.
With the original query, why would the number of classrooms make such a massive difference?
With the original plan, it is getting the rows in the desired order by walking the index. Then it gets to stop early once it accumulates 60 rows which meet the non-index criteria. So the more selective those other criteria are, the more of the index it needs to walk before it gets enough rows to stop early. Removing classrooms from the list makes it more selective, which makes that plan look less attractive. At some point, it crosses a line where it looks less attractive than something else.
Why does dropping the school_id WHERE clause make both cases run fast?
You said that every classroom belongs to only one school. But PostgreSQL does not know that, it thinks the two criteria are independent, so gets the overall estimated selectivity by multiplying the two separate estimates. This gives it a very misleading estimate of the overall selectivity, which makes the already-ordered index scan look worse than it really is. Not specifying the redundant school_id prevents it from making this bad assumption about the independence of the criteria. You could create multi-column statistics to try to overcome this, but in my hands this doesn't actually help you on this query until v13 (for reasons I don't understand).
This is about the estimation process, not the execution. So school_id being in the index or not doesn't matter.
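A rough sketch of the multi-column statistics mentioned above, using the table and columns from the question (the statistics object name here is made up); as noted, it may not actually change the plan before v13:

-- Tell the planner that classroom_id functionally depends on school_id
-- (PostgreSQL 10+).
CREATE STATISTICS posts_school_classroom_dep (dependencies)
  ON school_id, classroom_id
  FROM posts;
-- Extended statistics are only populated by ANALYZE.
ANALYZE posts;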
How can a high cost query finish quickly (65883 -> 0.7ms) and a lower cost query finish slower (994 -> 208ms)?
"It is difficult to make predictions, especially about the future." Cost estimates are predictions. Sometimes they don't work out very well.

Should Postgres COPY FROM be updating BRIN index?

Imagine a table like...
create table study_value (
id serial primary key,
study_id int not null references study (id),
category text not null,
subcategory int not null,
p_value double precision not null
);
I knew it would have 25+ million rows and they needed to be quickly queryable by the parent study as well as optionally by category and subcategory, so I chose to add a BRIN to it.
create index study_value_idx
on study_value using brin (study_id, category, subcategory);
All data for a given study (1mil+ rows) was inserted in bulk (ordered by category/subcategory) from a buffer via...
copy study_value from stdin with (format csv, header false);
This study data was uploaded sequentially in order of study id, so the insert orderings fully respected the BRIN column order.
The problem I'm seeing is that querying this table on conditions that the BRIN satisfies, eg. select count(*) from study_value where study_id = 3;, is performing a full scan and taking 30+ seconds. The size of the BRIN itself is 48 kb.
If I reindex index study_value_idx, however, queries now take ~100 ms and the index size is over 100 kb.
Everything I've read (in PG docs, on SO, etc.) indicates that one should only need to reindex in very specific situations (eg. data corruption or indexes failing to build).
I did not need to drop the index before loading data and re-create it afterward, because copying 1 million records into the table only took 10 seconds.
Am I doing something wrong? Is there a better way to do this?
Edit:
I forgot to mention that prior to running reindex, I ran analyze study_value and saw no change.
Yep, my mistake. I needed to VACUUM ANALYZE per #a_horse_with_no_name's comment.
I re-created the table and re-imported data. On fresh load, index size is again 48 kb and query is back to ~30 seconds. I had misread the query plan, though - it does use the index, the actual rows are just wildly different from expected.
Aggregate (cost=231550.86..231550.87 rows=1 width=8) (actual time=32233.141..32233.156 rows=1 loops=1)
-> Bitmap Heap Scan on study_value (cost=6226.26..229546.26 rows=801840 width=0) (actual time=6555.954..27253.035 rows=781580 loops=1)
Recheck Cond: (study_id = 920)
Rows Removed by Index Recheck: 22027434
Heap Blocks: lossy=213169
-> Bitmap Index Scan on study_value_idx (cost=0.00..6025.80 rows=801840 width=0) (actual time=16.345..16.352 rows=2132480 loops=1)
Index Cond: (study_id = 920)
Planning time: 0.941 ms
Execution time: 32233.266 ms
After analyze study_value (3 sec) the idx is still 48 kb and query plan is:
Aggregate (cost=231360.49..231360.50 rows=1 width=8) (actual time=25468.247..25468.259 rows=1 loops=1)
-> Bitmap Heap Scan on study_value (cost=6161.41..229376.81 rows=793472 width=0) (actual time=2740.866..20419.470 rows=781580 loops=1)
Recheck Cond: (study_id = 920)
Rows Removed by Index Recheck: 22027434
Heap Blocks: lossy=213169
-> Bitmap Index Scan on study_value_idx (cost=0.00..5963.04 rows=793472 width=0) (actual time=17.301..17.306 rows=2132480 loops=1)
Index Cond: (study_id = 920)
Planning time: 0.101 ms
Execution time: 25468.389 ms
After vacuum analyze study_value (20 sec) the idx is now 112kb and query plan is..
Aggregate (cost=231496.34..231496.35 rows=1 width=8) (actual time=10038.873..10038.884 rows=1 loops=1)
-> Bitmap Heap Scan on study_value (cost=6228.78..229501.25 rows=798037 width=0) (actual time=12.303..5133.281 rows=781580 loops=1)
Recheck Cond: (study_id = 920)
Rows Removed by Index Recheck: 17962
Heap Blocks: lossy=7473
-> Bitmap Index Scan on study_value_idx (cost=0.00..6029.27 rows=798037 width=0) (actual time=1.644..1.650 rows=75520 loops=1)
Index Cond: (study_id = 920)
Planning time: 0.511 ms
Execution time: 10038.993 ms
Executing a more detailed query (i.e. including category/subcategory) is much faster, maybe ~400 ms.
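A related tool worth knowing: brin_summarize_new_values() summarizes the block ranges added since the last summarization without running a full VACUUM, which fits the bulk-COPY pattern here. A minimal sketch, assuming the index name from above:

-- Summarize any not-yet-summarized block ranges of the BRIN index.
-- Returns the number of page ranges that were summarized.
SELECT brin_summarize_new_values('study_value_idx'::regclass);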

How to decrease query execution time on a db with 20 million records | Rails, Postgres

I have a Rails app with a Postgres DB. It has 20 million records. Most of the queries use ILIKE. I have created a trigram index on one of the columns.
Before adding the trigram index, the query execution time was ~200s to ~300s (seconds, not ms).
After creating the trigram index, the query execution time came down to ~30s.
How can I reduce the execution time to milliseconds?
Also are there any good practices/suggestions when dealing with a database this huge?
Thanks in advance :)
Ref : Faster PostgreSQL Searches with Trigrams
Edit: 'Explain Analyze' on one of the queries
EXPLAIN ANALYZE SELECT COUNT(*) FROM "listings" WHERE (categories ilike '%store%');
QUERY PLAN
--------------------------------------------------------------------------
Aggregate (cost=716850.70..716850.71 rows=1 width=0) (actual time=199354.861..199354.861 rows=1 loops=1)
-> Bitmap Heap Scan on listings (cost=3795.12..715827.76 rows=409177 width=0) (actual time=378.374..199005.008 rows=691941 loops=1)
Recheck Cond: ((categories)::text ~~* '%store%'::text)
Rows Removed by Index Recheck: 7302878
Heap Blocks: exact=33686 lossy=448936
-> Bitmap Index Scan on listings_on_categories_idx (cost=0.00..3692.82 rows=409177 width=0) (actual time=367.931..367.931 rows=692449 loops=1)
Index Cond: ((categories)::text ~~* '%store%'::text)
Planning time: 1.345 ms
Execution time: 199355.260 ms
(9 rows)
The index scan itself is fast (0.3 seconds), but the trigram index finds more than half a million potential matches. All of these rows have to be checked to see whether they actually match the pattern, which is where the time is spent.
For longer strings or strings with less common letters the performance should be considerably better. Is it a solution for you to impose a lower bound on the length of the search string?
Other than that, maybe the only solution is to use external text search software.
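The question does not show how the trigram index was declared, so purely as a hedged reference point, this is the usual GIN trigram setup for ILIKE '%...%' searches; the large number of lossy heap blocks in the plan also hints that work_mem is too small to keep the bitmap exact (index name and work_mem value below are illustrative):

-- pg_trgm provides the trigram operator classes.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- GIN variant of a trigram index on the filtered column.
CREATE INDEX listings_on_categories_trgm_idx
  ON listings USING gin (categories gin_trgm_ops);
-- More work_mem lets the bitmap stay exact instead of lossy, reducing recheck work
-- (set per session or per query; the value is only an example).
SET work_mem = '256MB';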

rails query Timeout::Error: execution expired

I have one simple query, but it's raising Timeout::Error: execution expired. I am also using Rack::Timeout.
SELECT SUM(total_checks) as totalcheck FROM "orders" WHERE
(orders.order_status_id NOT IN (15, 17)) AND (orders.check_id = 36) AND
(orders.pass_id = '49') AND
(orders.created_at BETWEEN '2016-02-29 22:00:00.000000' AND '2016-03-02 22:00:00.000000') LIMIT 1
Also, I have around 9,762,797 orders in total. Is there any issue with this query?
This is what I got when I ran EXPLAIN ANALYZE:
Limit (cost=153.76..153.77 rows=1 width=5) (actual time=14622.323..14622.324 rows=1 loops=1)
  -> Aggregate (cost=153.76..153.77 rows=1 width=5) (actual time=14622.322..14622.322 rows=1 loops=1)
        -> Index Scan using idx_orders_check_and_pass on orders (cost=0.43..153.76 rows=1 width=5) (actual time=2739.717..14621.649 rows=141 loops=1)
              Index Cond: ((check_id = 36) AND (pass_id = 49))
              Filter: ((order_status_id <> ALL ('{15,17}'::integer[])) AND (created_at >= '2016-02-29 22:00:00'::timestamp without time zone) AND (created_at <= '2016-03-02 22:00:00'::timestamp without time zone))
              Rows Removed by Filter: 42396
Total runtime: 14622.524 ms
(7 rows)
You have quite a big table to run SUM on. I would suggest using some caching mechanism to avoid this query, because 14 seconds is a lot.
For example, I would suggest creating a new table, total_orders_checks, and storing the total checks there. You would need to update it every time you update the orders table's total_checks value, and it might not suit your app design, but you'll definitely get total_checks out of it much faster.
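A hedged alternative to the caching table, guessing at the schema since the existing index definition isn't shown: extending the index to cover created_at moves the date range from the Filter, which currently discards ~42k rows per query, into the Index Cond. The index name below is made up:

-- Composite index so check_id, pass_id and the created_at range can all be
-- resolved inside the index scan.
CREATE INDEX idx_orders_check_pass_created_at
  ON orders (check_id, pass_id, created_at);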

Resources