Cannot add cell to table content: total cell count of 837114417 exceeded - psql

I tried to make an inner join query between two tables:
SELECT * FROM Tweets INNER JOIN Users ON (Tweets.user_0 = Users.id);
And this error message appeared:
Postgresql: Cannot add cell to table content: total cell count of 837114417 exceeded.
These tables (Tweets and Users) are very big, but is there a way to execute this query anyway?
When I ran the same query on tables with fewer rows, it worked properly.
Thanks a lot.

This error stems from your psql frontend, which failed while trying to format and display a huge number of rows and columns. The code is in src/fe_utils/print.c around line 3000.
The query was actually executed, but when the server sends the result back to psql, it is too much data to swallow. (psql needs to buffer the complete result set just to estimate the needed column widths.)
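A common workaround, assuming you need the rows themselves rather than psql's aligned output, is to make psql fetch with a cursor, or to write the result straight to a file:
-- Fetch and print in chunks of 10000 rows instead of buffering the whole result:
\set FETCH_COUNT 10000
SELECT * FROM Tweets INNER JOIN Users ON (Tweets.user_0 = Users.id);
-- Or skip the formatting entirely and export to a file:
\copy (SELECT * FROM Tweets INNER JOIN Users ON (Tweets.user_0 = Users.id)) TO 'result.csv' CSV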

DB2 DELETE query to return count

I have very minimal knowledge of writing dynamic queries. As part of some implementation I need to write a DB2 DELETE query that deletes rows and also returns the count of rows affected.
This query will go into a DB2 stored procedure, where I want to have this count as an OUT parameter.
I tried the following, which returns the count but doesn't delete the rows.
SELECT COUNT(STUDENT_ID) AS DELETE
FROM STUDENT
WHERE STUDENT_LOCATION = 'TNAGAR'
AND DATE(JOINING_DATE) < CURRENT DATE - 120 MONTHS;
This could be done with two individual queries, one SELECT and one DELETE, but I am looking for a single query.
If you are using SQL PL procedures in Db2, you can use the GET DIAGNOSTICS statement to return the number of rows affected by a previous insert/update/delete. See the documentation at this page.
Example:
DECLARE v_rows_affected INTEGER DEFAULT 0;
...
DELETE FROM ...;
GET DIAGNOSTICS v_rows_affected = ROW_COUNT;
If you are using a programming language other than SQL PL, with access to the SQLCA, then this information is also present in a part of the SQLCA (specifically SQLERRD(3)).
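For context, a minimal SQL PL sketch built around the question's DELETE could look like this (the procedure name and the OUT parameter name are placeholders, not anything mandated by Db2):
CREATE OR REPLACE PROCEDURE DELETE_OLD_STUDENTS (OUT P_DELETED_COUNT INTEGER)
LANGUAGE SQL
BEGIN
  DELETE FROM STUDENT
  WHERE STUDENT_LOCATION = 'TNAGAR'
    AND DATE(JOINING_DATE) < CURRENT DATE - 120 MONTHS;
  -- ROW_COUNT reports the rows affected by the immediately preceding statement
  GET DIAGNOSTICS P_DELETED_COUNT = ROW_COUNT;
END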

How to get row Count of the sqlite3_stmt *statement? [duplicate]

I want to get the number of selected rows as well as the selected data. At present I have to use two SQL statements:
one is
select * from XXX where XXX;
the other is
select count(*) from XXX where XXX;
Can it be realised with a single sql string?
I've checked the source code of sqlite3, and I found the function sqlite3_changes(). But that function is only useful after the database has been changed (after an insert, delete or update).
Can anyone help me with this problem? Thank you very much!
A plain SELECT can't return both the multi-row result and a separate single-row count at the same time. This is a common problem with huge result sets. Here are some tips on how to handle it:
Read the first N rows and tell the user "more than N rows available". Not very precise but often good enough. If you keep the cursor open, you can fetch more data when the user hits the bottom of the view (Google Reader does this)
Instead of selecting the data directly, first copy it into a temporary table. The INSERT statement will report the number of rows copied, and you can later use the temporary table to display the data. You can add a "row number" column to this temporary table to make paging simpler (see the sketch after this list).
Fetch the data in a background thread. This allows the user to use your application while the data grid or table fills with more data.
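A minimal SQLite sketch of the temporary-table tip (table and condition placeholders follow the question's "XXX" style; changes() is SQLite's SQL-level counterpart of sqlite3_changes()):
CREATE TEMP TABLE matches AS SELECT * FROM XXX LIMIT 0;  -- empty clone of the source table
INSERT INTO matches SELECT * FROM XXX WHERE XXX;         -- copy the matching rows once
SELECT changes();                                        -- number of rows just copied
SELECT * FROM matches LIMIT 100 OFFSET 0;                -- page through the copy by rowid
DROP TABLE matches;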
Try this way:
select (select count(*) from XXX) as count, *
from XXX;
Or, to get a running row number along with each row, just try this:
select (select count(*)
        from xxx t1
        where t1.b <= t2.b) as row_number, b
from xxx t2
order by b;
You could combine them into a single statement:
select count(*), * from XXX where XXX
or
select count(*) as MYCOUNT, * from XXX where XXX
(Note that because count(*) is an aggregate, SQLite collapses this to a single row, with the other column values taken from one arbitrary matching row; the correlated-subquery form above keeps all rows.)
To get the number of unique titles, pass the DISTINCT keyword to the COUNT function, as in the following statement:
SELECT COUNT(DISTINCT column_name)
FROM table_name;
Source: http://www.sqlitetutorial.net/sqlite-count-function/
For those still looking for another method, the most elegant one I found to get the total row count was to use a CTE.
This ensures that the count is only calculated once:
WITH cnt(total) AS (SELECT COUNT(*) FROM xxx) SELECT * FROM xxx, cnt;
The only drawback is that if a WHERE clause is needed, it has to be applied in both the main query and the CTE query.
In the first comment, Alttag said that there is no issue with running 2 queries. I don't agree, unless both are part of a single transaction. Otherwise, the source table can be altered between the 2 queries by an INSERT or DELETE from another thread/process, in which case the count value might be wrong.
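On SQLite 3.25 or newer (an assumption about your version), a window function sidesteps the duplicated-WHERE drawback, since the count is computed over the filtered result itself:
SELECT *, COUNT(*) OVER () AS total
FROM xxx
WHERE XXX;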
Once you already have the select * from XXX results, you can just find the array length in your program right?
If you use sqlite3_get_table instead of prepare/step/finalize, you will get all the results at once in an array (a "result table"), including the number and names of the columns and the number of rows. You should then free the result with sqlite3_free_table.
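A minimal sketch of that approach (db is assumed to be an already-open sqlite3 handle, and the query is a stand-in for your own):
#include <stdio.h>
#include <sqlite3.h>

char **result;   /* (nrow + 1) * ncol entries; the first ncol hold the column names */
int nrow, ncol;
char *errmsg = NULL;

int rc = sqlite3_get_table(db, "SELECT * FROM XXX",
                           &result, &nrow, &ncol, &errmsg);
if (rc == SQLITE_OK) {
    printf("%d rows, %d columns\n", nrow, ncol);  /* nrow is the row count you wanted */
    sqlite3_free_table(result);                   /* always free the result table */
} else {
    fprintf(stderr, "query failed: %s\n", errmsg);
    sqlite3_free(errmsg);
}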
int rows_count = 0;

/* First pass: count the rows */
while (sqlite3_step(stmt) == SQLITE_ROW)
{
    rows_count++;
}
// rows_count is now available for use

sqlite3_reset(stmt); // rewind the statement so it can be stepped again

/* Second pass: process the query results */
while (sqlite3_step(stmt) == SQLITE_ROW)
{
    // your per-row code goes here
}

iOS Sqlite Database - Retrieving records batch by batch

I am using an sqlite3 database in my project, and I can retrieve data from it using the following query: "select * from tablename".
But I want to fetch the records from the database a hundred at a time: as I scroll the UITableView, I want to load the next 100 records.
I have tried the following:
SELECT * FROM mytable ORDER BY record_date DESC LIMIT 100; - This retrieves only the first 100 records. When I scroll the table I want to fetch the next 100 records and show them.
Is it possible to do this?
Please guide me.
You could simply use the OFFSET clause, but this would still force the database to compute all the records that you're skipping over, so it becomes inefficient for larger tables.
What you should do instead is save the last record_date value of the previous page and continue from there:
SELECT *
FROM MyTable
WHERE record_date < ?
ORDER BY record_date DESC
LIMIT 100
See https://www.sqlite.org/cvstrac/wiki?p=ScrollingCursor for details.
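In the sqlite3 C API underneath your iOS code, the page boundary becomes a bound parameter. A sketch, assuming record_date is stored as text (use the bind call that matches the column's actual type):
#include <sqlite3.h>

/* db is an open handle; last_date holds the record_date of the last row shown */
sqlite3_stmt *stmt;
const char *sql = "SELECT * FROM MyTable WHERE record_date < ? "
                  "ORDER BY record_date DESC LIMIT 100";

if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) == SQLITE_OK) {
    sqlite3_bind_text(stmt, 1, last_date, -1, SQLITE_TRANSIENT);
    while (sqlite3_step(stmt) == SQLITE_ROW) {
        /* read this row's columns and append them to the table view's data */
    }
    sqlite3_finalize(stmt);
}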

How to efficiently search for last record matching a condition in Rails and PostgreSQL?

Suppose you want to find the last record entered into the database (highest ID) matching a string: Model.where(:name => 'Joe'). There are 100,000+ records. There are many matches (say thousands).
What is the most efficient way to do this? Does PostgreSQL need to find all the records, or can it just find the last one? Is this a particularly slow query?
Working in Rails 3.0.7, Ruby 1.9.2 and PostgreSQL 8.3.
The important part here is to have a matching index. You can try this small test setup:
Create schema x for testing:
-- DROP SCHEMA x CASCADE; -- to wipe it all for a retest or when done.
CREATE SCHEMA x;
CREATE TABLE x.tbl(id serial, name text);
Insert 10000 random rows:
INSERT INTO x.tbl(name) SELECT 'x' || generate_series(1,10000);
Insert another 10000 rows with repeating names:
INSERT INTO x.tbl(name) SELECT 'y' || generate_series(1,10000)%20;
Delete a random 10% to make it more realistic:
DELETE FROM x.tbl WHERE random() < 0.1;
ANALYZE x.tbl;
The query can look like this:
SELECT *
FROM x.tbl
WHERE name = 'y17'
ORDER BY id DESC
LIMIT 1;
--> Total runtime: 5.535 ms
CREATE INDEX tbl_name_idx on x.tbl(name);
--> Total runtime: 1.228 ms
DROP INDEX x.tbl_name_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id);
--> Total runtime: 0.053 ms
DROP INDEX x.tbl_name_id_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id DESC);
--> Total runtime: 0.048 ms
DROP INDEX x.tbl_name_id_idx;
CREATE INDEX tbl_name_idx on x.tbl(name);
CLUSTER x.tbl using tbl_name_idx;
--> Total runtime: 1.144 ms
DROP INDEX x.tbl_name_idx;
CREATE INDEX tbl_name_id_idx on x.tbl(name, id DESC);
CLUSTER x.tbl using tbl_name_id_idx;
--> Total runtime: 0.047 ms
Conclusion
With a fitting index, the query performs more than 100x faster.
Top performer is a multicolumn index with the filter column first and the sort column last.
Matching sort order in the index helps a little in this case.
Clustering helps with the simple index, because many rows still have to be read from the table, and after clustering these sit in adjacent blocks. It doesn't help with the multicolumn index in this case, because only one record has to be fetched from the table.
Read more about multicolumn indexes in the manual.
All of these effects grow with the size of the table. 10000 rows of two tiny columns is just a very small test case.
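To see which index the planner actually picks in each variant, you can prefix the test query with EXPLAIN ANALYZE, which prints the plan together with the measured runtime:
EXPLAIN ANALYZE
SELECT *
FROM x.tbl
WHERE name = 'y17'
ORDER BY id DESC
LIMIT 1;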
You can put the query together in Rails and the ORM will write the proper SQL:
Model.where(:name=>"Joe").order('created_at DESC').first
This should not retrieve all Model records, nor even cause a table scan, provided a fitting index exists. (Order by id DESC instead if "last" means the highest ID rather than the latest timestamp.)
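For reference, ActiveRecord turns that call into roughly the following SQL (the table name depends on your model):
SELECT * FROM models WHERE name = 'Joe' ORDER BY created_at DESC LIMIT 1;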
This is probably the easiest:
SELECT [columns] FROM [table] WHERE [criteria] ORDER BY [id column] DESC LIMIT 1
Note: Indexing is important here. A huge DB will be slow to search no matter how you do it if you're not indexing the right way.

using SQL aggregate functions with JOINs

I have two tables - tool_downloads and tool_configurations. I am trying to retrieve the most recent build date for each tool in my database. The layout of the DB is simple. One table called tool_downloads keeps track of when a tool is downloaded. Another table is called tool_configurations and stores the actual data about the tool. They are linked together by the tool_conf_id.
If I run the following query which omits dates, I get back 200 records.
SELECT DISTINCT a.tool_conf_id, b.tool_conf_id
FROM tool_downloads a
JOIN tool_configurations b
ON a.tool_conf_id = b.tool_conf_id
ORDER BY a.tool_conf_id
When I try to add in date information I get back hundreds of thousands of records! Here is the query that fails horribly.
SELECT DISTINCT a.tool_conf_id, max(a.configured_date) as config_date, b.configuration_name
FROM tool_downloads a
JOIN tool_configurations b
ON a.tool_conf_id = b.tool_conf_id
ORDER BY a.tool_conf_id
I know the problem has something to do with GROUP BYs/aggregate data and joins. I can't really search Google since I don't know the name of the problem I'm encountering. Any help would be appreciated.
The solution is to group by every non-aggregated column you select, so that MAX() is computed once per tool:
SELECT b.tool_conf_id, b.configuration_name, max(a.configured_date) as config_date
FROM tool_downloads a
JOIN tool_configurations b
ON a.tool_conf_id = b.tool_conf_id
GROUP BY b.tool_conf_id, b.configuration_name
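If you still want the output ordered by tool, as in the original queries, just append the ORDER BY after the GROUP BY:
SELECT b.tool_conf_id, b.configuration_name, MAX(a.configured_date) AS config_date
FROM tool_downloads a
JOIN tool_configurations b
ON a.tool_conf_id = b.tool_conf_id
GROUP BY b.tool_conf_id, b.configuration_name
ORDER BY b.tool_conf_id;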
