partition PostgreSQL table based on geometry column - geolocation

Here is the table with the geometry field:
Table "public.regions"
  Column   |         Type          |
-----------+-----------------------+-
 id        | integer               |
 parent_id | integer               |
 level     | integer               |
 name      | character varying(55) |
 location  | geometry              |
I have stored the geometry for all continents, countries, states and cities. Since it is a huge table, I need to partition it based on the top-level location (i.e. continent) to improve performance.
How can I partition my existing table based on geometry (continent)? Is it good enough to create inheritance tables named asia, europe, australia ... and insert rows into those tables based on queries with Contains? Will that improve the performance of my queries?
For example, I am trying to run queries like the following, where 11.562424 48.148679 is a point in Munich:
EXPLAIN ANALYZE SELECT id, name,level FROM regions WHERE
Contains((location),(GeomFromText('Point(11.562424 48.148679)')));
This takes around 500 ms with PostgreSQL on my computer, whereas the same query takes around 200 ms in Oracle.

Related

When working with QuestDB, are symbol columns good for performance for huge amounts of rows each?

When working with regular SQL databases, indexes are useful for fetching a few rows, but not so useful when you are fetching a large amount of data from a table. For example, imagine you have a table with stock valuations of 10 stocks over time:
|------+--------+-------+
| time | stock | value |
|------+--------+-------+
| ... | stock1 | ... |
| ... | stock2 | ... |
| ... | ... | ... |
|------+--------+-------+
As far as I can tell, indexing it by stock (even with an enum/int/foreign key) is usually not very useful in a database like Postgres if you want to get data over a large period of time. You end up with an index spanning a large part of the table, and it ends up being faster for the database to do a sequential scan, for example, to get the average value over the whole dataset for each stock:
SELECT stock, avg(value) FROM stock_values GROUP BY stock
Given that QuestDB is column-oriented, I would guess that having a separate column for each stock would result in better performance.
So, what schema is recommended in QuestDB for a situation like this? One column for each stock, or would a single symbol column holding the stock symbol be as good (or good enough), even if there are millions of rows for each symbol?
A column per stock is not easy to achieve in QuestDB. If you create a table like this:
|----------------------------------|
| time | stock1 | stock2 | stock3 |
|----------------------------------|
then you'll have to insert all values together in one row, or you end up with gaps:
|----------------------------------|
| time | stock1 | stock2 | stock3 |
|----------------------------------|
| t1 | 1.1 | | |
| t2 | | 3.45 | |
| t3 | | | 103.45 |
|----------------------------------|
Even when t1 == t2 == t3, doing the insert as 3 operations still results in 3 rows.
So symbols are a better choice here.
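To make the gap problem concrete, here is a minimal Ruby sketch that uses plain arrays as stand-ins for tables. It is not QuestDB code, and the helper names insert_wide and insert_narrow are made up for illustration. It compares three separate inserts into a column-per-stock layout with the same three inserts into a layout that has one symbol-like stock column.

```ruby
# Stand-ins for two table layouts; plain Ruby arrays, not a QuestDB API.
wide   = []  # one column per stock
narrow = []  # one "stock" column playing the role of a symbol column

# Hypothetical helper: every insert creates a full row, so columns that
# were not supplied stay empty (nil).
def insert_wide(table, time, values)
  table << { time: time, stock1: nil, stock2: nil, stock3: nil }.merge(values)
end

# Hypothetical helper: one row per (time, stock, value) observation.
def insert_narrow(table, time, stock, value)
  table << { time: time, stock: stock, value: value }
end

# Three separate inserts that share the same timestamp.
insert_wide(wide, "t1", { stock1: 1.1 })
insert_wide(wide, "t1", { stock2: 3.45 })
insert_wide(wide, "t1", { stock3: 103.45 })

insert_narrow(narrow, "t1", "stock1", 1.1)
insert_narrow(narrow, "t1", "stock2", 3.45)
insert_narrow(narrow, "t1", "stock3", 103.45)

# Count the empty cells in each layout.
wide_gaps   = wide.sum   { |row| row.values.count(&:nil?) }
narrow_gaps = narrow.sum { |row| row.values.count(&:nil?) }

puts "wide:   #{wide.size} rows, #{wide_gaps} empty cells"
puts "narrow: #{narrow.size} rows, #{narrow_gaps} empty cells"
```

Both layouts end up with three rows, but only the column-per-stock layout carries empty cells.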
A symbol column can be indexed or left non-indexed, and a non-indexed symbol can be the better option when the number of distinct values is low. Whether reading the full table or reading by index is faster is a matter of index selectivity, not of the data range. If the selectivity is high (e.g. the distinct symbol count is around 10k), fetching by index is faster than range scans.

Unexpected behaviour with FireDAC Master-Detail relationships

I am facing a problem with FireDAC Master-Detail relationships.
FireDAC has two modes for M/D relationships, Parameter-Based and Range-Based: http://docwiki.embarcadero.com/RADStudio/Berlin/en/Master-Detail_Relationship_(FireDAC)
The first one uses parameters on every query to retrieve the corresponding details after every scroll. The second one first loads all the data into the datasets and sets the fields that define the master-detail relationships (filtering the details after every scroll on the master).
You can combine both methods, giving you the advantages of both (queries returning limited records, reduced traffic with the database, offline mode, ...).
It works nicely and fast, except when one of the details is empty. This seems to be the reason (quoted from the documentation):
Combining Methods
To combine both methods, an application should use both Parameters and
Range-based setups and include fiDetails into FetchOptions.Cache. Then
FireDAC at first uses range-based M/D. And if a dataset is empty, then
FireDAC uses parameter-based M/D. The new queried records are appended
to the internal records storage.
Also, you can use the TFDDataSet.OnMasterSetValues event handler to override M/D behavior.
Suppose you have
Master BILLS
+---------+------------+
| Bill_Id | Date |
+---------+------------+
| 1 | 01/01/2017 |
+---------+------------+
Detail LINES
+---------+---------+------------+
| Bill_Id | Line_Id | Concept |
+---------+---------+------------+
| 1 | 1 | Television |
| 1 | 2 | Computer |
+---------+---------+------------+
Subdetail TAXES
+---------+---------+-----+--------+
| Bill_Id | Line_Id | Tax | Import |
+---------+---------+-----+--------+
| 1 | 1 | 14% | 74.25 |
| 1 | 1 | 7% | 36.12 |
+---------+---------+-----+--------+
I have these three FDQuery components with parameters:
qryBills.SQL.Text := 'select * from BILLS where Bill_Id = :Id';
qryLines.SQL.Text := 'select * from LINES where Bill_Id = :Id';
qryTaxes.SQL.Text := 'select * from TAXES where Bill_Id = :Id';
And the Master-Detail relationship is defined by range:
qryLines.MasterFields := 'Bill_Id';
qryTaxes.MasterFields := 'Bill_Id;Line_Id';
If all the details contain records, then everything is fine. But when a detail is empty (as in my example, where there are no Taxes for Line #2), scrolling to that empty detail re-launches its query (as the documentation says), which duplicates the records of the non-empty details.
I mean :
I open the three Datasets for the Bill_Id #1
Everything looks fine, I see the master record, the Line #1 and its two taxes
I move to the second line and it still looks fine, the taxes appear empty.
When I go back to the first line, now I see two times its two taxes.
If I go to the second line again, and return to the first one, now I will see three times its two taxes.
...
The problem is that every time I move to the second line, its subdetail is empty, so it relaunches the qryTaxes query, duplicating its entire content.
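The duplication can be reproduced with a toy model. The Ruby sketch below is not FireDAC code and makes no claim about FireDAC's internals. It only applies the rule quoted from the documentation (use the range first, and if the detail is empty, run the parameter query and append the result to the internal storage) to the TAXES data above. The class name TaxesDetail and the method names are hypothetical.

```ruby
# Toy model only: NOT FireDAC code. It mimics the documented rule
# "range-based first; if the detail is empty, run the parameter query
# and append the fetched records to the internal storage".

TAXES_TABLE = [
  { bill_id: 1, line_id: 1, tax: "14%", import: 74.25 },
  { bill_id: 1, line_id: 1, tax: "7%",  import: 36.12 }
].freeze

# Stand-in for qryTaxes: select * from TAXES where Bill_Id = :Id
def query_taxes(bill_id)
  TAXES_TABLE.select { |row| row[:bill_id] == bill_id }
end

# Hypothetical model of the detail dataset with its internal record storage.
class TaxesDetail
  def initialize(bill_id)
    @bill_id = bill_id
    @storage = query_taxes(bill_id) # records loaded when the dataset opens
  end

  # Returns the records visible after the master scrolls to line_id.
  def scroll_to(line_id)
    visible = range(line_id)
    if visible.empty?
      # Empty range: fall back to the parameter query and append everything
      # it returns, including records that are already in the storage.
      @storage.concat(query_taxes(@bill_id))
      visible = range(line_id)
    end
    visible
  end

  private

  # Range-based filtering on the detail key.
  def range(line_id)
    @storage.select { |row| row[:line_id] == line_id }
  end
end

detail = TaxesDetail.new(1)
# Scroll: line 1, line 2, line 1, line 2, line 1
counts = [1, 2, 1, 2, 1].map { |line_id| detail.scroll_to(line_id).size }
p counts
```

In this model the number of taxes shown for line 1 grows by two on every visit to the empty line 2, which matches the behaviour described in the steps above.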
It is not uncommon to have empty details. Do you know of a way to prevent the query from being re-launched when that happens? I can't find one.
Thank you.

Rails using Views instead of Tables

I need to create a Rails app that will show/utilize our current CRM system data. The thing is, I could just take Rails and use the current DB as the backend, but the table names and column names are the exact opposite of what Rails expects.
Table names:
+-------------+----------------+--------------+
| Resource | Expected table | Actual table |
+-------------+----------------+--------------+
| Invoice | invoices | Invoice |
| InvoiceItem | invoice_items | InvItem |
+-------------+----------------+--------------+
Column names:
+-------------+-----------------+---------------+
| Property | Expected column | Actual column |
+-------------+-----------------+---------------+
| ID | id | IniId |
| Invoice ID | invoice_id | IniInvId |
+-------------+-----------------+---------------+
I figured I could use Views to:
Normalize all table names
Normalize all column names
Make it possible to not use column aliases
Make it possible to use scaffolding
But there's a big but:
If I do it at the database level, Rails will probably not be able to build SQL properly
The app will probably be read-only, unless I skip Views, create a separate DB instead, and sync the two eventually
Those disadvantages are probably even worse when you compare them to just plain aliasing.
And so I ask: is Rails able to somehow transparently know that the column it calls id is actually named InvId in the database, and vice versa? I'm talking about complete abstraction. Simple aliases just don't cut it when using joins etc., as you still need to use the actual DB name.

Ruby on Rails: Join Tables Concept

So I have been out of the coding game for a while and recently decided to pick up Rails. I have a question about the concept of join tables in Rails. Specifically:
1) why are these join tables needed in the database?
2) Why can't I just JOIN two tables on the fly like we do in SQL?
A join table allows a clean linking of association between two independent tables. Join tables reduce data duplication while making it easy to find relationships in your data later on.
E.g. if you compare a table called users:
| id | name |
-----------------
| 1 | Sara |
| 2 | John |
| 3 | Anthony |
with a table called languages:
| id| title |
----------------
| 1 | English |
| 2 | French |
| 3 | German |
| 4 | Spanish |
You can see that both truly exist as separate concepts from one another. Neither is subordinate to the other the way a single user may have many orders, (where each order row might store a unique foreign_key representing the user_id of the user that made it).
When a language can have many users, and a user can have many languages -- we need a way to join them.
We can do that by creating a join table, such as user_languages, to store every link between a user and the language(s) that they may speak. Each row stores one pairing of a user and a language:
| id | user_id | language_id |
------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 4 |
| 4 | 2 | 1 |
| 5 | 3 | 1 |
With this data we can see that Sara (user_id: 1) is trilingual, while John (user_id: 2) and Anthony (user_id: 3) only speak English.
By creating a join table in-between both tables to store the linkage, we preserve our ability to make powerful queries in relation to data on other tables. For example, with a join table separating users and languages it would now be easy to find every User that speaks English or Spanish or both.
But where join tables get even more powerful is when you add new tables. If in the future we wanted to link languages to a new table called schools, we could simply create a new join table called school_languages. Even better, we can add this join table without needing to make any changes to the languages SQL table itself.
As Rails models, the data relationship between these tables would look like this:
User --> user_languages <-- Language --> school_languages <-- School
By default, every school and user would be linked to Language using the same language_id(s).
This is powerful because, with two join tables (user_languages & school_languages) now referencing the same unique language_id, it becomes easy to write queries about how the two relate. For example, we could find all schools that share the language(s) of a user, or find all users who speak the language(s) of a school. As our tables expand, we can ride the joins to find relations about pretty much anything in our data.
tl;dr: Join tables preserve relations between separate concepts, making it easy to make powerful relational queries as you add new tables.
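As a rough illustration of the lookup a join table enables, here is a minimal plain-Ruby sketch over the example data above. It uses arrays of hashes instead of real tables, and the helper name speakers_of is made up for this example. In an actual Rails app the same link would normally be declared with a has_many :through association rather than written by hand.

```ruby
# The three example tables from above, as plain arrays of hashes.
USERS = [
  { id: 1, name: "Sara" },
  { id: 2, name: "John" },
  { id: 3, name: "Anthony" }
].freeze

LANGUAGES = [
  { id: 1, title: "English" },
  { id: 2, title: "French" },
  { id: 3, title: "German" },
  { id: 4, title: "Spanish" }
].freeze

# The join table: one row per (user, language) pairing.
USER_LANGUAGES = [
  { id: 1, user_id: 1, language_id: 1 },
  { id: 2, user_id: 1, language_id: 2 },
  { id: 3, user_id: 1, language_id: 4 },
  { id: 4, user_id: 2, language_id: 1 },
  { id: 5, user_id: 3, language_id: 1 }
].freeze

# Hypothetical helper: names of users who speak at least one of the titles.
def speakers_of(titles)
  # Step 1: resolve language titles to ids.
  language_ids = LANGUAGES.select { |l| titles.include?(l[:title]) }
                          .map { |l| l[:id] }
  # Step 2: walk the join table to collect the matching user ids.
  user_ids = USER_LANGUAGES.select { |ul| language_ids.include?(ul[:language_id]) }
                           .map { |ul| ul[:user_id] }
                           .uniq
  # Step 3: resolve user ids back to names.
  USERS.select { |u| user_ids.include?(u[:id]) }.map { |u| u[:name] }
end

english_or_spanish = speakers_of(["English", "Spanish"])
french             = speakers_of(["French"])
german             = speakers_of(["German"])

p english_or_spanish
p french
p german
```

Neither the users data nor the languages data had to change to answer these questions; only the join table is consulted to connect them.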

rails user-defined custom columns

I am using Ruby on Rails 4 and MySQL. I have three types. One is Biology, one is Chemistry, and another is Physics. Each type has unique fields. So I created three tables in the database, each with unique column names. However, the unique column names may not be known beforehand. The user will need to create the column names associated with each type. I don't want to create a serialized hash, because that can become messy. I notice some other systems enable users to create user-defined columns named like column1, column2, etc.
How can I achieve these custom columns in Ruby on Rails and MySQL and still maintain all the ActiveRecord capabilities, e.g. validation, etc?
Well, you don't have many options; your best solution is to use a NoSQL database (at least for those classes).
Let's see how you can work around this using SQL. You can have a base Course model with a has_many :attributes association, in which an attribute is just a combination of a key and a value.
# attributes table
| id | key | value |
| 10 | "column1" | "value" |
| 11 | "column1" | "value" |
| 12 | "column1" | "value" |
It's going to be difficult to determine data types and to write queries covering multiple attributes at the same time.
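Both difficulties show up even in a small plain-Ruby sketch of the key/value layout. Everything here is hypothetical: the course_id column, the keys and values, and the helper names attributes_for and courses_matching are invented for illustration, and arrays of hashes stand in for the attributes table.

```ruby
# Hypothetical rows of a key/value attributes table. Every value is stored
# as a string because the table has a single "value" column.
ATTRIBUTE_ROWS = [
  { id: 10, course_id: 1, key: "column1", value: "42" },
  { id: 11, course_id: 1, key: "column2", value: "mitosis" },
  { id: 12, course_id: 2, key: "column1", value: "7" }
].freeze

# Hypothetical helper: collect one course's attributes into a hash.
def attributes_for(course_id)
  ATTRIBUTE_ROWS.select { |row| row[:course_id] == course_id }
                .map { |row| [row[:key], row[:value]] }
                .to_h
end

# Hypothetical helper: ids of courses that match ALL given key/value pairs.
# Each condition lives in a different row, so the rows must be grouped per
# course before they can be compared.
def courses_matching(conditions)
  ATTRIBUTE_ROWS.group_by { |row| row[:course_id] }
                .select do |_course_id, rows|
                  pairs = rows.map { |row| [row[:key], row[:value]] }.to_h
                  conditions.all? { |key, value| pairs[key] == value }
                end
                .keys
end

course1 = attributes_for(1)

# Data type problem: the number comes back as a string and needs a cast.
raw_value  = course1["column1"]
cast_value = Integer(raw_value)

# Multi-attribute problem: needs grouping instead of a simple row filter.
matches = courses_matching({ "column1" => "42", "column2" => "mitosis" })

p course1
p cast_value
p matches
```

The cast and the grouping step are the parts that a regular table with typed columns would have provided for free.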
