Order by Nearest using PostGIS, RGeo, Spatial Adapter - ruby-on-rails

I'm asking this question because the answers I found in Order by nearest - PostGIS, GeoRuby, spatial_adapter weren't able to provide a solution. I'm trying to create a controller method that returns the n closest records to a certain record's lonlat. The record it queries against is from the same table. This wouldn't be much of a stretch if I were doing it entirely in SQL. That much is clear from the linked example, and below is a specific case where I obtained a result:
condos_development=# SELECT id, name FROM condos
condos_development-# ORDER BY ST_Distance(condos.lonlat, ST_PointFromText('POINT(-71.06 42.45)'))
condos_development-# LIMIT 5;
My problem is making this work with ActiveRecord. I'm using a method inspired by the response by @dc10, but I'm unable to create a working query through either the RGeo methods or direct SQL. Here's what I have so far:
def find_closest_condos(num, unit)
result = Condo.order('ST_Distance(condos.lonlat, ST_PointFromText("#{unit.lonlat.as_text)}")')
.limit(5)
end
The response from this attempt is as follows:
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR: syntax error
at or near "LIMIT" 10:29:50 rails.1 LINE 1: ...lonlat,
ST_PointFromText("#{unit.lonlat.as_text)}") LIMIT $1
Would someone be able to set me on the right track on how to put this query together so that it works in Rails?

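The problem is in how the string is built before ActiveRecord turns it into SQL. The outer single quotes stop Ruby from interpolating #{...}, so the literal text reaches PostgreSQL (you can see it in the error message). The inner double quotes are read by PostgreSQL as an identifier rather than a string literal. The closing parenthesis of ST_Distance is also missing, which is why the parser fails when it reaches LIMIT. If you change the query to this: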
Condo.order("ST_Distance(lonlat, ST_GeomFromText('#{unit.lonlat.as_text}', 4326))")
.limit(num)
You should find this works.
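The quoting rules behind that fix can be sketched in plain Ruby, without a database. The helper name nearest_order_sql is hypothetical, and the default SRID of 4326 is simply the value used in the query above; treat this as an illustration of the string handling, not as part of RGeo or ActiveRecord.

```ruby
# Hypothetical helper: builds the ORDER BY fragment from a WKT string.
# Double quotes on the Ruby side allow #{...} interpolation; single quotes
# on the SQL side mark a string literal.
def nearest_order_sql(wkt, srid = 4326)
  # Doubling a single quote is the standard SQL way to escape it.
  escaped = wkt.gsub("'", "''")
  "ST_Distance(lonlat, ST_GeomFromText('#{escaped}', #{Integer(srid)}))"
end

puts nearest_order_sql("POINT (-71.06 42.45)")
```

In the controller method this would be used roughly as Condo.order(nearest_order_sql(unit.lonlat.as_text)).limit(num). On newer Rails versions a raw SQL string passed to order may need to be wrapped in Arel.sql.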

Related

Combining distinct with another condition

I'm migrating a Rails 3.2 app to Rails 5.1 (not before time) and I've hit a problem with a where query.
The code that works on Rails 3.2 looks like this,
sales = SalesActivity.select('DISTINCT batch_id').where('salesperson_id = ?', sales_id)
sales.find_each(batch_size: 2000) do |batchToProcess|
.....
When I run this code under Rails 5.1, it raises the following error when it reaches the find_each:
ArgumentError (Primary key not included in the custom select clause):
I want to end up with an array(?) of unique batch_ids for the given salesperson_id that I can then traverse, as was working with Rails 3.2.
For reasons I don't understand, it looks like I might need to include the whole record to traverse through (my thinking being that I need to include the Primary key)?
I'm trying to rephrase the 'where', and have tried the following,
sales = SalesActivity.where(salesperson_id: sales_id).select(:batch_id).distinct
However, the combined ActiveRecordQuery applies the DISTINCT to both the salesperson_id AND the batch_id - that's #FAIL1
Also, because I'm still using a select (to let distinct know which column I want to be 'distinct') it also still only selects the batch_id column of course, which I am trying to avoid - that's #FAIL2
How can I efficiently pull all unique batch_ids for a given salesperson_id, so I can then iterate over them?
Thanks!
How about:
SalesActivity.where(salesperson_id: sales_id).pluck('DISTINCT batch_id')
pluck runs the query immediately and returns a plain array of the batch_ids, so it has to come last in the chain, after the where.
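To make the shape of that result concrete, here is a small in-memory model of the same filter-then-distinct step. The rows are made-up data standing in for the sales_activities table, and nothing here touches ActiveRecord; it only shows that the outcome is an array of values rather than records.

```ruby
# Made-up rows standing in for the sales_activities table.
rows = [
  { id: 1, salesperson_id: 7, batch_id: 100 },
  { id: 2, salesperson_id: 7, batch_id: 100 },
  { id: 3, salesperson_id: 7, batch_id: 101 },
  { id: 4, salesperson_id: 9, batch_id: 100 }
]
sales_id = 7

# Equivalent of WHERE salesperson_id = 7 followed by SELECT DISTINCT batch_id.
batch_ids = rows.select { |row| row[:salesperson_id] == sales_id }
                .map { |row| row[:batch_id] }
                .uniq

# The result is an Array of values, so iterate with each or each_slice;
# find_each is only defined on relations and needs the primary key.
batch_ids.each_slice(2000) do |slice|
  puts slice.inspect
end
```

Because the plucked values are a plain Array, the old find_each(batch_size: 2000) loop becomes an ordinary each or each_slice. On Rails 4 and later, where(...).distinct.pluck(:batch_id) should express the same query without a raw SQL string.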

column "pg_search_***" must appear in the GROUP BY clause or be used in an aggregate function

Tool.select('tools.id, tools.name').search('f')
the above query works fine but
Tool.select('tools.id, tools.name').group('tools.id').search('f')
produces the error
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column
"pg_search_3aaef8932e30f4464f664f.pg_search_469c73b9b63bebacc2607f"
must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...xt, '')), 'D') || to_tsvector('english',
coalesce(pg_search_...
I am using the pg_search gem (https://github.com/Casecommons/pg_search) for full text search.
I can't figure out the reason. I even tried adding
group("tools.id, tools.*, #{PgSearch::Configuration.alias('tools')}.rank")
as mentioned in the README, but I still get the same error.
What is the proper way to frame the query?
PG requires every selected field to either appear in the GROUP BY clause or be wrapped in an aggregate function (e.g. max, min), if you're grouping by anything, of course.
I'm guessing a pg_search gem migration created a field named pg_search_469c73b9b63bebacc2607f on a table named pg_search_3aaef8932e30f4464f664f and that is part of your query, but isn't part of the GROUP BY clause.
I can't see how #{PgSearch::Configuration.alias('tools')}.rank would translate to pg_search_3aaef8932e30f4464f664f.pg_search_469c73b9b63bebacc2607f
Try using group("tools.id, tools.name, pg_search_3aaef8932e30f4464f664f.pg_search_469c73b9b63bebacc2607f") and see if you get rid of that error. If it works, try to find a programmatic way of retrieving that pg_search_469c73b9b63bebacc2607f field name.
Given your example, I don't think grouping by tools.id is doing you any good, assuming id is unique. If you were joining with something else it could make sense, though.

tricky union query using ruby on rails/active record

I have
a = Profile.last
a.mailbox.inbox
a.mailbox.sentbox
active_conversations = [IDS OF ACTIVE CONVERSATIONS]
a.mailbox.inbox & active_conversations
returns part of what I need
I want
(a.mailbox.inbox & active_conversations) AND a.mailbox.sentbox
but I need it as SQL, so that I can order it efficiently. I want to order it by ('updated_at')
I have tried joins and other things but they don't work. The classes of a.mailbox.inbox and the sentbox are
ActiveRecord::Relation::ActiveRecord_Relation_Conversation
but
(a.mailbox.inbox & active_conversations)
is an array
edit
Something as simple as a.mailbox.inbox JOINS SOMEHOW a.mailbox.sentbox is something I should be able to work with, but I can't seem to figure that out either.
Instead of doing
(a.mailbox.inbox & active_conversations)
you should be able to do
a.mailbox.inbox.where('conversations.id IN (?)', active_conversations)
I believe the Conversation class (and its underlying conversations table) should be right according to the mailboxer code.
However, this gives you an ActiveRecord::Relation object instead of an array. You can turn it into plain SQL using to_sql. So I think something like this should work:
# get the SQL of both statements
inbox_sql = a.mailbox.inbox.where('conversations.id IN (?)', active_conversations).to_sql
sentbox_sql = a.mailbox.sentbox.to_sql
# use both statements in a UNION subquery aliased as the conversations table,
# then order the combined result by updated_at
Conversation.from("(#{inbox_sql} UNION #{sentbox_sql}) AS conversations").order('updated_at')
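The important detail is that the UNION has to be wrapped in parentheses and given an alias before it can serve as a FROM source, with any ordering applied outside it. A minimal sketch of that string assembly follows. The helper name union_subquery is hypothetical, and the two SELECT statements are simplified placeholders for whatever to_sql actually returns for the mailboxer scopes.

```ruby
# Hypothetical helper: wraps two SELECT statements in a parenthesized UNION
# and aliases the result so it can be used as a FROM source.
def union_subquery(left_sql, right_sql, table_alias)
  "(#{left_sql} UNION #{right_sql}) AS #{table_alias}"
end

# Placeholder statements; the real ones would come from calling to_sql.
inbox_sql   = "SELECT conversations.* FROM conversations WHERE conversations.id IN (1, 2)"
sentbox_sql = "SELECT conversations.* FROM conversations WHERE conversations.id IN (3)"

from_clause = union_subquery(inbox_sql, sentbox_sql, "conversations")
puts from_clause
```

Aliasing the subquery as conversations lets the model treat it like its own table, so a call such as Conversation.from(from_clause).order('updated_at') stays a relation and is sorted by the database.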

Ascending sort order Index versus descending sort order index when performing OrderBy

I am working on an asp.net mvc web application, and I am using Sql server 2008 R2 + Entity framework.
Now on the SQL Server I have added a unique index on any column that might be ordered by. For example, I have created a unique index on the Tag column and defined the sort order for the index to be ascending. Now I have some queries inside my application that order by Tag ascending while other queries order by Tag descending, as follows:
LatestTechnology = tms.Technologies.Where(a=> !a.IsDeleted && a.IsCompleted).OrderByDescending(a => a.Tag).Take(pagesize).ToList();
TechnologyList = tms.Technologies.Where(a=> !a.IsDeleted && a.IsCompleted).OrderBy (a => a.Tag).Take(pagesize).ToList();
So my question is whether both OrderByDescending(a => a.Tag) and OrderBy(a => a.Tag) can benefit from the ascending unique index on the Tag column in SQL Server, or whether I should define two unique indexes, one with ascending sort order and the other with descending sort order?
Thanks
EDIT
the following query :-
LatestTechnology = tms.Technologies.Where(a=> !a.IsDeleted && a.IsCompleted).OrderByDescending(a => a.Tag).Take(pagesize).ToList();
will generate the following sql statement as mentioned by the sql server profiler :-
SELECT TOP (15)
[Extent1].[TechnologyID] AS [TechnologyID],
[Extent1].[Tag] AS [Tag],
[Extent1].[IsDeleted] AS [IsDeleted],
[Extent1].[timestamp] AS [timestamp],
[Extent1].[TypeID] AS [TypeID],
[Extent1].[StartDate] AS [StartDate],
[Extent1].[IT360ID] AS [IT360ID],
[Extent1].[IsCompleted] AS [IsCompleted]
FROM [dbo].[Technology] AS [Extent1]
WHERE ([Extent1].[IsDeleted] <> cast(1 as bit)) AND ([Extent1].[IsCompleted] = 1)
ORDER BY [Extent1].[Tag] DESC
To answer your question:
So my question is whether both OrderByDescending(a => a.Tag) and
OrderBy(a => a.Tag) can benefit from the ascending unique index on the
Tag column in SQL Server?
Yes, SQL Server can read an index in both directions: in the order given in the index definition or in the exact opposite direction.
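As a rough mental model only (this is not how SQL Server is implemented), an ascending index can be pictured as an already sorted list that can be walked from either end. The tag values and page size below are made up for illustration.

```ruby
# A sorted array stands in for an ascending index on Tag (made-up values).
tag_index = %w[alpha bravo charlie delta echo foxtrot]
page_size = 3

# ORDER BY Tag ASC with TOP 3: walk the index from the front.
ascending_page = tag_index.first(page_size)

# ORDER BY Tag DESC with TOP 3: walk the same index from the back.
# No second, descending copy and no re-sort is needed.
descending_page = tag_index.reverse_each.first(page_size)

puts ascending_page.inspect
puts descending_page.inspect
```

Both pages come from the same sorted structure, which is why a single-column index does not need a separate descending twin.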
However, from your intro I suspect that you still have a wrong impression of how indexing works for order by. If you have both a where clause and an order by clause, you must make sure to have a single index that covers both clauses! It does not help to have one index for the where clause (like on isDeleted and isCompleted, whatever that is in your example) and another index on tag. You need a single multi-column index that first has the columns of the where clause, followed by the columns of the order by clause.
It can be tricky to make it work correctly, but it's worth the effort, especially if you are only fetching the first few rows (like in your example).
If it doesn't work out right away, please have a look at this:
http://use-the-index-luke.com/sql/sorting-grouping/indexed-order-by
It is generally best to show the actual SQL query—not the .NET source code—when asking for performance advice. Then I could tell you which index to create exactly. At the moment I'm unsure about isDeleted and isCompleted — are these table columns or expressions that evaluate upon other columns?
EDIT (after you added the SQL query)
There are two ways to make your query work as an indexed top-n query:
http://sqlfiddle.com/#!6/260fb/4
The first option is a regular index on the columns from the where clause followed by those from the order by clause. However, as your query uses the filter IsDeleted <> cast(1 as bit), it cannot use the index in an order-preserving way. If, however, you rephrase the query so that it reads IsDeleted = cast(0 as bit), then it works. Please look at the fiddle, I've prepared everything there. Yes, SQL Server could be smart enough to know that, but it seems like it isn't.
I don't know how to tweak EF to produce the query in the above described way, sorry.
However, there is a second option using a so-called filtered index, that is, an index that only contains a subset of the table rows. It's also in the SQL Fiddle. Here it is important that you add the where clause to the index definition in the very same way as it appears in your query.
Both options still work if you change DESC to ASC.
The important part is that the execution plan doesn't show a sort operation. You can also verify this in SQL Fiddle (click on 'View execution plan').

Return every nth row from database using ActiveRecord in rails

Ruby 1.9.2 / rails 3.1 / deploy onto heroku --> postgresql
Hi. Once the number of rows relating to an object goes over a certain amount, I wish to pull back every nth row instead. It's simply because the rows are used (in part) to display data for graphing, so once the number of rows returned goes above, say, 20, it's good to return every second one, and so forth.
This question seemed to point in the right direction:
ActiveRecord Find - Skipping Records or Getting Every Nth Record
Doing a mod on row number makes sense, but using basically:
@widgetstats = self.widgetstats.find(:all,:conditions => 'MOD(ROW_NUMBER(),3) = 0 ')
doesn't work, it returns an error:
PGError: ERROR: window function call requires an OVER clause
Any attempt to fix that, e.g. by basing my OVER clause syntax on the answer to this question:
Row numbering in PostgreSQL
ends in syntax errors and I can't get a result.
Am I missing a more obvious way of efficiently returning every nth task or if I'm on the right track any pointers on the way to go? Obviously returning all the data and fixing it in rails afterwards is possible, but terribly inefficient.
Thank you!
I think you are looking for a query like this one:
SELECT * FROM (SELECT widgetstats.*, row_number() OVER (ORDER BY id) AS rownum FROM widgetstats) stats WHERE mod(rownum,3) = 0
This is difficult to build using ActiveRecord, so you might be forced to do something like:
@widgetstats = self.widgetstats.find_by_sql(
%{
SELECT * FROM
(
SELECT widgetstats.*, row_number() OVER (ORDER BY id) AS rownum FROM widgetstats
) AS stats
WHERE mod(rownum,3) = 0
}
)
You'll obviously want to change the ordering used and add any WHERE clauses or other modifications to suit your needs.
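If the step size needs to vary with the number of rows, the SQL string can be generated rather than hard-coded. The sketch below is one way to do that. The helper name every_nth_sql is hypothetical, and it numbers the rows with row_number() OVER (ORDER BY ...) so that the numbering follows a defined order. Table and column names are interpolated as-is, so they must come from trusted code, never from user input.

```ruby
# Hypothetical helper: builds SQL that keeps every nth row of a table,
# numbering the rows in the order of the given column.
# table and order_column are interpolated unquoted: trusted values only.
def every_nth_sql(table, n, order_column = "id")
  step = Integer(n)
  raise ArgumentError, "n must be positive" unless step > 0

  "SELECT * FROM (" \
    "SELECT #{table}.*, row_number() OVER (ORDER BY #{order_column}) AS rownum " \
    "FROM #{table}) AS stats " \
    "WHERE mod(rownum, #{step}) = 0"
end

puts every_nth_sql("widgetstats", 3)
```

The resulting string can be handed to find_by_sql as in the snippet above, with any extra WHERE conditions added inside the inner SELECT.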
Were I to solve this, I would either just write the SQL myself, like the SQL in the answer you linked to. You can do this with
my_model.connection.execute('...')
or just get the id numbers and find by id
ids = (1..30).step(2).to_a
my_model.where(:id => ids)
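For reference, this is what the two selection rules produce on plain Ruby values. The numbers are made up and stand in for ids or row positions. Note that stepping through ids only matches every nth row when the ids are contiguous, which is an assumption that deleted rows would break.

```ruby
# Stepping through a range of ids: 1, 3, 5, ... up to 29.
ids = (1..30).step(2).to_a
puts ids.first(3).inspect

# Same rule as mod(rownum, 3) = 0, applied to positions in a list of rows
# (positions count from 1, like row_number()).
rows = (101..110).to_a # made-up values standing in for ordered rows
n = 3
every_nth = rows.each_with_index
                .select { |_row, index| (index + 1) % n == 0 }
                .map { |row, _index| row }
puts every_nth.inspect
```

The second form works on whatever order the rows arrive in, so it does not depend on the ids having no gaps.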
