Select records within range of dates in Postgres - ruby-on-rails

Given a table concerts with a column concerts.occurrence_dates of array type,
I need to select the concerts that occur between from_date and to_date using ActiveRecord.
I just figured out how to select events after from_date:
Concert.where(':from_date < ANY(occurrence_dates)', from_date: from_date)

As mentioned in the official documentation
Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
This also applies to your case, trying to perform a search using specific array elements.
You are using the wrong field. What you really need to use are two separate columns, one for the from_date and the other for the to_date, along with two indexes.
The indexes will ensure a fast lookup, and the separate columns will let you search by passing a range of dates or simply by using
Concert.where('from_date >= ? AND to_date <= ?', from_date, to_date)
With the current setup, your best solution is to scan all the records, read each record's occurrence_dates, and check whether any of them matches your criteria.
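The scan described above can be sketched in plain Ruby. This is only an illustration: Concert here is a hypothetical Struct standing in for the ActiveRecord model, and the dates are made up. In a real application the records would first have to be loaded from the database, which is exactly the cost the answer warns about.

```ruby
require 'date'

# Hypothetical stand-in for the ActiveRecord model; only the array column matters here.
Concert = Struct.new(:name, :occurrence_dates)

concerts = [
  Concert.new('Early',  [Date.new(2022, 1, 5), Date.new(2022, 1, 6)]),
  Concert.new('Inside', [Date.new(2022, 3, 1), Date.new(2022, 3, 10)]),
  Concert.new('Late',   [Date.new(2022, 6, 1)])
]

from_date = Date.new(2022, 2, 1)
to_date   = Date.new(2022, 4, 1)

# Keep a concert when at least one single occurrence lies inside the range.
in_range = concerts.select do |concert|
  concert.occurrence_dates.any? { |date| date.between?(from_date, to_date) }
end

puts in_range.map(&:name).inspect
```

Checking each date against both bounds at once matters: testing the two bounds separately could be satisfied by two different elements of the array.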

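Possible solution (assuming occurrence_dates is sorted in ascending order; note that PostgreSQL arrays are 1-based by default, so the first element is occurrence_dates[1]):
Concert.where('(occurrence_dates[1], occurrence_dates[array_length(occurrence_dates, 1)]) OVERLAPS (:from_date, :to_date)', from_date: from_date, to_date: to_date)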

Related

Rails: How can I use .to_sql on a .select() request

I have an ActiveRecord request:
Post.all.select { |p| Date.today < p.created_at.weeks_since(2) }
And I want to be able to see what SQL request this produces using .to_sql
The error I get is: NoMethodError: undefined method 'to_sql'
TIA!
ISSUE
There are two forms of select on ActiveRecord relations. From the docs:
select with a Block.
First: takes a block so it can be used just like Array#select.
This will build an array of objects from the database for the scope, converting them into an array and iterating through them using Array#select.
This is what you are using right now. This implementation loads every post, instantiates a Post object for each one, and then iterates over them with Array#select to filter the results into an Array. This is highly inefficient, cannot be chained with other ActiveRecord query methods (e.g. where, order, etc.), and will cause very long lags at scale. (This is also what is causing your error, because Array does not have a to_sql method.)
select with a list of columns (or a String if you prefer)
Second: Modifies the SELECT statement for the query so that only certain fields are retrieved...
This version is unnecessary in your case as you do not wish to limit the columns returned by the query to posts.
Suggested Resolution:
Instead what you are looking for is a WHERE clause to filter the records at the database level before returning them to the ORM.
Your current filter is (X < Y + 2)
Date.today < p.created_at.weeks_since(2)
which means Today's Date is less than Created At plus 2 Weeks.
We can invert this criterion to make it easier to query by rewriting it as: Today's Date minus 2 weeks is less than Created At. (X - 2 < Y)
Date.today.weeks_ago(2) < p.created_at
This is equivalent to p.created_at > Date.today.weeks_ago(2) which we can convert to a where clause using standard ActiveRecord query methods:
Post.where(created_at: Date.today.weeks_ago(2)...)
This will result in SQL like the following (Rails compiles an endless range to >=, so the boundary value itself is also matched):
SELECT
  posts.*
FROM
  posts
WHERE
  posts.created_at >= '2022-10-28'
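The inversion of the comparison can be checked in plain Ruby with the standard Date class. This is a small sketch, not Rails code: today is fixed to 2022-11-11, an assumed date chosen so that the cut-off matches the 2022-10-28 in the SQL above, and two weeks is written as 14 days instead of weeks_since / weeks_ago.

```ruby
require 'date'

today = Date.new(2022, 11, 11) # assumed "today" for the illustration
two_weeks = 14                 # days

# created_at values on both sides of the cut-off (today - 14 days = 2022-10-28).
samples = [Date.new(2022, 10, 27), Date.new(2022, 10, 28), Date.new(2022, 10, 29)]

# Original form: today < created_at + 2 weeks
original = samples.select { |created_at| today < created_at + two_weeks }

# Inverted form: today - 2 weeks < created_at
inverted = samples.select { |created_at| today - two_weeks < created_at }

puts original == inverted
puts original.inspect
```

Both forms select the same dates; only the inverted one keeps the column alone on one side, which is what lets the database do the filtering.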
Notes:
created_at is a timestamp, so it might be better to use Time.now (or the zone-aware Time.current in Rails) rather than Date.today.
Additional concerns may be involved from a time zone perspective since you will be performing date/time specific comparisons.
You need to call to_sql on a relation. select with a block executes the query and gives you the result as an Array, and an Array does not have a to_sql method.
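The difference can be seen without Rails at all. In the sketch below, FakeRelation is a hypothetical class invented for this illustration (it is not part of ActiveRecord); it only mimics the two relevant behaviours: it knows its SQL, and its block form of select returns a plain Array.

```ruby
# Hypothetical stand-in for a relation; not an ActiveRecord class.
class FakeRelation
  def initialize(rows)
    @rows = rows
  end

  # A relation can describe the query it would run.
  def to_sql
    'SELECT posts.* FROM posts'
  end

  # Like the block form of select: runs the block in Ruby and returns an Array.
  def select(&block)
    @rows.select(&block)
  end
end

relation = FakeRelation.new([1, 2, 3, 4])
filtered = relation.select(&:even?)

puts relation.respond_to?(:to_sql) # the relation knows its SQL
puts filtered.class                # the filtered result is a plain Array
puts filtered.respond_to?(:to_sql) # and an Array has no to_sql
```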
There are similar questions which you can look at as they offer some alternatives.

Speed up Active Record group by count query

How can I speed up the following query? I'm looking to find records with 6 or fewer unique values of fb_id. The select doesn't seem to add much in terms of time; instead it's the group and count. Is there an alternate way to query? I added an index on fb_id and it only sped up the query by 50%.
FbGroupApplication.group(:fb_id).where.not(
  fb_id: _get_exclude_fb_group_ids
).order(
  "count_fb_id desc"
).count(
  "fb_id"
).select { |k, v| v <= 6 }
The query is looking for FbGroupApplications that have 6 or fewer applications to the same fb_id.
Here count runs the grouped SQL query and returns a Ruby Hash of fb_id => count; the block passed to select then filters that whole Hash in Ruby. Every group has to be fetched from the database first, even the ones you are about to discard, which is costly.
You can "delegate" the responsibility of comparing the count vs 6 to the database with a having clause:
FbGroupApplication
.group(:fb_id)
.where.not(fb_id: _get_exclude_fb_group_ids)
.having('count(fb_id) <= 6')
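To make the semantics concrete, here is what GROUP BY plus HAVING computes, written in plain Ruby over a made-up list of fb_id values. It is only a model of the logic (and needs Ruby 2.7+ for tally); the point of the answer is that the database should do this work, not Ruby.

```ruby
# Hypothetical fb_id values, one per application row.
fb_ids = %w[a a a a a a a b b c]

# GROUP BY fb_id + COUNT: number of rows per fb_id.
counts = fb_ids.tally

# HAVING count(fb_id) <= 6: keep only the groups with at most 6 rows.
small_groups = counts.select { |_fb_id, count| count <= 6 }

puts small_groups.inspect
```

With HAVING, the group for "a" (7 rows) never leaves the database, whereas filtering in Ruby requires every group to be transferred first.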

Rails - Ruby enumerable select by date

In my Expense model I have a date attribute called payment_date. This is a Date format and not DateTime.
In one of my views I'm displaying this data in a few different formats, and I want to avoid multiple queries.
For example, right next to Expense.all I need to display expenses year to date. Rather than running two queries to pull essentially the same information, I thought I would try to pluck the YTD data from @expenses = Expense.all.
Right now I'm trying to use:
@expenses.select { |ex| ex.payment_date > Date.today.beginning_of_year }
but this is returning a blank array.
Is it possible to select results by date, and where am I messing up?
To include Jan 1 of this year in your YTD expenses, use >= instead of > in your select block.
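The effect of > versus >= at the year boundary can be sketched with plain Ruby dates. Expense below is a hypothetical Struct standing in for the model, the dates are invented, and Date.new(2023, 1, 1) plays the role of Date.today.beginning_of_year.

```ruby
require 'date'

# Hypothetical stand-in for the Expense model.
Expense = Struct.new(:label, :payment_date)

year_start = Date.new(2023, 1, 1) # stands in for Date.today.beginning_of_year

expenses = [
  Expense.new('last year', Date.new(2022, 12, 31)),
  Expense.new('new year',  Date.new(2023, 1, 1)),
  Expense.new('february',  Date.new(2023, 2, 14))
]

# Strict comparison drops the expense paid on Jan 1 itself.
strict = expenses.select { |ex| ex.payment_date > year_start }

# Inclusive comparison keeps it.
inclusive = expenses.select { |ex| ex.payment_date >= year_start }

puts strict.map(&:label).inspect
puts inclusive.map(&:label).inspect
```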
Since you tagged this with Rails, an even more performant way to query this is by using ActiveRecord/SQL.
If you have many records, doing @expenses = Expense.all and then using the Ruby enumerable select on that collection will load all of the expenses from the DB into memory. This could be quite slow, or could even cause out-of-memory errors!
You can do (assuming the DB is Postgres):
@ytd_expenses = Expense.where("payment_date >= ?", Date.today.beginning_of_year)
This will only return the results you care about from the DB.

Rails: select unique values from a column

I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because the database returns fewer rows, which should be slightly faster than returning every row and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I understand correctly:
Current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
Here uniq is applied to an array of objects, and each object is a distinct object. uniq is doing its job correctly, because every object in the array is unique.
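The behaviour described here is easy to reproduce with plain Ruby objects. Row below is a hypothetical class used only for this sketch; like any ordinary Ruby object, two instances are considered different by uniq even when their rating is the same. Whether a given Rails version treats select(...).uniq this way is a separate matter, as the answers in this thread note.

```ruby
# Hypothetical plain object standing in for a loaded model row.
class Row
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

rows = [Row.new(5), Row.new(5), Row.new(3)]

# uniq compares the objects themselves, so nothing is removed.
puts rows.uniq.size

# Extracting the values first lets uniq compare the ratings.
puts rows.map(&:rating).uniq.inspect
```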
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing. The first and second queries find distinct data in SQL.
Depending on the database and its collation (for example, MySQL collations with the PAD SPACE attribute ignore trailing spaces in comparisons), these queries may treat "london" and "london " as the same value, so 'london' appears only once in the result.
The third and fourth queries:
fetch the data with SQL and then apply Ruby's uniq method to get distinct values.
These queries treat "london" and "london " as different values, so both 'london' and 'london ' appear in the result.
Please refer to the attached image for more detail, and have a look at "Toured / Awaiting RFP".
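The Ruby side of that difference can be shown in a few lines. The city names are made-up sample data; rstrip is one possible way to normalise trailing spaces before deduplicating, if that is the behaviour you want.

```ruby
cities = ['london', 'london ', 'paris']

# Ruby compares strings exactly, so the trailing space makes a different value.
exact = cities.uniq

# Stripping trailing whitespace first merges the two spellings.
normalised = cities.map(&:rstrip).uniq

puts exact.inspect
puts normalised.inspect
```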
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
This works because you first generate an array of Model objects (smaller than usual, thanks to the select), and then extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date, applying offset and limit. Basically a combination of ORDER BY, DISTINCT ON
All you need to do is put DISTINCT ON inside the pluck method, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This returns an array of distinct names.
Model.pluck("DISTINCT column_name")

Problem with sorting by row ( special case )

I have a requirement to sort Contact records by primary_contact_no.
My Contact fields include primary_contact_no, email, and mobile_no.
This is a no-brainer...
BUT my view requires me to show mobile_no under Contact Number (the view label) when primary_contact_no is not present.
Contacts.find(:all, :order => "primary_contact_no")
Now when I sort by primary_contact_no, the records where that field is absent show mobile_no in the view instead, but since they were already sorted by primary_contact_no they appear at the bottom of the search results.
How can I combine the two fields (using mobile_no when primary_contact_no is not present) and sort on the combined value?
Is there any other solution where I can combine the fields for sorting, or something like that?
P.S.
I have used will_paginate.
You could sort the records once you have retrieved them from the database.
So
contacts = Contact.all.to_a
contacts.sort! { |a, b| a.con_number <=> b.con_number }
Then in your Contact model
def con_number
  primary_contact_no || mobile_no
end
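As a rough, Rails-free sketch of this approach, the example below uses a hypothetical Contact Struct with invented numbers in place of the model. It assumes missing values are nil (an empty string would not fall back, since "" is truthy in Ruby) and that every contact has at least one of the two numbers.

```ruby
# Hypothetical stand-in for the Contact model.
Contact = Struct.new(:primary_contact_no, :mobile_no) do
  # Number shown in the view: the primary number, or the mobile number as a fallback.
  def con_number
    primary_contact_no || mobile_no
  end
end

contacts = [
  Contact.new('300', '900'),
  Contact.new(nil,   '100'),
  Contact.new('200', nil)
]

# Sort on the displayed value, so fallback numbers are placed where they belong.
sorted = contacts.sort_by(&:con_number)

puts sorted.map(&:con_number).inspect
```

The contact without a primary number now sorts by its mobile number instead of ending up at the bottom of the list.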
MySQL and PostgreSQL both have a COALESCE function, so you can do something like:
Contacts.find(:all, :order => "COALESCE(primary_contact_no,mobile_no)")
to sort the records as you want. But beware: using SQL functions and raw SQL has its caveats. If you decide to switch databases, you have to check whether each piece of raw SQL and each SQL function you used like this is supported by your new RDBMS.
I would not sort the records in my application, because that means I cannot use will_paginate's pagination to select a limited set of rows; instead I have to retrieve the full set of records, sort them, and then pick the relevant records based on the pagination parameters. The response time will keep increasing as the contacts table grows.

Resources