How do I combine results from two queries on the same model? - ruby-on-rails

I need to return exactly ten records for use in a view. I have a highly restrictive query I'd like to use, but I want a less restrictive query in place to fill in the results in case the first query doesn't yield ten results.
Just playing around for a few minutes, and this is what I came up with, but it doesn't work. I think it doesn't work because merge is meant for combining queries on different models, but I could be wrong.
class Article < ActiveRecord::Base
...
def self.listed_articles
Article.published.order('created_at DESC').limit(25).where('listed = ?', true)
end
def self.rescue_articles
Article.published.order('created_at DESC').where('listed != ?', true).limit(10)
end
def self.current
Article.rescue_articles.merge(Article.listed_articles).limit(10)
end
...
end
Looking in console, this forces the restrictions in listed_articles on the query in rescue_articles, showing something like:
Article Load (0.2ms) SELECT `articles`.* FROM `articles` WHERE (published = 1) AND (listed = 1) AND (listed != 1) ORDER BY created_at DESC LIMIT 4
Article Load (0.2ms) SELECT `articles`.* FROM `articles` WHERE (published = 1) AND (listed = 1) AND (listed != 1) ORDER BY created_at DESC LIMIT 6 OFFSET 4
I'm sure there's some ridiculously easy method I'm missing in the documentation, but I haven't found it yet.
EDIT:
What I want to do is return all the articles where listed is true out of the twenty-five most recent articles. If that doesn't get me ten articles, I'd like to add enough articles from the most recent articles where listed is not true to get my full ten articles.
EDIT #2:
In other words, the merge method seems to string the queries together to make one long query instead of merging the results. I need the top ten results of the two queries (prioritizing listed articles), not one long query.

With your initial code, you can join the two arrays using + and then take the first ten results:
def self.current
(Article.listed_articles + Article.rescue_articles)[0..9]
end
I suppose a really dirty way of doing it would be:
def self.current
oldest_accepted = Article.published.order('created_at DESC').limit(25).last
Article.published.where(['created_at > ?', oldest_accepted.created_at]).order('listed DESC').limit(10)
end
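The selection rule from the question (take the 25 most recent published articles, prefer the listed ones, then pad with unlisted ones up to ten) can be checked in memory before committing to a query. The following is only a plain-Ruby sketch with no database involved; `Row` and `pick_current` are hypothetical names used for illustration and are not part of the Article model.

```ruby
# In-memory sketch of the selection rule. Row and pick_current are
# hypothetical stand-ins, not part of the Article model.
Row = Struct.new(:id, :listed, :created_at)

def pick_current(rows, window: 25, limit: 10)
  # Newest first, restricted to the most recent `window` rows
  recent = rows.sort_by { |r| -r.created_at }.first(window)
  # partition keeps the newest-first order inside each group
  listed, unlisted = recent.partition(&:listed)
  (listed + unlisted).first(limit)
end

rows = (1..40).map { |i| Row.new(i, (i % 10).zero?, i) }
puts pick_current(rows).map(&:id).inspect
```

Whatever query you settle on should return the same ids as this function for the same data, which makes it a handy reference when comparing the approaches in this thread.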

If you want an ActiveRecord::Relation object instead of an Array, you can use:
ActiveRecordUnion gem.
Install gem: gem install active_record_union and use:
def self.current
Article.rescue_articles.union(Article.listed_articles).limit(10)
end
UnionScope module.
Create module UnionScope (lib/active_record/union_scope.rb).
module ActiveRecord::UnionScope
def self.included(base)
base.send :extend, ClassMethods
end
module ClassMethods
def union_scope(*scopes)
id_column = "#{table_name}.id"
if (sub_query = scopes.reject { |sc| sc.count == 0 }.map { |s| "(#{s.select(id_column).to_sql})" }.join(" UNION ")).present?
where "#{id_column} IN (#{sub_query})"
else
none
end
end
end
end
Then call it in your Article model.
class Article < ActiveRecord::Base
include ActiveRecord::UnionScope
...
def self.current
union_scope(Article.rescue_articles, Article.listed_articles).limit(10)
end
...
end
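To see what the UnionScope module above hands to `where`, here is a small plain-Ruby sketch of the string it assembles. `union_in_clause` is a hypothetical helper and the SQL strings are made-up examples; in the module the sub-queries come from `to_sql` on each scope.

```ruby
# Hypothetical helper mirroring the string that union_scope assembles.
def union_in_clause(id_column, sub_queries)
  return nil if sub_queries.empty?

  union = sub_queries.map { |sql| "(#{sql})" }.join(" UNION ")
  "#{id_column} IN (#{union})"
end

clause = union_in_clause(
  "articles.id",
  ["SELECT articles.id FROM articles WHERE listed = 1",
   "SELECT articles.id FROM articles WHERE listed != 1"]
)
puts clause
```

Note that an `IN (...)` filter does not carry over any ordering from the sub-queries, so the "listed first" priority from the question would still need an explicit ORDER BY on the outer query.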

All you need to do is sum the queries:
result1 = Model.where(condition)
result2 = Model.where(another_condition)
# your final result
result = result1 + result2

I think you can do all of this in one query:
Article.published.order('listed DESC, created_at DESC').limit(10)
The sort direction on the listed column matters: with DESC, true (1) sorts before false (0). You'll get any listed items first, sorted by created_at DESC, then non-listed items.

Related

Does splitting up an active record query over 2 methods hit the database twice?

I have a database query where I want to get an array of Users that are distinct for the set:
# @range is a predefined date range
# @shift_list is a list of filtered shifts
def listing
Shift
.where(date: @range, shiftname: @shift_list)
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
and I read somewhere that for readability, isolated testing, or code reuse, you could split this into separate methods:
def listing
shift_list
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
def shift_list
Shift
.where(date: @range, shiftname: @shift_list)
end
So I rewrote this and some other code, and now the page takes 4 times as long to load.
My question is, does this type of method splitting cause the database to be hit twice? Or is it something that I did elsewhere?
And I'd love a suggestion to improve the efficiency of this code.
Further to the need to remove mapping from the code, this shift list is being created with the following code:
def _month_shift_list
Shift
.select(:shiftname)
.distinct
.where(date: @range)
.map {|x| x.shiftname }
end
My intention is to create an array of shiftnames as strings.
I am obviously missing some key understanding in database access, as this method is clearly creating part of the problem.
And I think I have found the solution to this with the following:
def month_shift_list
Shift
.where(date: @range)
.pluck(:shiftname)
.uniq
end
Nope, the database will not be hit twice. The queries in both methods are lazy loaded. The issue you have with the slow page load times is because the map function now has to do multiple finds which translates to multiple SELECT from the DB. You can re-write your query to this:
def listing
User.
joins(:shift).
merge(Shift.where(date: @range, shiftname: @shift_list)).
uniq.
sort
end
This has just one hit to the DB and will be much faster and should produce the same result as above.
The assumption here is that there is a has_one/has_many relationship on the User model for Shifts
class User < ActiveRecord::Base
has_one :shift
end
If you don't want to establish the has_one/has_many relationship on User, you can re-write it to:
def listing
User.
joins("INNER JOIN shifts on shifts.user_id = users.id").
merge(Shift.where(date: @range, shiftname: @shift_list)).
uniq.
sort
end
ALTERNATIVE:
You can use 2 queries if you experience issues with using ActiveRecord#merge.
def listing
user_ids = Shift.where(date: @range, shiftname: @shift_list).uniq.pluck(:user_id).sort
User.find(user_ids)
end
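The slowdown described in this answer comes from calling a finder once per row inside map. The following plain-Ruby sketch imitates that with a counter instead of a database; `FakeUsers` and its methods are hypothetical stand-ins (`find` plays the role of User.find, `where_ids` the role of a single User.where(id: ids) lookup).

```ruby
# Hypothetical in-memory stand-in for the users table that counts lookups.
class FakeUsers
  attr_reader :queries

  def initialize(rows)
    @rows = rows # Hash of id => name
    @queries = 0
  end

  # One lookup per call, like User.find(id)
  def find(id)
    @queries += 1
    @rows.fetch(id)
  end

  # One lookup for the whole list, like User.where(id: ids)
  def where_ids(ids)
    @queries += 1
    @rows.values_at(*ids)
  end
end

USER_ROWS = { 1 => "ann", 2 => "bob", 3 => "cy" }.freeze
shift_user_ids = [3, 1, 3, 2, 1] # user_id column of the matching shifts

per_row = FakeUsers.new(USER_ROWS)
shift_user_ids.uniq.map { |id| per_row.find(id) }

batched = FakeUsers.new(USER_ROWS)
batched.where_ids(shift_user_ids.uniq)

puts "per-row lookups: #{per_row.queries}, batched lookups: #{batched.queries}"
```

The per-row count grows with the number of distinct users, while the batched version stays at one lookup, which is the difference the join and the two-query alternative above are aiming for.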

ActiveRecord select with OR and exclusive limit

I have the need to query the database and retrieve the last 10 objects that are either active or declined. We use the following:
User.where(status: [:active, :declined]).limit(10)
Now we need to get the last 10 of each status (total of 20 users)
I've tried the following:
User.where(status: :active).limit(10).or(User.where(status: :declined).limit(10))
# SELECT "users".* FROM "users" WHERE ("users"."status" = $1 OR "users"."status" = $2) LIMIT $3
This does the same as the previous query and returns only 10 users, of mixed statuses.
How can I get the last 10 active users and the last 10 declined users with a single query?
I'm not sure that SQL allows doing what you want. First thing I would try would be to use a subquery, something like this:
class User < ApplicationRecord
scope :active, -> { where status: :active }
scope :declined, -> { where status: :declined }
scope :last_active_or_declined, -> {
where(id: active.limit(10).pluck(:id))
.or(where(id: declined.limit(10).pluck(:id)))
}
end
Then somewhere else you could just do
User.last_active_or_declined()
This runs a separate query for each group of users and then selects the users whose ids fall in either group. You could probably even drop the pluck(:id) parts, since ActiveRecord is smart enough to add the proper select clause to your SQL and turn them into subqueries, but I'm not 100% sure and I don't have a Rails project at hand to try this.
limit is not a permitted value for an #or relation. If you check the Rails code, the error raised comes from here:
def or!(other) # :nodoc:
incompatible_values = structurally_incompatible_values_for_or(other)
unless incompatible_values.empty?
raise ArgumentError, "Relation passed to #or must be structurally compatible. Incompatible values: #{incompatible_values}"
end
# more code
end
You can check which methods are restricted further down in the code here:
STRUCTURAL_OR_METHODS = Relation::VALUE_METHODS - [:extending, :where, :having, :unscope, :references]
def structurally_incompatible_values_for_or(other)
STRUCTURAL_OR_METHODS.reject do |method|
get_value(method) == other.get_value(method)
end
end
You can see in the Relation class here that limit is restricted:
SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :reordering,
:reverse_order, :distinct, :create_with, :skip_query_cache,
:skip_preloading]
So you will have to resort to raw SQL I'm afraid
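Reading the check quoted above closely, it rejects only the values that differ between the two relations. That would explain why the query in the question did not raise: both sides used limit(10), so the limits compared equal and a single LIMIT was applied to the combined query. Here is a simplified plain-Ruby imitation of that comparison; the hashes stand in for relations and `STRUCTURAL_KEYS` is a hypothetical, shortened list, not the real Rails constant.

```ruby
# Simplified imitation of the compatibility check. Plain hashes stand in
# for relations, and STRUCTURAL_KEYS is a hypothetical, shortened list.
STRUCTURAL_KEYS = [:limit, :offset, :order].freeze

def incompatible_values(left, right)
  # Keep only the keys whose values differ between the two sides
  STRUCTURAL_KEYS.reject { |key| left[key] == right[key] }
end

same_limit  = incompatible_values({ limit: 10 }, { limit: 10 })
mixed_limit = incompatible_values({ limit: 10 }, { limit: 5 })

puts same_limit.inspect
puts mixed_limit.inspect
```

In this imitation only the second call reports `:limit` as incompatible, so either way `or` gives you no route to a separate limit per branch.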
I don't think you can do it with a single query, but you can do it with two queries, get the record ids, and then build a query using those record ids.
It's not ideal but as you're just plucking ids the impact isn't too bad.
user_ids = User.where(status: :active).limit(10).pluck(:id) + User.where(status: :declined).limit(10).pluck(:id)
users = User.where(id: user_ids)
I think you can use UNION. Install active_record_union and replace or with union:
User.where(status: :active).limit(10).union(User.where(status: :declined).limit(10))
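To make the target result concrete, here is an in-memory sketch of "the last n rows per status". It is plain Ruby, not a query; `UserRow` and `last_per_status` are hypothetical names, and it assumes "last" means newest by created_at, since the queries in the question have no ORDER BY and would otherwise return rows in an unspecified order.

```ruby
# Hypothetical in-memory model of the desired result set.
UserRow = Struct.new(:id, :status, :created_at)

def last_per_status(rows, statuses, n)
  statuses.flat_map do |status|
    # max_by(n) returns the n newest rows of this status, newest first
    rows.select { |r| r.status == status }.max_by(n, &:created_at)
  end
end

rows = [
  [1, :active], [2, :active], [3, :declined], [4, :active],
  [5, :pending], [6, :declined], [7, :declined], [8, :active]
].map { |id, status| UserRow.new(id, status, id) }

puts last_per_status(rows, [:active, :declined], 2).map(&:id).inspect
```

Any of the approaches above (two plucks, a union) should agree with this once an explicit order is added to each branch.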

How to use find_by_sql properly?

I have the following model
class Backup < ActiveRecord::Base
belongs_to :component
belongs_to :backup_medium
def self.search(value)
join_tables = "backups, components, backup_media"
joins = "backups.backup_medium_id = backup_media.id and components.id = backups.component_id"
c = Backup.find_by_sql "select * from #{join_tables} where components.name like '%#{value}%' and #{joins}"
b = Backup.find_by_sql "select * from #{join_tables} where backup_media.name like '%#{value}%' and #{joins}"
c.count > 0 ? c : b
end
end
In pry, when I run Backup.all.class, I get
=> Backup::ActiveRecord_Relation
but when I run Backup.search('xxx').class, I get
=> Array
Since the search should return a subset of all, I think I need to return an ActiveRecord_Relation. What am I missing?
From the documentation:
Executes a custom SQL query against your database and returns all the
results. The results will be returned as an array with columns
requested encapsulated as attributes of the model you call this method
from. If you call Product.find_by_sql then the results will be
returned in a Product object with the attributes you specified in the
SQL query.
So you will get an array of Backup instances.
Note that you probably should not do it this way. Using string interpolation in a query opens you up to SQL injection attacks and gains you nothing. Also, you can get quite a bit more flexibility using ActiveRecord scopes for this.
def self.media_includes
# references tells ActiveRecord that the string conditions below use these tables
includes(:component, :backup_medium).references(:components, :backup_media)
end
def self.by_component_name(name)
media_includes.where("components.name like ?", "%#{name}%")
end
def self.by_media_name(name)
media_includes.where("backup_media.name like ?", "%#{name}%")
end
def self.search(name)
by_component_name(name).any? ? by_component_name(name) : by_media_name(name)
end
You can then call
Backup.search(name)
as well as
Backup.by_component_name(name)
or
Backup.by_media_name(name)
find_by_sql returns an array of objects, not a Relation. If you want to return relation for consistency try to rewrite your search to use ActiveRecord api:
def self.search(value)
query = Backup.includes(:component, :backup_medium).references(:components, :backup_media)
by_component_name = query.where("components.name like ?", "%#{value}%")
by_media_name = query.where("backup_media.name like ?", "%#{value}%")
by_component_name.any? ? by_component_name : by_media_name
end
or, if you still want to use sql, you can try to fetch record ids and then make a second query:
def self.search(value)
# ...
c = Backup.find_by_sql "select backups.id from #{join_tables} where components.name like '%#{value}%' and #{joins}"
b = Backup.find_by_sql "select backups.id from #{join_tables} where backup_media.name like '%#{value}%' and #{joins}"
records = c.count > 0 ? c : b
Backup.where(id: records.map(&:id))
end
So I am unable to get the syntax right for the media_includes, but inspired by your solution I have succeeded by using joins.
I created a small demo project which just shows the code related to search. You can take a look at https://github.com/pamh09/rails-search-demo. If you want to collaborate on a solution, I think this would be more efficient than trying to paste all the code here. That said, I do have a working solution if you'd rather not bother. But I would like to see what the right syntax is.
Below is the model code. It's very possible that I just have some kind of syntactic mismatch since I am not very familiar with how rails does its database magic (obviously).
class Backup < ApplicationRecord
belongs_to :component
belongs_to :backup_medium
#---- code below does not work ---
# in pry
# pry(Backup):1> by_media('bak').any?
# (0.0ms) SELECT COUNT(*) FROM "backups" WHERE (backup_media = 'bak')
# ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column: backup_media.name: SELECT COUNT(*) FROM "backups" WHERE (backup_media.name = 'bak')
def self.my_includes
includes(:component, :backup_medium)
end
def self.by_component(name)
my_includes.where("components.name = ?", name)
end
def self.by_media(name)
my_includes.where("backup_media.name = ?", name)
end
def self.search_by(name)
by_component(name).any? ? by_component_name : by_media_name
end
# ----- code below works ... call search('string') -----
# I was unable to get the like query to work without using #{name}
def self.by_component_like(name)
# Note: joins (singular).where (plural.column ...)
joins(:component).where("components.name like '%#{name}%'")
end
def self.by_media_like(name)
joins(:backup_medium).where("backup_media.name like '%#{name}%'")
end
def self.search(name)
by_component_like(name).any? ? by_component_like(name) : by_media_like(name)
end
end
And, as noted in the code, I could not figure out how to use the ? with LIKE, as the query would come in as LIKE '%'xxx'%' instead of '%xxx%'.
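On the LIKE '%'xxx'%' problem: the ? placeholder quotes whatever value it receives, so the wildcards have to be part of the bound value and the SQL string should contain only LIKE ?. A minimal sketch of building that value in plain Ruby follows; `like_pattern` is a hypothetical helper that also escapes %, _ and backslash so user input is matched literally (Rails ships `sanitize_sql_like` for that escaping step).

```ruby
# Hypothetical helper: builds the value to bind to a "LIKE ?" placeholder.
# Escapes the LIKE wildcards so user input is matched literally.
def like_pattern(term)
  escaped = term.gsub(/[\\%_]/) { |char| "\\#{char}" }
  "%#{escaped}%"
end

puts like_pattern("bak")
puts like_pattern("50%_off")
```

Assuming the joins-based scopes above, usage would look like `joins(:component).where("components.name LIKE ?", like_pattern(name))`, with no quotes around the placeholder and none inside the pattern.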

find_each with order and limit

I need to limit and order batches of records and am using find_each. I've seen a lot of people asking for this and no really good solution. If I've missed it, please post a link!
I have 30M records and want to deal with 10M with the highest value in the weight column.
I tried using this method someone wrote: find_each_with_order but can't get it to work.
The code from that site doesn't take order as an option. Seems strange given that the name is find_each_with_order. I added it as follows:
class ActiveRecord::Base
# normal find_each does not use given order but uses id asc
def self.find_each_with_order(options={})
raise "offset is not yet supported" if options[:offset]
page = 1
limit = options[:limit] || 1000
order = options[:order] || 'id asc'
loop do
offset = (page-1) * limit
batch = find(:all, options.merge(:limit=>limit, :offset=>offset, :order=>order))
page += 1
batch.each{|x| yield x }
break if batch.size < limit
end
end
end
and I'm trying to use it as follows:
class GetStuff
def self.grab_em
file = File.open("1000 things.txt", "w")
rels = Thing.find_each_with_order({:limit=>100, :order=>"weight desc"})
binding.pry
things.each do |t|
binding.pry
file.write("#{t.name} #{t.id} #{t.weight}\n" )
if t.id % 20 == 0
puts t.id.to_s
end
end
file.close
end
end
BTW I have the data in postgres and am going to grab a subset and move it to neo4j, so I'm tagging with neo4j in case any of you neo4j people know how to do this. thanks.
Not exactly sure if this is what you're looking for, but you can do something like this:
weight = Thing.order(:weight).select(:weight).last(10_000_000).first.weight
Thing.where("weight > ?", weight).find_each do |t|
...your code...
end
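The idea in this answer is to find the weight of the N-th heaviest row once, then let find_each walk everything at or above that cutoff in its usual id order. Here is a plain-Ruby sketch of that logic on an array; `ThingRow`, `weight_cutoff` and `each_heavy` are hypothetical names, and the sketch uses >= so that the cutoff row itself is included.

```ruby
# Hypothetical in-memory stand-in for the things table.
ThingRow = Struct.new(:id, :weight)

# Weight of the n-th heaviest row; rows at or above it form the subset.
def weight_cutoff(rows, n)
  rows.map(&:weight).max(n).last
end

# Walk the subset in id order, batch_size rows at a time, as find_each would.
def each_heavy(rows, n, batch_size: 2)
  cutoff = weight_cutoff(rows, n)
  rows.select { |r| r.weight >= cutoff }
      .sort_by(&:id)
      .each_slice(batch_size) { |batch| batch.each { |row| yield row } }
end

rows = [5, 9, 1, 7, 3, 8].each_with_index.map { |w, i| ThingRow.new(i + 1, w) }
each_heavy(rows, 3) { |row| puts "#{row.id} #{row.weight}" }
```

If several rows tie at the cutoff weight, the subset will contain more than n rows, so an exact count needs a tie-breaker. In a real table the cutoff would be better fetched with an ordered query that skips to the N-th row than by loading ten million records into memory.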

Rails 3 Counting Records by Date

I am trying to build an array that looks like this via a model method:
[['3/25/13', 2], ['3/26/13', 1], ['3/27/13', 2]]
Where the dates are strings and the numbers after them are the counts of a table/object.
I have the following model method right now:
def self.weekly_count_array
counts = count(group: "date(#{table_name}.created_at)", conditions: { created_at: 1.month.ago.to_date..Date.today }, order: "date(#{table_name}.created_at) DESC")
(1.week.ago.to_date).upto(Date.today) do |x|
counts[x.to_s] ||= 0
end
counts.sort
end
However, it doesn't return the count accurately (all values are zero). There seem to be some similar questions on SO that I've checked out, but can't seem to get them to work either.
Can someone help (1) let me know if this is the best way to do it, and (2) provide some guidance in terms of what the problem might be with the above code, if so? Thanks!
Use this as a template if you wish
def self.period_count_array(from = (Date.today-1.month).beginning_of_day,to = Date.today.end_of_day)
where(created_at: from..to).group('date(created_at)').count
end
This will return you a hash with dates as key and the count as value. (Rails 3.2.x)
maybe this is what you are trying to do?
class YourActiveRecordModel < ActiveRecord::Base
def self.weekly_count_array
records = self.select("COUNT(id) AS record_count, DATE(created_at) AS created")
.group("DATE(created_at)")
.where("created_at >= ?", 1.month.ago.to_date)
.where("created_at <= ?", Date.current)
records.each do |x|
puts x.record_count
puts x.created # 2013-03-14
# use I18n.localize(x.created, format: :your_format)
# where :your_format is defined in config/locales/en.yml (or other .yml)
end
end
end
Fantastic answer by @Aditya Sanghi.
If you need exactly the output format from the question, you can use:
def self.weekly_count_array
records = select('DATE(created_at) created_at, count(id) as id').group('created_at')
1.week.ago.to_date.upto(Date.today).map do |d|
[d, records.where('DATE(created_at) = ?', d.to_date).first.try(:id) || 0]
end
end
You do not need a process to perform the count. Simply perform a query for this.
def self.weekly_count_array
select("DATE(created_at) AS created_on, COUNT(*) AS count")
.where(created_at: 1.month.ago.to_date..Date.today)
.group("DATE(created_at)")
.order("created_on DESC")
end
Building on @kiddorails' answer: to avoid issuing one query per day, I build a Hash from the ActiveRecord result once, and I changed the grouping from .group('created_at') to .group('DATE(created_at)') so that it groups by date.
def self.weekly_count_array
# records = select('DATE(created_at) created_at, count(id) as id').group('created_at')
records_hash = Hash[Download.select('DATE(created_at) created_at, count(id) as id').group('DATE(created_at)').map{|d|[d.created_at, d.id] }]
1.month.ago.to_date.upto(Date.today).map do |d|
[ d, records_hash[d.to_date] || 0 ]
end
end
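The step all of these answers share is filling in the days that have no records and formatting the date the way the question asks. That part needs no database, so here is a plain-Ruby sketch of it; `daily_counts` is a hypothetical helper, and it assumes the grouped counts have already been turned into a Hash keyed by Date objects (depending on the adapter, the keys of a grouped count may come back as Strings, in which case a Date lookup would miss, so convert them first).

```ruby
require "date"

# Hypothetical helper: counts_by_date is a Hash of Date => Integer, such as
# a grouped count might return once its keys are converted to Date objects.
def daily_counts(counts_by_date, from, to)
  (from..to).map do |day|
    # %-m and %-d drop the leading zero, giving "3/25/13"
    [day.strftime("%-m/%-d/%y"), counts_by_date.fetch(day, 0)]
  end
end

counts = { Date.new(2013, 3, 25) => 2, Date.new(2013, 3, 27) => 2 }
result = daily_counts(counts, Date.new(2013, 3, 25), Date.new(2013, 3, 27))
puts result.inspect
```

This issues no queries of its own, so it pairs with a single grouped query rather than one query per day.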
