I have a model named Project and Project has many Tasks
A Task can have one of three statuses (stored as an integer).
I want to get a list of Projects with counts of associated Tasks in status = 1, 2 and 3.
The best I can come up with is a method on Project:
def open_tasks
  self.tasks.where(:status => 1).count
end
But this issues a separate SQL query for each count, which performs very badly when loading 100 projects.
Is there a way to get it out in one SQL statement?
I can think of a couple of ways to do this...
(It's not a single sql statement but two, still quite performant though)...
Task.where(status: 1).group(:project_id).count
will give you a hash where the keys are project ids and the values are the task counts. You can then combine this with the list of projects.
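A minimal sketch of the combining step, in plain Ruby so it runs without a database. The Project struct, the sample ids and the counts hash are made-up stand-ins for Project.all and for the hash returned by the grouped count:

```ruby
# Hypothetical stand-in for the ActiveRecord model.
Project = Struct.new(:id, :name)

projects = [
  Project.new(1, "Alpha"),
  Project.new(2, "Beta"),
  Project.new(3, "Gamma")
]

# Shape of the hash returned by Task.where(status: 1).group(:project_id).count:
# project id => number of matching tasks. Projects without matching tasks
# are simply absent from the hash.
counts = { 1 => 4, 3 => 2 }

# Pair each project with its count, defaulting to 0 when the id is missing.
open_task_counts = projects.map do |project|
  [project.name, counts.fetch(project.id, 0)]
end

open_task_counts.each { |name, count| puts "#{name}: #{count}" }
```

Grouping by both columns, as in group(:project_id, :status), should give keys of the form [project_id, status], which would cover all three statuses with a single count query.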
You can use the ActiveRecord counter_cache to save in the project records a value for the number of open tasks. ActiveRecord will automatically update this for you. I believe you will need to add an association to the project model like this:
# app/models/project.rb
# counter_cache: true expects a tasks_count column on projects by default;
# a column such as open_task_count has to be named explicitly
class Project < ActiveRecord::Base
  has_many :open_tasks, -> { where status: 1 }, class_name: 'Task'
end
class Task < ActiveRecord::Base
  belongs_to :project, counter_cache: true
end
Project.select(
'projects.*',
'(SELECT COUNT(tasks.*) FROM tasks WHERE tasks.project_id = projects.id AND tasks.status = 0) AS status_0_count',
'(SELECT COUNT(tasks.*) FROM tasks WHERE tasks.project_id = projects.id AND tasks.status = 1) AS status_1_count'
).left_joins(:tasks)
Although there are more elegant ways (like lateral joins and CTEs), subqueries work on most DBs. If status is an ActiveRecord::Enum, you can construct the subqueries by looping over the enum mapping:
class Project < ApplicationRecord
  has_many :tasks

  def self.with_task_counts
    # constructs an array of SQL strings
    statuses = Task.statuses.map do |key, int|
      sql = Task.select('COUNT(*)')
                .where('tasks.project_id = projects.id')
                .where(status: key)
                .to_sql
      "(#{sql}) AS #{key}_tasks_count"
    end
    select(
      'projects.*',
      *statuses # * turns the array into a list of args
    ).left_joins(:tasks)
  end
end
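To see what the generated select list looks like, here is a rough plain-Ruby sketch that builds the same kind of strings by hand. The STATUSES hash and the count_columns helper are hypothetical; in the answer above the mapping comes from Task.statuses and the SQL from to_sql:

```ruby
# Hypothetical enum mapping; in the answer above this comes from Task.statuses.
STATUSES = { "open" => 0, "in_progress" => 1, "done" => 2 }

# Build one correlated subquery per status, aliased as "<status>_tasks_count".
def count_columns(statuses)
  statuses.map do |key, int|
    "(SELECT COUNT(*) FROM tasks " \
      "WHERE tasks.project_id = projects.id AND tasks.status = #{int}) " \
      "AS #{key}_tasks_count"
  end
end

columns = count_columns(STATUSES)
puts columns
```

The integers are interpolated directly only because they come from a fixed mapping in the code; anything taken from user input should go through bind parameters instead.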
In Rails 4 you can still do a LEFT OUTER JOIN by using a SQL string:
class Project
def self.left_joins_tasks(*args)
deprecator = ActiveSupport::Deprecation.new("5.0", "MyApp")
deprecator.deprecation_warning("left_joins_tasks is deprecated, use `.left_joins(:tasks)` instead")
joins('LEFT OUTER JOIN tasks ON tasks.project_id = projects.id')
end
end
Using .joins works as well but gives an INNER join so rows with no tasks are filtered out. You can also use .includes.
I ended up using the counter_culture gem.
https://github.com/magnusvk/counter_culture
So I'm trying to improve the search feature for my app
My model relationships/associations are like so (many>one, one=one):
Clients < Projects < Activities = Assignments = Users
Assignments < Tasks
Tasks table has only a foreign key to assignments.
Search params look something like this:
params[:search] == { User: 'user_handle', Client: 'client_name', Project: 'project_name', Activity: 'activity_name' }
So I probably need to search Clients.where().tasks, Projects.where().tasks and so on.
Then I need to somehow concatenate those queries and get rid of all the duplicate results. How to do that in practice, however, I have no clue.
I've been hitting my head against a brick wall with this and internet searches didn't really help... so any help is greatly appreciated. It's probably a simple solution too...
I am on Rails 4.2.5, with SQLite for dev and pg for production.
A few things I would change/recommend based on the code in your own answer:
Move the search queries into scopes on each model class
Prefer AREL over raw SQL when composing queries (here's a quick guide)
Enhance Rails to support some sort of OR when querying models
The changes I suggest will enable you to do something like this:
search = search_params
# assign first: a local set inside a modifier-if condition is not visible
# to the statement it guards
handle   = search[:user].presence
client   = search[:client].presence
project  = search[:project].presence
activity = search[:activity].presence
tasks = Task.all
tasks = tasks.or.user_handle_matches(handle) if handle
tasks = tasks.or.client_name_matches(client) if client
tasks = tasks.or.project_name_matches(project) if project
tasks = tasks.or.activity_name_matches(activity) if activity
@tasks = tasks.uniq
First, convert each of your queries to a scope on your models. This enables you to reuse your scopes later:
class User
scope :handle_matches, ->(handle) {
where(arel_table[:handle].matches("%#{handle}%"))
}
end
class Client
scope :name_matches, ->(name) {
where(arel_table[:name].matches("%#{name}%"))
}
end
class Project
scope :name_matches, ->(name) {
where(arel_table[:name].matches("%#{name}%"))
}
end
class Activity
scope :name_matches, ->(name) {
where(arel_table[:name].matches("%#{name}%"))
}
end
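One thing the "%#{name}%" patterns above do not handle is a search term that itself contains % or _, since both act as wildcards in LIKE. A small sketch of escaping them, assuming backslash is the escape character (the default in PostgreSQL and MySQL; SQLite needs an explicit ESCAPE clause). The escape_like and like_pattern helpers are hypothetical names; ActiveRecord provides sanitize_sql_like for the same purpose:

```ruby
# Hypothetical helper: escape the LIKE wildcards (% and _) and the backslash
# itself, so that user input is matched literally.
def escape_like(term)
  term.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Wrap the escaped term in wildcards for a "contains" match.
def like_pattern(term)
  "%#{escape_like(term)}%"
end

puts like_pattern("50%_off")
puts like_pattern("john")
```

The scopes could then pass like_pattern(name) to matches instead of interpolating the raw value.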
You can then use these scopes on your Task model to allow for better searching capabilities. For each of the scopes on Task we are doing a join (inner join) on a relationship and using the scope to limit the results of the join:
class Task
  belongs_to :assignment
  has_one :user, :through => :assignment
  has_one :activity, :through => :assignment
  has_one :project, :through => :activity
  has_one :client, :through => :project # needed by client_name_matches below

  scope :user_handle_matches, ->(handle) {
    joins(:user).merge( User.handle_matches(handle) )
  }
  scope :client_name_matches, ->(name) {
    joins(:client).merge( Client.name_matches(name) )
  }
  scope :activity_name_matches, ->(name) {
    joins(:activity).merge( Activity.name_matches(name) )
  }
  scope :project_name_matches, ->(name) {
    joins(:project).merge( Project.name_matches(name) )
  }
end
The final problem to solve is ORing the results. Rails 4 and below don't really allow this out of the box, but there are gems and code out there to allow this functionality.
I often include the code in this GitHub gist in an initializer to allow ORing of scopes. The code allows you to do things like Person.where(name: 'John').or.where(name: 'Jane').
Many other options are discussed in this SO question.
If you don't want to include random code and gems, another option is to pass an array of ids into the where clause. This generates a query similar to SELECT * FROM tasks WHERE id IN (1, 4, 5, ...):
handle   = search[:user].presence
client   = search[:client].presence
project  = search[:project].presence
activity = search[:activity].presence

queries = []
queries << Task.user_handle_matches(handle) if handle
queries << Task.client_name_matches(client) if client
queries << Task.project_name_matches(project) if project
queries << Task.activity_name_matches(activity) if activity

# get the matching ids for each query defined above
# this is the catch: each call to `pluck` is another hit on the db
task_ids = queries.flat_map { |query| query.pluck(:id) }.uniq
@tasks = Task.where(id: task_ids)
So I solved it; it is super sloppy, however.
First I wrote a method:
def add_res(ar_obj)
  res = [] # start with an empty list; `res += ...` on an unset local raises NoMethodError
  ar_obj.each do |o|
    res += o.tasks
  end
  return res
end
Then I wrote my search logic like so:
if !search_params[:user].empty?
  query = add_res(User.where('handle LIKE ?', "%#{search_params[:user]}%"))
  @tasks.nil? ? @tasks=query : @tasks=@tasks&query
end
if !search_params[:client].empty?
  query = add_res(Client.where('name LIKE ?', "%#{search_params[:client]}%"))
  @tasks.nil? ? @tasks=query : @tasks=@tasks&query
end
if !search_params[:project].empty?
  query = add_res(Project.where('name LIKE ?', "%#{search_params[:project]}%"))
  @tasks.nil? ? @tasks=query : @tasks=@tasks&query
end
if !search_params[:activity].empty?
  query = add_res(Activity.where('name LIKE ?', "%#{search_params[:activity]}%"))
  @tasks.nil? ? @tasks=query : @tasks=@tasks&query
end
if @tasks.nil?
  @tasks=Task.all
end
@tasks=@tasks.uniq
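Note that the & used above intersects the result sets, so a task has to match every criterion that was filled in, whereas the OR-based answer keeps tasks that match any of them. A quick plain-Ruby sketch of the difference, using made-up task ids in place of the arrays returned by add_res:

```ruby
# Made-up task ids per criterion, standing in for the arrays built by add_res.
by_user    = [1, 4, 5]
by_client  = [4, 5, 7]
by_project = [5, 9]

results = [by_user, by_client, by_project]

# AND semantics (what `@tasks & query` does): keep ids present in every list.
all_match = results.reduce(:&)

# OR semantics: keep ids present in at least one list, without duplicates.
any_match = results.reduce(:|)

puts "all criteria: #{all_match.inspect}"
puts "any criterion: #{any_match.inspect}"
```

Which of the two is right depends on how the search form is meant to behave, so it is worth deciding that before picking an approach.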
If someone can provide a better answer I would be forever grateful.
I ran into a problem with excluding records when attribute x of one of the associated records has the value 'a'.
Example:
class Order < ActiveRecord::Base
  has_many :items
end
class Item < ActiveRecord::Base
  belongs_to :order
  validates_presence_of :status
end
The query should return all Orders that don't have an Item with status = 'paid' (status != 'paid').
Because of the 1:n association, an Order can have many Items, and one of the Items can have the status 'paid'. Such Orders must be excluded from the result of my query even if the order has other items with a status different from 'paid'.
This is how I would solve the problem:
paid_items = Item.where(status: 'paid').pluck(:order_id)
orders_wo_paid = Order.where('id NOT IN (?)', paid_items)
Is there an ActiveRecord solution that solves this problem in one query?
Or are there other ways to solve this?
I'm not looking for a Ruby solution such as:
Order.select do |order|
  !order.items.pluck(:status).include?('paid')
end
Thanks for any ideas and inspiration.
You can do:
Order.where('orders.id NOT IN (?)', Item.where(status: 'paid').select(:order_id))
If you're using Rails 4.x then:
Order.where.not(id: Item.where(status: 'paid').select(:order_id))
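For anyone unsure what the subquery version returns, here is the same logic on in-memory data. This is only an illustration: the Item struct and the sample rows are made up, and nothing here touches a database:

```ruby
# Hypothetical stand-in for the items table.
Item = Struct.new(:order_id, :status)

order_ids = [1, 2, 3]
items = [
  Item.new(1, "paid"),
  Item.new(1, "pending"),
  Item.new(2, "pending")
]

# Equivalent of the subquery: order ids that have at least one paid item.
paid_order_ids = items.select { |item| item.status == "paid" }
                      .map(&:order_id)
                      .uniq

# Equivalent of NOT IN: every order whose id is not in that list.
orders_without_paid = order_ids - paid_order_ids

puts orders_without_paid.inspect
```

Order 1 is dropped although it also has a pending item, and order 3 is kept although it has no items at all. One SQL-specific caveat: if the subquery can return a NULL order_id, NOT IN matches no rows, so it may be worth adding .where.not(order_id: nil) to the inner query.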
The query you are interested in is the following, but building it with ActiveRecord will be hard and not very readable:
SELECT
  orders.*
FROM
  orders
LEFT JOIN
  order_items ON orders.id = order_items.order_id
GROUP BY
  orders.id
HAVING
  SUM(CASE WHEN order_items.status = 'paid' THEN 1 ELSE 0 END) = 0
Sorry for the SQL indentation, I have no idea what the conventions for it are.
A way (not the best one at all) to do it with Rails (unfortunately writing SQL for the most important parts) would be the following:
Order.joins("LEFT JOIN order_items ON orders.id = order_items.order_id")
     .group("orders.id")
     .having("SUM(CASE WHEN order_items.status = 'paid' THEN 1 ELSE 0 END) = 0")
Of course you can play with AREL to get rid of the hard coded sql, but in my opinion it will not be easier to read.
You can find an example of creating left joins in this gist: https://gist.github.com/mildmojo/3724189
Need advice, how to write complex query in Ruby.
Query in PHP project:
$get_trustee = db_query("SELECT t.trustee_name,t.secret_key,t.trustee_status,t.created,t.user_id,ui.image from trustees t
left join users u on u.id = t.trustees_id
left join user_info ui on ui.user_id = t.trustees_id
WHERE t.user_id='$user_id' AND trustee_status ='pending'
group by secret_key
ORDER BY t.created DESC")
My guess in Ruby:
get_trustee = Trustee.find_by_sql(['SELECT t.trustee_name, t.secret_key, t.trustee_status, t.created, t.user_id, ui.image FROM trustees t
LEFT JOIN users u ON u.id = t.trustees_id
LEFT JOIN user_info ui ON ui.user_id = t.trustees_id
WHERE t.user_id = ? AND
t.trustee_status = ?
GROUP BY secret_key
ORDER BY t.created DESC',
user_id, 'pending'])
Option 1 (Okay)
Do you mean Ruby with ActiveRecord? Are you using ActiveRecord and/or Rails? #find_by_sql is a method that exists within ActiveRecord. Also, it seems like the users table isn't really needed in this query, but maybe you left something out? Either way, I'll include it in my examples. This query would work even if you haven't set up your relationships:
users_trustees = Trustee.
  select('trustees.*, ui.image').
  joins('LEFT OUTER JOIN users u ON u.id = trustees.trustees_id').
  joins('LEFT OUTER JOIN user_info ui ON ui.user_id = trustees.trustees_id').
  where(user_id: user_id, trustee_status: 'pending').
  order('trustees.created DESC')
Also, be aware of a few things with this solution:
I have not found a super elegant way to get the columns from the join tables out of the ActiveRecord objects that get returned. You can access them by users_trustees.each { |u| u['image'] }
This query isn't really THAT complex and ActiveRecord relationships make it much easier to understand and maintain.
I'm assuming you're using a legacy database and that's why your columns are named this way. If I'm wrong and you created these tables for this app, then your life would be much easier (and conventional) with your primary keys being called id and your timestamps being called created_at and updated_at.
Option 2 (Better)
If you set up your ActiveRecord relationships and classes properly, then this query is much easier:
class Trustee < ActiveRecord::Base
self.primary_key = 'trustees_id' # wouldn't be needed if the column was id
has_one :user
has_one :user_info
end
class User < ActiveRecord::Base
belongs_to :trustee, foreign_key: 'trustees_id' # relationship can also go the other way
end
class UserInfo < ActiveRecord::Base
self.table_name = 'user_info'
belongs_to :trustee
end
Your "query" can now be ActiveRecord goodness if performance isn't paramount. The Ruby convention is readability first, reorganizing code later if stuff starts to scale.
Let's say you want to get a trustee's image:
trustee = Trustee.where(trustees_id: 5).first
if trustee
image = trustee.user_info.image
..
end
Or if you want to get all trustees' images:
Trustee.all.collect { |t| t.user_info.try(:image) } # using a #try in case user_info is nil
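In plain Ruby (2.3 and later) the safe navigation operator does the same job as #try here. A minimal sketch with hypothetical structs standing in for the two models:

```ruby
# Hypothetical stand-ins for the UserInfo and Trustee models.
UserInfo = Struct.new(:image)
Trustee  = Struct.new(:name, :user_info)

trustees = [
  Trustee.new("Ann", UserInfo.new("ann.png")),
  Trustee.new("Bob", nil) # no user_info record
]

# &. returns nil instead of raising NoMethodError when user_info is nil.
images = trustees.map { |t| t.user_info&.image }

# compact drops the nil entries if only real images are wanted.
present_images = images.compact

puts images.inspect
puts present_images.inspect
```

With real models, calling user_info inside the loop runs one extra query per trustee; Trustee.includes(:user_info) should load them up front.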
Option 3 (Best)
It seems like trustee is just a special-case user of some sort. You can use STI if you don't mind restructuring your tables to simplify even further.
This is probably outside of the scope of this question so I'll just link you to the docs on this: http://api.rubyonrails.org/classes/ActiveRecord/Base.html see "Single Table Inheritance". Also see the article that they link to from Martin Fowler (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html)
Resources
http://guides.rubyonrails.org/association_basics.html
http://guides.rubyonrails.org/active_record_querying.html
Yes, find_by_sql will work, you can try this also:
Trustee.connection.execute('...')
or for generic queries:
ActiveRecord::Base.connection.execute('...')
I have the following models in my Rails application:
class Shift < ActiveRecord::Base
has_many :schedules
scope :active, where(:active => true)
end
class Schedule < ActiveRecord::Base
belongs_to :shift
end
I wish to generate a collection of all active shifts and eager load any associated schedules that have occurs_on between two given dates. If a shift has no schedules between those dates, it should still be returned in the results.
Essentially, I want to generate SQL equivalent to:
SELECT shifts.*, schedules.*
FROM shifts
LEFT JOIN schedules ON schedules.shift_id = shifts.id
AND schedules.occurs_on BETWEEN '01/01/2012' AND '01/31/2012'
WHERE shifts.active = 't';
My first attempt was:
Shift.active.includes(:schedules).where("schedules.occurs_on BETWEEN '01/01/2012' AND '01/31/2012'")
The problem is that the occurs_on filtering is done in the where clause, and not in the join. If a shift has no schedules in that period, it is not returned at all.
My second attempt was to use the joins method, but this does an inner join. Again, this will drop all shifts that have no schedules for that period.
I'm frustrated because I know the SQL I want AREL to generate, but I can't figure out how to express it with the API. Anyone?
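The difference between putting the date filter in the join condition and putting it in the WHERE clause can be illustrated in plain Ruby. This is only a simulation on made-up data: the structs and the left_join helper are hypothetical and no SQL is involved:

```ruby
require "date"

# Hypothetical stand-ins for the two tables.
Shift    = Struct.new(:id, :name)
Schedule = Struct.new(:shift_id, :occurs_on)

shifts = [Shift.new(1, "Morning"), Shift.new(2, "Night")]
schedules = [
  Schedule.new(1, Date.new(2012, 1, 10)),
  Schedule.new(2, Date.new(2012, 3, 5)) # outside the range
]
range = Date.new(2012, 1, 1)..Date.new(2012, 1, 31)

# LEFT JOIN that yields [shift, schedule_or_nil] rows; the block plays the
# role of the extra ON condition.
def left_join(shifts, schedules)
  shifts.flat_map do |shift|
    matches = schedules.select { |s| s.shift_id == shift.id && yield(s) }
    matches.empty? ? [[shift, nil]] : matches.map { |s| [shift, s] }
  end
end

# Date filter in ON: unmatched shifts survive with a nil schedule.
filtered_in_on = left_join(shifts, schedules) { |s| range.cover?(s.occurs_on) }

# Date filter in WHERE: applied after the join, so rows with no schedule or
# with an out-of-range schedule are discarded together with their shift.
filtered_in_where = left_join(shifts, schedules) { true }
                    .select { |_, s| s && range.cover?(s.occurs_on) }

puts filtered_in_on.map { |shift, _| shift.name }.inspect
puts filtered_in_where.map { |shift, _| shift.name }.inspect
```

The first result keeps both shifts, the second loses the Night shift, which is the behaviour described above for the includes/where attempt.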
You could try some pretty raw AREL. Disclaimer: I didn't have actual Schedule and Shift classes so I couldn't test this properly, but I used some existing tables to troubleshoot it on my own machine.
on = Arel::Nodes::On.new(
Arel::Nodes::Equality.new(Schedule.arel_table[:shift_id], Shift.arel_table[:id]).\
and(Arel::Nodes::Between.new(
Schedule.arel_table[:occurs_on],
Arel::Nodes::And.new(2.days.ago, Time.now)
))
)
join = Arel::Nodes::OuterJoin.new(Schedule.arel_table, on)
Shift.joins(join).where(active: true).to_sql
You can use a SQL fragment as the argument of your joins method call:
Shift.active.joins('LEFT OUTER JOIN schedules ON schedules.occurs_on...')
You can construct a raw SQL query using Arel as follows:
# @start_date and @end_date are assumed to be set beforehand
@shift = Shift.arel_table
@schedule = Schedule.arel_table
# OuterJoin keeps shifts that have no schedules in the range
@shift.join(@schedule, Arel::Nodes::OuterJoin)
      .on(@schedule[:shift_id].eq(@shift[:id])
      .and(@schedule[:occurs_on].between(@start_date..@end_date)))
      .to_sql
The problem I'm having is like this: The model to sort is SchoolClass which has_many Students which in turn has_many Projects and each project has an end_date. I need to sort the SchoolClasses four ways: First by the earliest project end_date sort ascending and descending, and second by the latest project end_date sort ascending and descending. Does this make sense?
class SchoolClass < ActiveRecord::Base
has_many :students
end
class Student < ActiveRecord::Base
has_many :projects
belongs_to :school_class
end
class Project < ActiveRecord::Base
belongs_to :student
end
The only way I can think of doing it is very brute force and involves having methods in the SchoolClass model that return the earliest and latest project dates for that instance, like so:
students.collect(&:projects).flatten.map(&:end_date).compact.max
to find the latest project end_date for that class, and then fetching all the classes out of the database and sorting them by that method. Surely this is just awful though, right? I would really like to find the Rails way to get this ordering (with scopes maybe?). I thought something like SchoolClasses.joins(:students).joins(:projects).order('projects.end_date ASC') might work, but that will crash Rails (and looking at it now, the logic is wrong anyway, I think).
Any suggestions?
Try this:
scs = SchoolClass.joins({:students => :projects}).
select("school_classes.id,
MIN(projects.end_date) AS earliest_end_date,
MAX(projects.end_date) AS latest_end_date").
group("school_classes.id").
order("earliest_end_date ASC")
The objects in the scs array have the following attributes:
id
earliest_end_date
latest_end_date
If you need additional attributes you can do the following
1) Add the additional attributes to the group and select methods
2) Query the full SchoolClass object using the id
3) Rewrite the query to use a nested JOIN
scs = SchoolClass.joins(
  "JOIN (
     SELECT a.id,
            MIN(c.end_date) AS earliest_end_date,
            MAX(c.end_date) AS latest_end_date
     FROM school_classes a
     JOIN students b ON b.school_class_id = a.id
     JOIN projects c ON c.student_id = b.id
     GROUP BY a.id
   ) d ON d.id = school_classes.id
  ").select("school_classes.*,
             d.earliest_end_date AS earliest_end_date,
             d.latest_end_date AS latest_end_date").
  order("earliest_end_date ASC")
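To make the four orderings concrete, here is the same group / MIN / MAX / sort logic on in-memory data. It is only a sketch: the Row struct and the sample dates are made up and stand in for the joined school_classes, students and projects rows:

```ruby
require "date"

# Made-up rows standing in for the joined school_classes/students/projects data.
Row = Struct.new(:school_class, :end_date)

rows = [
  Row.new("Biology", Date.new(2024, 5, 1)),
  Row.new("Biology", Date.new(2024, 2, 1)),
  Row.new("Art",     Date.new(2024, 1, 5)),
  Row.new("Art",     Date.new(2024, 6, 30)),
  Row.new("Maths",   Date.new(2024, 1, 20))
]

# GROUP BY school class, then MIN and MAX of end_date per group.
summary = rows.group_by(&:school_class).map do |name, group|
  dates = group.map(&:end_date)
  { name: name, earliest: dates.min, latest: dates.max }
end

by_earliest_asc  = summary.sort_by { |s| s[:earliest] }.map { |s| s[:name] }
by_earliest_desc = by_earliest_asc.reverse
by_latest_asc    = summary.sort_by { |s| s[:latest] }.map { |s| s[:name] }
by_latest_desc   = by_latest_asc.reverse

puts by_earliest_asc.inspect
puts by_latest_asc.inspect
```

In the ActiveRecord versions above, the other three orderings should only need a different order clause: "earliest_end_date DESC", "latest_end_date ASC" or "latest_end_date DESC".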