Advanced Rails 3 scope with CROSS JOIN - ruby-on-rails

I want a scope which returns a list of (e.g.) books that have chapters. I found this post.
It helped me select books that have chapters. But what if I want to select those that have no chapters, like this:
class Book
  scope :long, joins(:chapters).
    select('books.id, count(chapters.id) as n_chapters').
    group('books.id').
    having('n_chapters = 0')
end
This scope returns me nothing. Can you help me out?

Using joins builds a relation from only those book+chapter combinations that are actually connected. This is done via an SQL INNER JOIN, so a book with no chapters never appears in the result and the count can never be 0. You need an OUTER JOIN if you want every book to be included (a book without chapters is paired with a NULL chapter).
replace
joins(:chapters).
with
joins('LEFT OUTER JOIN chapters ON books.id=chapters.book_id').
In this case, though, consider using an SQL NOT IN clause instead, like this:
scope :long, lambda { where('id NOT IN (%s)' % Chapter.select(:book_id).to_sql) }
It is a considerably shorter and more readable construction, and it is often faster as well.
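To see why the INNER JOIN version can never report a count of 0, here is a small plain-Ruby sketch that imitates both joins on in-memory arrays. It is not ActiveRecord code; BOOKS, CHAPTERS, and the helper methods are hypothetical names made up for illustration.

```ruby
# Hypothetical in-memory rows standing in for the books and chapters tables.
BOOKS    = [{ id: 1 }, { id: 2 }].freeze
CHAPTERS = [{ id: 10, book_id: 1 }, { id: 11, book_id: 1 }].freeze

# INNER JOIN: keeps only book/chapter pairs that actually match.
def inner_join(books, chapters)
  books.flat_map do |book|
    chapters.select { |c| c[:book_id] == book[:id] }
            .map { |c| { book_id: book[:id], chapter_id: c[:id] } }
  end
end

# LEFT OUTER JOIN: every book appears; a book without chapters gets a nil chapter_id.
def left_outer_join(books, chapters)
  books.flat_map do |book|
    matches = chapters.select { |c| c[:book_id] == book[:id] }
    if matches.empty?
      [{ book_id: book[:id], chapter_id: nil }]
    else
      matches.map { |c| { book_id: book[:id], chapter_id: c[:id] } }
    end
  end
end

# GROUP BY book_id with COUNT(chapter_id); like SQL COUNT, nil values are not counted.
def chapter_counts(rows)
  rows.group_by { |row| row[:book_id] }
      .transform_values { |group| group.count { |row| row[:chapter_id] } }
end

inner_counts = chapter_counts(inner_join(BOOKS, CHAPTERS))
outer_counts = chapter_counts(left_outer_join(BOOKS, CHAPTERS))

puts inner_counts.inspect
puts outer_counts.inspect
# Book ids whose chapter count is 0, i.e. the HAVING n_chapters = 0 filter.
puts outer_counts.select { |_, n| n.zero? }.keys.inspect
```

With the inner join, book 2 is missing from the counts entirely, so a filter on 0 has nothing to match. With the left outer join it is present with a count of 0.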

Related

Order on a associated column

I have a Model that has_many ModelItems. ModelItems belongs_to OtherModel, which has a 'code' column.
I am looking to do something like this:
Model.find(1).model_items.includes(:other_model).order("other_model.code" :desc)
I am trying to sort based on the text of that related code column.
I have even tried:
ModelItems.where(model_id: 1).includes(:other_model).order("other_model.code" :desc)
I know I need an include or join here but no matter what I do I get a variation of this error:
PG::UndefinedTable: ERROR: missing FROM-clause entry for table "other_model"
UPDATE
This is an example of how not using the real model names bites you. I had it right all along.
The include was singular and the model name in the order needed to be plural. For clarity, let's change other_model to widgit:
Model.find(1).model_items.includes(:widgit).order("widgits.code ASC")
includes in this context will execute 2 queries to create a pseudo outer join (the same thing preload does), conceptually as follows:
SELECT * FROM model_items WHERE model_items.model_id = 1
SELECT * FROM other_models WHERE other_models.id IN ([other_model_id values from the previous query])
You can enforce single query execution in a few ways:
eager_load - (ModelItems.eager_load(:other_model)) This is what includes does when the joined table is referenced in a hash finder condition or when you add references.
references - (ModelItems.includes(:other_model).references(:other_model)) This forces includes down the eager_load path.
where hash finder method - (ModelItems.includes(:other_model).where(other_models: {name: 'ABC'})) Here includes recognizes that you have placed a condition on the other_model relationship and automatically creates the join so that the query is not malformed. However, this query is written as an outer join but behaves like an inner join, which is less efficient. *
However, if you do not need the data in other_model and only want to sort by it, you can use joins (INNER JOIN) or left_joins (LEFT OUTER JOIN). These let you sort on the joined table but do not retrieve its attributes or instantiate any related objects for the other_model relationship:
ModelItems.joins(:other_model)
# OR
ModelItems.left_joins(:other_model)
* These options can also be combined. For the includes with hash finder case I always recommend ModelItems.joins(:other_model).includes(:other_model).where(other_models: { name: 'ABC'}) (INNER JOIN). This returns the same data set as ModelItems.includes(:other_model).where(other_models: {name: 'ABC'}) (LEFT OUTER JOIN), but the INNER JOIN makes it more efficient than the LEFT OUTER JOIN version.
Side note: order("other_models.code" :desc) is not valid. Either include the direction in the String or pass the String as a Hash key with a Symbol direction, e.g. order("other_models.code DESC") or order("other_models.code": :desc).
Add references, and use the plural table name in the order clause:
ModelItems
  .where(model_id: 1)
  .includes(:other_model)
  .references(:other_model)
  .order("other_models.code DESC")

Find all records that don't have any of an associated model

I'm using Rails 3.2.
I have a product model, and a variant model. A product can have many variants. A variant can belong to many products.
I want to make a lookup on the Products model, to find only products that have a specific variant count, like such (pseudocode):
Product.where("Product.variants.count == 0")
How do you do this with activerecord?
You can use a LEFT OUTER JOIN to return the records that you need. Rails issues a LEFT OUTER JOIN when you use includes.
For example:
Product.includes(:variants).where('variants.id' => nil)
That will return all products where there are no variants. You can also use an explicit joins.
Product.joins('LEFT OUTER JOIN variants ON variants.product_id = products.id').where('variants.id' => nil)
The LEFT OUTER JOIN will return records on the left side of the join, even if the right side is not present. It will place null values into the associated columns, which you can then use to check negative presence, as I did above. You can read more about left joins here: http://www.w3schools.com/sql/sql_join_left.asp.
The advantage of this solution is that it avoids a subquery in the condition, so it will most likely perform better.
products = Product.find(:all).select { |product| product.variants.count > 10 }
This uses Rails 2.3 finder syntax, but only for the ActiveRecord part. The part to look at is the select block, which loads the products and filters them in Ruby.
I don't know of any ActiveRecord way to do this but the following should help with your problem. The good thing about this solution is that everything's done on the db side.
Product.where('(SELECT COUNT(*) FROM variants WHERE variants.product_id = products.id) > 0')
If you want to pull products which have a specific non-0 number of variants, you could do that with something like this (admittedly untested):
Product.select('products.id, products.attr1_of_interest, ... products.attrN_of_interest, COUNT(*)')
  .joins('INNER JOIN variants ON products.id = variants.product_id')
  .group('products.id, products.attr1_of_interest, ... products.attrN_of_interest')
  .having('COUNT(*) = 5') # or whatever comparison you want to make here
If you want to include products with 0 variants, you would have to use Sean's solution above.

Using Named Scopes as Subqueries in Rails

In my Rails app, I want to join one table with a named scope of another table. Is there a way to do this without having to rewrite the named scope in pure SQL for my join statement?
Basically, is there a way to do something like this?
class Foo < ActiveRecord::Base
  # The lambda makes the timestamp get computed on each call, not once at class load.
  scope :updated_today, lambda { where('updated_at > ?', DateTime.now.prev_day) }
end
Bar.joins(Foo.updated_today)
Where Bar.joins generates the following SQL:
SELECT * FROM bars
INNER JOIN
  (SELECT * FROM foos WHERE updated_at > '2012-08-09') AS t0
ON bar_id = bars.id
I don't believe there's any method specifically designed for doing this. You can, however, use the to_sql method of ActiveRecord::Relation to get the full SQL Query for the scope as a string, which you can then use as a subquery in a join statement, like so:
Bar.joins("INNER JOIN (#{Foo.updated_today.to_sql}) as t0 ON bar_id = bars.id")
You can use the merge method to join scopes to another model's query:
Bar.joins(:foos).merge(Foo.updated_today)
I haven't seen a ton of documentation on this (the Rails API doesn't even have any documentation on the method itself), but here is a pretty decent blog post giving a reasonably detailed example.
Also, just noticed that this is mentioned in a RailsCast on Advanced Queries in Rails 3.
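One thing to watch when reusing a time-based scope like updated_today in a join or merge: in Rails 3, a scope defined without a lambda builds its condition once, when the class is loaded, so the timestamp stays frozen until the app restarts. The sketch below is plain Ruby, not ActiveRecord; FakeClock and eager_and_lazy are hypothetical names used only to make the difference visible.

```ruby
# A counter stands in for the system clock so the result is deterministic.
class FakeClock
  def initialize
    @tick = 0
  end

  # Each call returns the next "time".
  def now
    @tick += 1
  end
end

def eager_and_lazy(clock)
  # Evaluated once, right here: comparable to scope :name, where(..., Time.now)
  eager = clock.now
  # Evaluated on every call: comparable to scope :name, lambda { where(..., Time.now) }
  lazy = lambda { clock.now }

  { eager: [eager, eager], lazy: [lazy.call, lazy.call] }
end

result = eager_and_lazy(FakeClock.new)
puts "eager reads: #{result[:eager].join(', ')}"
puts "lazy reads:  #{result[:lazy].join(', ')}"
```

The eager value is the same on both reads, while the lambda sees a fresh value each time it is called.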

Fastest way to order by matching has many through association?

When using a has_many :through association to manage a series of tags, what is the most efficient way to order/sort the collection by the number of selected tags?
For example:
Product can have many tags through ProductTags
When a user selects the tags, I would like to order the products by the number of the selected tags each product has.
Is it possible to use a counter_cache or something similar in this case? I'm not convinced using sort is the best option. Am I correct in thinking that using order on the actual database is generally faster than sort?
Clarification/update
Sorry if the above is confusing. Basically, what I'm after is closer to ordering by relevancy. For example, a user might select tags 1, 2, and 4. If a product has all three tags associated with it, I want that product listed first. The second product might only have tags 1 & 4, and so on. I'm almost certain that this will have to use sort rather than order, but was wondering if anyone has found a more efficient way of doing this.
Ordering by relevance within the database is both possible and far more efficient than using the sort method in Ruby. Assuming the following model structure and an appropriate underlying SQL table structure:
class Product < ActiveRecord::Base
  has_many :product_taggings
  has_many :product_tags, :through => :product_taggings
end
class ProductTag < ActiveRecord::Base
  has_many :product_taggings
  has_many :products, :through => :product_taggings
end
class ProductTagging < ActiveRecord::Base
  belongs_to :product
  belongs_to :product_tag
end
Querying for relevance in MySQL would look something like:
SELECT
  `product_id`
  ,COUNT(*) AS relevance
FROM
  `product_taggings` AS ptj
LEFT JOIN
  `products` AS p
  ON p.`id` = ptj.`product_id`
LEFT JOIN
  `product_tags` AS pt
  ON pt.`id` = ptj.`product_tag_id`
WHERE
  pt.`name` IN ('Tag 1', 'Tag 2')
GROUP BY
  `product_id`
If I have the following products and related tags:
Product 1 -> Tag 3
Product 2 -> Tag 1, Tag 2
Product 3 -> Tag 1, Tag 3
Then the query above should net me:
product_id | relevance
----------------------
2 | 2
3 | 1
* Product 1 is not included since there were no matches.
Given that the user is performing a filtered search,
this behavior is probably fine. There's a way to get
Product 1 into the results with 0 relevance if
necessary.
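As a cross-check on the table above, here is the same relevance count worked out in plain Ruby on the three example products. This is only an illustration of what the grouped query computes, not a suggestion to do the ranking in Ruby; PRODUCT_TAGS and relevance are hypothetical names.

```ruby
# Hypothetical in-memory version of the three example products and their tags.
PRODUCT_TAGS = {
  1 => ["Tag 3"],
  2 => ["Tag 1", "Tag 2"],
  3 => ["Tag 1", "Tag 3"]
}.freeze

# Returns [product_id, relevance] pairs, most relevant first,
# skipping products that match none of the selected tags.
def relevance(product_tags, selected)
  product_tags
    .map { |product_id, tags| [product_id, (tags & selected).size] }
    .reject { |_, score| score.zero? }
    .sort_by { |product_id, score| [-score, product_id] }
end

relevance(PRODUCT_TAGS, ["Tag 1", "Tag 2"]).each do |product_id, score|
  puts "#{product_id} | #{score}"
end
```

This prints product 2 with relevance 2 and product 3 with relevance 1, and leaves out product 1. The grouped SQL query above produces the same product_id/relevance pairs inside the database.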
What you've done is create a nice little result set that can act as a sort of inline join table. In order to stick a relevance score onto each row of a query from your products table, use this query as a subquery as follows:
SELECT *
FROM
  `products` AS p
  ,(SELECT
      `product_id`
      ,COUNT(*) AS relevance
    FROM
      `product_taggings` AS ptj
    LEFT JOIN
      `products` AS p
      ON p.`id` = ptj.`product_id`
    LEFT JOIN
      `product_tags` AS pt
      ON pt.`id` = ptj.`product_tag_id`
    WHERE
      pt.`name` IN ('Tag 1', 'Tag 2')
    GROUP BY `product_id`
  ) AS r
WHERE
  p.`id` = r.`product_id`
ORDER BY
  r.`relevance` DESC
What you'll have is a result set containing the fields from your products table and an additional relevance column at the end that will then be used in the ORDER BY clause.
You'll need to write up a method that will in-fill this query with your desired pt.name IN list. Be certain to sanitize that list before plugging it into the query or you'll open yourself up to possible SQL injection.
Take the result of your query assembling method and run it through Product.find_by_sql(my_relevance_sql) to get your models pre-sorted by relevance directly from the DB.
The obvious down-side is that you introduce a DBMS-specific dependency into your Rails code (and risk SQL injection if you're not careful). If you're not using MySQL, the syntax might need to be adapted. However, it should perform much faster, especially on a huge result set, than using a Ruby sort on the results. Furthermore, adding a LIMIT clause will give you pagination support if needed.
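One way to assemble the query without interpolating tag names is to emit one ? placeholder per tag and pass the names as bind values, since find_by_sql also accepts an array of the SQL string followed by its bind values. The following is a minimal sketch under that assumption; relevance_query is a hypothetical helper name, and the SQL is a variant of the query above rewritten with explicit INNER JOINs rather than the exact original text.

```ruby
# Builds [sql_with_placeholders, *bind_values] for use with find_by_sql.
# The tag names never become part of the SQL string itself.
def relevance_query(tag_names)
  raise ArgumentError, "tag_names must not be empty" if tag_names.empty?

  # One "?" per tag, e.g. "?, ?, ?" for three tags.
  placeholders = Array.new(tag_names.size, "?").join(", ")

  sql = <<~SQL
    SELECT p.*, r.relevance
    FROM products AS p
    INNER JOIN (
      SELECT ptj.product_id, COUNT(*) AS relevance
      FROM product_taggings AS ptj
      INNER JOIN product_tags AS pt ON pt.id = ptj.product_tag_id
      WHERE pt.name IN (#{placeholders})
      GROUP BY ptj.product_id
    ) AS r ON p.id = r.product_id
    ORDER BY r.relevance DESC
  SQL

  [sql, *tag_names]
end

sql, *binds = relevance_query(["Tag 1", "Tag 2"])
puts sql
puts binds.inspect
```

In a Rails app the result would be passed along as Product.find_by_sql(relevance_query(selected_tag_names)), leaving the quoting of each value to the database adapter.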
Building on Ryan's excellent answer, I wanted a method that could be used with acts-as-taggable-on and similar plug-ins (tables called tags/taggings), and ended up with this:
def Product.find_by_tag_list(tag_list)
  # Quote each tag through the database connection so it is safe to interpolate.
  tag_list_sql = tag_list.map { |tag| connection.quote(tag) }.join(',')
  Product.find_by_sql("SELECT * FROM products, (SELECT taggable_id, COUNT(*) AS relevance FROM taggings LEFT JOIN tags ON tags.id = taggings.tag_id WHERE tags.name IN (" + tag_list_sql + ") GROUP BY taggable_id) AS r WHERE products.id = r.taggable_id ORDER BY r.relevance DESC;")
end
To get a list of related products ordered by relevance, I then can do:
Product.find_by_tag_list(my_product.tag_list)

rails select and include

Can anyone explain this?
Project.includes([:user, :company])
This executes 3 queries, one to fetch projects, one to fetch users for those projects and one to fetch companies.
Project.select("name").includes([:user, :company])
This executes 3 queries, and completely ignores the select bit.
Project.select("user.name").includes([:user, :company])
This executes 1 query with proper left joins. And still completely ignores the select.
It would seem to me that rails ignores select with includes. Ok fine, but why when I put a related model in select does it switch from issuing 3 queries to issuing 1 query?
Note that the 1 query is what I want. I just can't imagine this is the right way to get it, nor why it works, but I'm not sure how else to get the results in one query (.joins seems to only use INNER JOIN, which I do not in fact want, and when I manually specify the join conditions to .joins, the search gem we're using freaks out as it tries to re-add joins with the same name).
I had the same problem with select and includes.
For eager loading of associated models I used the native Rails query method preload: http://apidock.com/rails/ActiveRecord/QueryMethods/preload
It eager loads the associations without dropping the select from the scope chain.
I found it here https://github.com/rails/rails/pull/2303#issuecomment-3889821
Hope this tip will be helpful for someone as it was helpful for me.
All right, so here's what I came up with...
.joins("LEFT JOIN companies companies2 ON companies2.id = projects.company_id LEFT JOIN project_types project_types2 ON project_types2.id = projects.project_type_id LEFT JOIN users users2 ON users2.id = projects.user_id") \
.select("six, fields, I, want")
Works, pain in the butt but it gets me just the data I need in one query. The only lousy part is I have to give everything a model2 alias since we're using meta_search, which seems to not be able to figure out that a table is already joined when you specify your own join conditions.
Rails has always ignored the select argument(s) when using include or includes. If you want to use your select argument then use joins instead.
You might be having a problem with the query gem you're talking about but you can also include sql fragments using the joins method.
Project.select("name").joins(['some sql fragment for users', 'left join companies c on c.id = projects.company_id'])
I don't know your schema so I'd have to guess at the exact relationships, but this should get you started.
I might be totally missing something here but select and include are not a part of ActiveRecord. The usual way to do what you're trying to do is like this:
Project.find(:all, :select => "users.name", :include => [:user, :company], :joins => "LEFT JOIN users on projects.user_id = users.id")
Take a look at the api documentation for more examples. Occasionally I've had to go manual and use find_by_sql:
Project.find_by_sql("select users.name from projects left join users on projects.user_id = users.id")
Hopefully this will point you in the right direction.
I wanted that functionality myself, so please use it.
Include this method in your class
#ACCEPTS args in string format "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME"
def self.includes_with_select(*m)
  association_arr = []
  m.each do |part|
    parts = part.split(':')
    association = parts[0].to_sym
    select_columns = parts[1].split('-')
    association_macro = (self.reflect_on_association(association).macro)
    association_arr << association.to_sym
    class_name = self.reflect_on_association(association).class_name
    self.send(association_macro, association, -> {select *select_columns}, class_name: "#{class_name.to_sym}")
  end
  self.includes(*association_arr)
end
You can then call it like Contract.includes_with_select('user:id-name-status', 'confirmation:confirmed-id'), and it will select only the specified columns.
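The argument format is the least obvious part of that method, so here is its parsing step on its own as a plain-Ruby sketch. parse_include_spec is a hypothetical helper name, not part of the method above or of Rails, and the argument check is an addition for illustration.

```ruby
# Splits "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME" into
# [association_symbol, [column_symbols]].
def parse_include_spec(spec)
  association, columns = spec.split(":", 2)
  if columns.nil? || columns.empty?
    raise ArgumentError, "expected ASSOCIATION:COL-COL, got #{spec.inspect}"
  end

  [association.to_sym, columns.split("-").map(&:to_sym)]
end

association, columns = parse_include_spec("user:id-name-status")
puts "association: #{association}"
puts "columns: #{columns.join(', ')}"
```

For 'user:id-name-status' this yields the association user with the columns id, name, and status. Because - is the separator, the format cannot express a column name that itself contains a hyphen.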
The preload solution doesn't seem to do the same JOINs as eager_load and includes, so to get the best of all worlds I also wrote my own, and released it as a part of a data-related gem I maintain, The Brick.
By overriding ActiveRecord::Associations::JoinDependency.apply_column_aliases(), a .select(...) that you add can act as a filter that chooses which column aliases get built out.
With gem 'brick' loaded, in order to enable this selective behaviour, add the special column name :_brick_eager_load as the first entry in your .select(...), which turns on the filtering of columns while the aliases are being built out. Here's an example:
Employee.includes(orders: :order_details)
        .references(orders: :order_details)
        .select(:_brick_eager_load,
                'employees.first_name', 'orders.order_date', 'order_details.product_id')
Because foreign keys are essential to have everything be properly associated, they are automatically added, so you do not need to include them in your select list.
Hope it can save you both query time and some RAM!
