Is there any difference between includes(:associations).references(:associations) and eager_load(:associations) in Ruby on Rails 5? - ruby-on-rails

It seems includes(:associations).references(:associations) and eager_load(:associations) execute exactly the same SQL (LEFT OUTER JOIN) in Rails 5. So when do I need to use includes(:associations).references(:associations) syntax?
For example,
Parent.includes(:children1, :children2).references(:children1).where('(some conditions of children1)')
can be converted to
Parent.eager_load(:children1).preload(:children2).where('(some conditions of children1)')
I think the latter (query using eager_load and preload) is simpler and looks better.
UPDATE
I found strange behavior in my environment (Rails 5.2.4.3).
Even when I include several associations and reference only one of them, all the included associations are LEFT OUTER JOINed.
For example,
Parent.includes(:c1, :c2, :c3).references(:c1).to_sql
generates SQL that LEFT OUTER JOINs all of c1, c2, and c3.
I expected it to join only c1.

Indeed, includes + references ends up being the same as eager_load. Like many things in Rails, you have a few ways of accomplishing the same result, and here you are witnessing that first hand. If I were writing them in a single statement, I would always prefer eager_load because it is more explicit and it is a single function call.
I would also prefer eager_load because I consider references a kind of hack. It says to the SQL generator "Hey, I am referring to this object in a way that you would not otherwise detect, so include it in the JOIN statement" and is generally used when you use a String to pass a SQL fragment as part of the query.
The only time I would use the includes(:associations).references(:associations) syntax is when it was an artifact needed to make the query work and not a statement of intent. The Rails Guide gives this good example:
Article.includes(:comments).where("comments.visible = true").references(:comments)
As for why referencing 1 association causes 3 to be JOIN'ed, I do not know for sure. By default, includes loads each association with a separate query (as preload does) and switches to a single LEFT OUTER JOIN query (as eager_load does) when the query references an included table. That switch appears to apply to the whole relation rather than to each association, which would explain why referencing only c1 also joins c2 and c3. I was surprised to find how often it is faster to use 2 separate queries.
I would not in general use preload unless I had a strong reason to believe that it was faster than a JOIN. I would just use includes alone and let Rails decide.
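The behavior reported in the question's UPDATE can be modelled in a few lines of plain Ruby. This is a simplified sketch, not Rails source code: plan_loading and its keyword arguments are hypothetical names, and the rule "any reference switches every included association to a JOIN" is an assumption based on the observed SQL. It ignores other triggers, such as hash conditions on an included table.

```ruby
# Simplified model of how a relation chooses between JOIN loading and
# separate queries. Hypothetical helper, NOT the Rails implementation.
def plan_loading(includes: [], eager_load: [], preload: [], references: [])
  # Assumption: one reference is enough to switch ALL included associations.
  switch_to_join = eager_load.any? || (includes.any? && references.any?)

  if switch_to_join
    { joined: (eager_load + includes).uniq, separate_queries: preload.uniq }
  else
    { joined: [], separate_queries: (includes + preload).uniq }
  end
end

# Parent.includes(:c1, :c2, :c3).references(:c1)
all_joined = plan_loading(includes: [:c1, :c2, :c3], references: [:c1])

# Parent.eager_load(:c1).preload(:c2, :c3)
split = plan_loading(eager_load: [:c1], preload: [:c2, :c3])

# Parent.includes(:c1, :c2, :c3) with nothing referenced
no_join = plan_loading(includes: [:c1, :c2, :c3])
```

Under this model, the only way to join c1 while loading c2 and c3 with separate queries is the explicit eager_load plus preload form from the question.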

Related

ActiveRecord: How to find parents whose ALL children match a condition?

Suppose I have a Parent model that has many Child, and that Child also belongs to OtherParent.
How can I find all Parents where ALL of their Children belong to some OtherParent?
In pure SQL I could do
Parent.find_by_sql(<<SQL)
SELECT *
FROM parents p
WHERE NOT EXISTS (
SELECT *
FROM children
WHERE parent_id = p.id
AND other_parent_id IS NULL
)
SQL
(from here), but I'd prefer to do it by taking advantage of ActiveRecord if possible.
Thanks!
I'm using Rails 4.2.1 and PostgreSQL 9.3
Using arel can get you pretty far. The tricky part is how do you not write your entire query using arel's own query syntax?
Here's a trick: when building your query using where, if you use arel conditions, you get some extra methods for free. For instance, you can tail the subquery you have there with .exists.not, which will get you a (NOT ( EXISTS (subquery))). Toss that into parent's where-clause and you're set.
The question is, how do you reference the tables involved? You need Arel for that. You could use Arel's where with its ugly conditions like a.eq b. But why? Since it's an equality condition, you can use Rails' conditions instead! You can reference the table you're querying with a hash key, but for the other table (in the outer query) you can use its arel_table. Watch this:
parents = Parent.arel_table
Parent.where(
Child.where(other_parent_id: nil, parent_id: parents[:id]).exists.not
)
You can even reduce Arel usage by resorting to strings a little and relying on the fact that you can feed in subqueries as parameters to Rails' where. It doesn't gain you much here, but it saves you from digging into Arel's methods, and the same trick works for other SQL operators that take a subquery (are there even any others?):
parents = Parent.arel_table
Parent.where('NOT EXISTS (?)',
Child.where(parent_id: parents[:id], other_parent_id: nil)
)
The two main points here are:
You can build subqueries just the same way you are used to building regular queries, referencing the outer query's table with Arel. It may not even be a real table, it may be an alias! Crazy stuff.
You can use subqueries as parameters for Rails' where method just fine.
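To make the NOT EXISTS condition concrete, here is a plain-Ruby model of the set that query selects. It uses Structs instead of database tables. ParentRow, ChildRow, the helper name, and the sample data are all hypothetical and stand in for the parents and children tables.

```ruby
# In-memory stand-ins for rows of the parents and children tables.
ParentRow = Struct.new(:id)
ChildRow  = Struct.new(:parent_id, :other_parent_id)

# Keep a parent unless there EXISTS a child of it with no other_parent_id.
def parents_with_all_children_assigned(parents, children)
  parents.reject do |parent|
    children.any? do |child|
      child.parent_id == parent.id && child.other_parent_id.nil?
    end
  end
end

parents = [ParentRow.new(1), ParentRow.new(2), ParentRow.new(3)]
children = [
  ChildRow.new(1, 10), ChildRow.new(1, 11), # parent 1: every child assigned
  ChildRow.new(2, 10), ChildRow.new(2, nil) # parent 2: one child unassigned
]                                           # parent 3: no children at all

result_ids = parents_with_all_children_assigned(parents, children).map(&:id)
```

Note that parent 3, which has no children, is kept: NOT EXISTS is vacuously true for it. If childless parents should be excluded, the query needs an additional condition.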
Using the exceptional scuttle, you can translate arbitrary SQL into Ruby (ActiveRecord and Arel queries).
From that tool, your query converts to
Parent.select(Arel.star).where(
Child.select(Arel.star).where(
Child.arel_table[:parent_id].eq(Parent.arel_table[:id]).and(Child.arel_table[:other_parent_id].eq(nil))
).ast
)
Breaking the query down:
Parent.select(Arel.star) will query for all columns in the Parent table.
Child.arel_table brings you into Arel-world, giving you a little more power in generating your query from Ruby. Specifically, Child.arel_table[:parent_id] gives you a handle onto an Arel::Attributes::Attribute that you can continue to use while building a query.
The .eq and .and methods do exactly what you would expect, letting you build a query of arbitrary depth and complexity.
Not necessarily "cleaner", but entirely within Ruby, which is nice.
Given Parent and Child, and child is in a belongs_to relationship with OtherParent (Rails defaults assumed):
Parent.joins(:children).where('other_parent_id = ?', other_parent_id)

RoR/Squeel - How do I use Squeel::Nodes::Join/Predicates?

I just recently inherited a project where the previous developer used Squeel.
I've been studying Squeel for the past day now and know a bit about how to use it from what I could find online. The basic use of it is simple enough.
What I haven't been able to find online (except for on ruby-doc.org, which didn't give me much), is how to use Squeel::Nodes::Join and Squeel::Nodes::Predicate.
The only thing I've been able to find out is that they are nodes representing join associations / predicate expressions, which I had already figured. What I still don't know is how to use them.
Can someone help me out or point me toward a good tutorial/guide?
I might as well answer this since I was able to figure out quite a bit through trial and error and by using ruby-doc as a guide. Nothing I say here is a definitive description of these classes. It's just what I know, in case it helps anyone else who is stuck making dynamic queries with Squeel.
Squeel::Nodes::Stub
Let's actually start with Squeel::Nodes::Stub. This is a Squeel object that can take either a symbol or a string and can convert it into the name of a table or column. So you can create a new Squeel::Nodes::Stub.new("value") or Squeel::Nodes::Stub.new(:value) and use this stub in other Squeel nodes. You'll see examples of it being used below.
Squeel::Nodes::Join
Squeel::Nodes::Join acts just like you might suspect. It is essentially a variable you can pass in to a Squeel joins{} that will then perform the join you want. You give it a stub (with a table name), and you can also give it another variable to change the type of join (I only know how to change it to outer join at the moment). You create one like so:
Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)
The stub is used to let the Join know we want to join the custom_fields table, and the Arel::OuterJoin is just to let the Join know we want to do an outer join. Again, you don't have to put a second parameter into Squeel::Nodes::Join.new(), and I think it will default to performing an inner join. You can then join this to a model:
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)}
Squeel::Nodes::Predicate
Squeel::Nodes::Predicate may seem pretty obvious at this point. It's just a comparison. You give it a stub (with a column name), a method of comparison (you can find them all in the Predicates section on Squeel's github) and a value to compare with, like so:
Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value), :eq, 5)
You can even AND or OR two of them together pretty easily.
AND: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) & Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
OR: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
These will return either a Squeel::Nodes::And or a Squeel::Nodes::Or with the nested Squeel::Nodes::Predicates.
Then you can put it all together like this (of course you'd probably have the joins in a variable, a, and the predicates in a variable b, because you are doing this dynamically, otherwise you should probably be using regular Squeel instead of Squeel nodes):
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)}
  .where{Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)}
I unfortunately could not figure out how to do subqueries though :(

How do you use the postgresql WITH in activerecord?

In PostgreSQL one can read data from a "with" (a common table expression). I want to know how to make use of that in Rails without putting the whole query in raw SQL.
Here's a sample query: it is totally contrived for this question.
with tasks as (select 1 as score, tasks.* from tasks)
select 1 from tasks where id > 10 order by tasks.score, tasks.id
In my real example score is calculated not just 1 but for the example it works.
Here's how I imagine the code would look
Task.with('tasks as (select 1 as score, tasks.* from tasks)')
.where('id > 10')
.order(:score)
.order(:id)
I don't really like using the "with" because it is PG specific, but I really need to sort on the calculated value. I tried a view but the creation of the view in PG requires the exact fields, and I don't want other coders to have to alter the view when they alter the source table.
I do really want to be able to chain this.
I don't believe this is supported in pure ActiveRecord without dropping down to raw SQL.
However, there is an add-on, postgres_ext which, among other things, adds CTE support for use with ActiveRecord. I haven't used this add-on myself (I would prefer to drop down into raw SQL in this situation), but it looks like it would allow the chaining behavior you are looking for.
Once you've installed that, you'd want to use its from_cte method to define the CTE (the WITH statement), and then chain however you'd like to filter, sort, etc.
So I think the code would look something like this:
Task.from_cte('tasks', Task.where('id > 10')).order('tasks.score, tasks.id')
Note that the chaining starts after the from_cte call has completed on the ActiveRecord::Relation object that results.
Edit in response to comment from OP:
Well, you could add more to the chaining within the from_cte, I think -- it really depends on what you want the CTE to be. You could certainly also filter by user in the where method such that the CTE just contained a single user's tasks.

In Lucene/Solr what is the difference between Join and BlockJoin?

Join is described as pseudo-Join, because it's more equivalent to an SQL inner-query.
Whereas BlockJoin is described as more like a SQL join but requiring a sophisticated indexing schema, one that anticipates all the possible joins you'd want to make.
Could someone explain the difference between these features in terms of how to implement them at index time and query time. And what are the implications for performance?
I don't think BlockJoinQuery is a Solr function. I think it's a Lucene feature.
The Solr join doesn't score documents in the from query and it doesn't return combined results, so it's best used as a filter query. This allows the main query to score.
Block join, on the other hand, does use scoring and returns both results (not 100% sure).
You can also use a query-time join. This has several scoring options. It is also a Lucene feature but doesn't require special indexing blocks. I've used it in combination with a Solr query parser plugin. The performance is a bit lower than block join, but it works.
I have only used the Solr join and the query-time join, so I can't really say much about block join.
As I understand it, BlockJoin is for joining against nested/child documents within the same core. Join is for joining against a separate core.

Do ActiveRecord's statements cover the entire scope of what is possible in raw sql?

In other words, can any complex sql statement (non-db specific SQL code) be broken up into constituent ActiveRecord statements? For the sake of the argument, I am not considering performance or multiple calls to the database (which could of course be avoided with raw SQL).
No. While Active Record does most abstractions fairly well, some calls are database specific and cannot be abstracted like you mentioned. Others just simply cannot be represented. Something like the SQL CASE call is an example of code I couldn't reconstruct with Active Record. On a reasonably large dataset (~30000), looping was not possible as it took upwards of 20 seconds to run compared to the speed of SQL.
SELECT t.price_range AS price_range, count(*) as total FROM
(
SELECT
CASE
WHEN (price >= '0.00' AND price < '25.00') THEN '0-25'
WHEN (price >= '25.00' AND price < '50.00') THEN '25-50'
ELSE '50+'
END AS price_range
FROM products p
RIGHT JOIN
product_categories pc
ON
p.id = pc.id
) t group by t.price_range
I would suggest using the docs and some judgment to make an informed decision about when to use SQL.
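For readers unfamiliar with CASE, the bucketing and counting that the query above performs can be sketched in plain Ruby on an in-memory array. This only illustrates the logic: the helper names and sample prices are hypothetical, and as noted above, looping in Ruby over a large table was far too slow, which is the reason for doing it in SQL.

```ruby
# Mirrors the CASE expression: anything outside the first two ranges,
# including a missing (NULL) price, falls through to the ELSE branch.
def price_range(price)
  return '50+' if price.nil?

  if price >= 0 && price < 25
    '0-25'
  elsif price >= 25 && price < 50
    '25-50'
  else
    '50+'
  end
end

# Mirrors GROUP BY price_range with count(*).
def count_by_price_range(prices)
  prices.each_with_object(Hash.new(0)) do |price, totals|
    totals[price_range(price)] += 1
  end
end

totals = count_by_price_range([5.0, 24.99, 25.0, 49.5, 50.0, nil])
```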
Re: Do ActiveRecord's statements cover the entire scope of what is possible in raw sql?
Definitely not. E.g. UNION statements, etc. But there is a good workaround: create your complicated SQL as a SQL View. Then use ActiveRecord to access the View.
Depending on the dbms and the SQL for the View, the AR Model may or may not need to be marked as ReadOnly.
