RoR/Squeel - How do I use Squeel::Nodes::Join/Predicates? - ruby-on-rails

I just recently inherited a project where the previous developer used Squeel.
I've been studying Squeel for the past day now and know a bit about how to use it from what I could find online. The basic use of it is simple enough.
What I haven't been able to find online (except for on ruby-doc.org, which didn't give me much), is how to use Squeel::Nodes::Join and Squeel::Nodes::Predicate.
The only thing I've been able to find out is that they are nodes representing join associations / predicate expressions, which I had figured as much. What I still don't know is how to use them.
Can someone help me out or point me toward a good tutorial/guide?

I might as well answer this, since I was able to figure out quite a bit through trial and error and by using ruby-doc as a guide. Nothing I say here is a definitive description of these classes. It's just what I know, in case it helps anyone else who is stuck building dynamic queries with Squeel.
Squeel::Nodes::Stub
Let's actually start with Squeel::Nodes::Stub. This is a Squeel object that takes either a symbol or a string and converts it into the name of a table or column. So you can create a new Squeel::Nodes::Stub.new("value") or Squeel::Nodes::Stub.new(:value) and use this stub in other Squeel nodes. You'll see examples of it being used below.
Squeel::Nodes::Join
Squeel::Nodes::Join acts just like you might suspect. It is essentially a variable you can pass in to a Squeel joins{} that will then perform the join you want. You give it a stub (with a table name), and you can also give it another variable to change the type of join (I only know how to change it to outer join at the moment). You create one like so:
Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)
The stub is used to let the Join know we want to join the custom_fields table, and the Arel::OuterJoin is just to let the Join know we want to do an outer join. Again, you don't have to put a second parameter into Squeel::Nodes::Join.new(), and I think it will default to performing an inner join. You can then join this to a model:
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)}
Squeel::Nodes::Predicate
Squeel::Nodes::Predicate may seem pretty obvious at this point. It's just a comparison. You give it a stub (with a column name), a method of comparison (you can find them all in the Predicates section on Squeel's github) and a value to compare with, like so:
Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value), :eq, 5)
You can even AND or OR two of them together pretty easily.
AND: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) & Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
OR: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
These will return either a Squeel::Nodes::And or a Squeel::Nodes::Or with the nested Squeel::Nodes::Predicates.
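Since the point of using nodes directly is to build queries dynamically, it may help to see how `&` and `|` combine a list of predicates whose length is only known at runtime. The following is a minimal sketch in plain Ruby using toy stand-in classes (the `ToyNodes` module and the `filters` hash are hypothetical and are not part of Squeel), so it runs without the gem. The `reduce(:&)` / `reduce(:|)` idiom should carry over to real Squeel predicates, on the assumption that they respond to `&` and `|` as described above.

```ruby
# Toy stand-ins for illustration only; these are NOT Squeel's real classes.
module ToyNodes
  # Mixed into every node so predicates and compound nodes can be chained.
  module Operators
    def &(other)
      And.new(self, other)
    end

    def |(other)
      Or.new(self, other)
    end
  end

  Predicate = Struct.new(:column, :comparison, :value) do
    include Operators

    def to_s
      "#{column} #{comparison} #{value.inspect}"
    end
  end

  And = Struct.new(:left, :right) do
    include Operators

    def to_s
      "(#{left} AND #{right})"
    end
  end

  Or = Struct.new(:left, :right) do
    include Operators

    def to_s
      "(#{left} OR #{right})"
    end
  end
end

# Hypothetical filters, e.g. assembled from request params at runtime.
filters = { value1: 5, value2: 10, value3: 15 }

# One predicate per filter entry.
predicates = filters.map do |column, value|
  ToyNodes::Predicate.new(column, :eq, value)
end

any_match = predicates.reduce(:|) # OR all predicates together
all_match = predicates.reduce(:&) # AND all predicates together

puts any_match
puts all_match
```

The result is a nested tree (an Or containing an Or, and so on), which mirrors the nested And/Or nodes described above.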
Then you can put it all together like this (of course you'd probably have the joins in a variable, a, and the predicates in a variable b, because you are doing this dynamically, otherwise you should probably be using regular Squeel instead of Squeel nodes):
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)}
  .where{Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)}
I unfortunately could not figure out how to do subqueries though :(

Related

Is there any difference between includes(:associations).references(:associations) and eager_load(:associations) in Ruby on Rails 5?

It seems includes(:associations).references(:associations) and eager_load(:associations) execute exactly the same SQL (LEFT OUTER JOIN) in Rails 5. So when do I need to use includes(:associations).references(:associations) syntax?
For example,
Parent.includes(:children1, :children2).references(:children1).where('(some conditions of children1)')
can be converted to
Parent.eager_load(:children1).preload(:children2).where('(some conditions of children1)')
I think the latter (query using eager_load and preload) is simpler and looks better.
UPDATE
I found a strange behavior in my environment (rails 5.2.4.3).
Even when I include several associations and reference only one of them, all the associations I included are LEFT OUTER JOINed.
For example,
Parent.includes(:c1, :c2, :c3).references(:c1).to_sql
generates SQL that LEFT OUTER JOINs all of c1, c2, and c3.
I thought it joins only c1.
Indeed, includes + references ends up being the same as eager_load. Like many things in Rails, you have a few ways of accomplishing the same result, and here you are witnessing that first hand. If I were writing them in a single statement, I would always prefer eager_load because it is more explicit and it is a single function call.
I would also prefer eager_load because I consider references a kind of hack. It says to the SQL generator "Hey, I am referring to this object in a way that you would not otherwise detect, so include it in the JOIN statement" and is generally used when you use a String to pass a SQL fragment as part of the query.
The only time I would use the includes(:associations).references(:associations) syntax is when it was an artifact needed to make the query work and not a statement of intent. The Rails Guide gives this good example:
Article.includes(:comments).where("comments.visible = true").references(:comments)
As for why referencing 1 association causes 3 to be JOIN'ed, I do not know for sure. The code for includes uses heuristics to decide when it is faster to use a JOIN and when it is faster to use 2 separate queries, the first to retrieve the parent and the second to retrieve the associated objects. I was surprised to find how often it is faster to use 2 separate queries. It may be that since the query has to use 1 join anyway, the algorithm figures it will be faster to use 1 big join rather than 3 queries, or it may in general think 1 join is faster than 4 queries.
I would not in general use preload unless I had a strong reason to believe that it was faster than join. I would just use includes alone and let the algorithm decide.

ActiveRecord: How to find parents whose ALL children match a condition?

Suppose I have a Parent model that has many Child, and that Child also belongs to OtherParent.
How can I find every Parent for which ALL of its Child records belong to some OtherParent?
In pure SQL I could do
Parent.find_by_sql(<<SQL)
  SELECT *
  FROM parents p
  WHERE NOT EXISTS (
    SELECT *
    FROM children
    WHERE parent_id = p.id
      AND other_parent_id IS NULL
  )
SQL
(from here), but I'd prefer to do it by taking advantage of ActiveRecord if possible.
Thanks!
I'm using Rails 4.2.1 and PostgreSQL 9.3
Using arel can get you pretty far. The tricky part is how do you not write your entire query using arel's own query syntax?
Here's a trick: when building your query using where, if you use arel conditions, you get some extra methods for free. For instance, you can tail the subquery you have there with .exists.not, which will get you a (NOT ( EXISTS (subquery))) Toss that into parent's where-clause and you're set.
The question is, how do you reference the tables involved? You need Arel for that. You could use Arel's where with its ugly conditions like a.eq b. But why? Since it's an equality condition, you can use Rails' conditions instead! You can reference the table you're querying with a hash key, but for the other table (in the outer query) you can use its arel_table. Watch this:
parents = Parent.arel_table
Parent.where(
  Child.where(other_parent_id: nil, parent_id: parents[:id]).exists.not
)
You can even reduce Arel usage by resorting to strings a little and relying on the fact that you can feed in subqueries as parameters to Rails' where. There is not much use to it, but it doesn't force you to dig into Arel's methods too much, so you can use that trick or other SQL operators that take a subquery (are there even any others?):
parents = Parent.arel_table
Parent.where('NOT EXISTS (?)',
  Child.where(parent_id: parents[:id], other_parent_id: nil)
)
The two main points here are:
You can build subqueries just the same way you are used to building regular queries, referencing the outer query's table with Arel. It may not even be a real table, it may be an alias! Crazy stuff.
You can use subqueries as parameters for Rails' where method just fine.
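To make the NOT EXISTS semantics concrete, here is a small plain-Ruby sketch that applies the same condition to in-memory structs instead of database rows. The structs, the sample data, and the method name `parents_without_orphan_children` are made up for illustration. It is not a replacement for the query, only a way to see which parents it keeps.

```ruby
# In-memory stand-ins for the two tables; not ActiveRecord models.
Parent = Struct.new(:id)
Child  = Struct.new(:parent_id, :other_parent_id)

# Keeps a parent when NO child of it has a nil other_parent_id,
# which is what the NOT EXISTS subquery expresses.
def parents_without_orphan_children(parents, children)
  parents.select do |parent|
    children.none? do |child|
      child.parent_id == parent.id && child.other_parent_id.nil?
    end
  end
end

parents = [Parent.new(1), Parent.new(2), Parent.new(3)]
children = [
  Child.new(1, 10),  # parent 1: every child has an other parent
  Child.new(1, 11),
  Child.new(2, 10),
  Child.new(2, nil), # parent 2: one child has no other parent
]                    # parent 3: no children at all

result_ids = parents_without_orphan_children(parents, children).map(&:id)
puts result_ids.inspect
```

Note that parent 3, which has no children, is kept as well: "all children match" is vacuously true for an empty set, and NOT EXISTS behaves the same way. If childless parents should be excluded, the query needs an additional condition.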
Using the exceptional scuttle, you can translate arbitrary SQL into ruby (ActiveRecord and ARel queries)
From that tool, your query converts to
Parent.select(Arel.star).where(
  Child.select(Arel.star).where(
    Child.arel_table[:parent_id].eq(Parent.arel_table[:id]).and(Child.arel_table[:other_parent_id].eq(nil))
  ).ast
)
Splitting up the query:
Parent.select(Arel.star) will query for all columns in the Parent table.
Child.arel_table brings you into Arel-world, allowing you a little bit more power in generating your query from ruby. Specifically, Child.arel_table[:parent_id] gives you a handle onto an Arel::Attributes::Attribute that you can continue to use while building a query.
The .eq and .and methods do exactly what you would expect, letting you build a query of arbitrary depth and complexity.
Not necessarily "cleaner", but entirely within ruby, which is nice.
Given Parent and Child, and child is in a belongs_to relationship with OtherParent (Rails defaults assumed):
Parent.joins(:children).where('other_parent_id = ?', other_parent_id)

ActiveRecord .joins breaking other queries

I'm writing a Rails API on top of a legacy database with tons of tables. The search feature gives users the ability to query around 20 separate columns spread across 13 tables. I have a number of queries that check the params to see if they need to return results. They look like this:
results << Company.where('city LIKE ?', "#{params[:city]}").select('id') unless params[:city].blank?
and they work fine. However, I just added another query that looks like this:
results << Company.joins("JOIN Contact ON Contact.company_id = Company.id").where("Contact.first_name LIKE ?", "%#{params[:first_name]}%").select('company_id') unless params[:first_name].blank?
and suddenly my first set of queries started returning null, rather than the list of IDs they had been returning. The query with the join works perfectly well whether the other queries are functional or not. When I comment the join query out, the previous queries start working again. Is there some reason the query with a join would break other queries on the page?
I can't think of a particular reason why the join would be breaking your previous queries. However, I do have some suggestions for your query overall.
Assuming you've modelled these relationships correctly you shouldn't need to define the join manually. On another note, you're not querying against the company at all so you can use an includes instead of a join - this will allow you to access its data without firing another query.
If you wanted to access company data (ie. query.company.name) use an includes like so:
Contact.includes(:company).where('first_name LIKE ?', param).select(:company_id).distinct
However, it appears all you really want is an array of IDs (which exist on the contact model). Because of this, you can lighten things up and not include the company at all.
Contact.where('first_name LIKE ?', param).select(:company_id).distinct
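The `param` value in the queries above is left undefined. A minimal sketch of how it could be built from user input follows, assuming a "contains" match is wanted and that backslash is the database's LIKE escape character. The helper name `like_pattern` is hypothetical. Inside a Rails app, the built-in `sanitize_sql_like` on ActiveRecord models is the usual way to do the escaping step.

```ruby
# Builds a "contains" pattern for a LIKE query, escaping the LIKE
# wildcards (% and _) and the escape character itself so that user
# input is matched literally. Assumes backslash is the escape character.
def like_pattern(term)
  escaped = term.to_s.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
  "%#{escaped}%"
end

puts like_pattern("Ann")     # plain input is only wrapped in wildcards
puts like_pattern("50%_off") # wildcards in the input are escaped
```

The pattern is still passed as a bound parameter (the `?` placeholder), so escaping here only concerns wildcard characters, not SQL injection.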
Whenever you get stuck, never forget to check out the great resources over at http://api.rubyonrails.org/. They are an absolute life saver sometimes!
It turned out that the queries with a join needed to be placed above the queries without a join. I'm not sure why it behaves this way, but hopefully this helps someone else down the line.

How do you use the postgresql WITH in activerecord?

In PostgreSQL one can read data from a "with" clause. I want to know how to make use of that in Rails without putting the whole query in raw SQL.
Here's a sample query: it is totally contrived for this question.
with tasks as (select 1 as score, tasks.* from tasks)
select 1 from tasks where id > 10 order by tasks.score, tasks.id
In my real example score is calculated not just 1 but for the example it works.
Here's how I imagine the code would look
Task.with('tasks as (select 1 as score, tasks.* from tasks)')
.where('id > 10')
.order(score)
.order(id)
I don't really like using the "with" because it is PG specific, but I really need to sort on the calculated value. I tried a view but the creation of the view in PG requires the exact fields, and I don't want other coders to have to alter the view when they alter the source table.
I do really want to be able to chain this.
I don't believe this is supported in pure ActiveRecord without dropping down to raw SQL.
However, there is an add-on, postgres_ext which, among other things, adds CTE support for use with ActiveRecord. I haven't used this add-on myself (I would prefer to drop down into raw SQL in this situation), but it looks like it would allow the chaining behavior you are looking for.
Once you've installed that, you'd want to use its from_cte method to define the CTE (the WITH statement), and then chain however you'd like to filter, sort, etc.
So I think the code would look something like this:
Task.from_cte('tasks', Task.where('id > 10')).order('tasks.score, tasks.id')
Note that the chaining starts after the from_cte call has completed on the ActiveRecord::Relation object that results.
Edit in response to comment from OP:
Well, you could add more to the chaining within the from_cte, I think -- it really depends on what you want the CTE to be. You could certainly also filter by user in the where method such that the CTE just contained a single user's tasks.

Rails query syntax

I'm kinda new on the Rails boat, I would like to know the difference between two types of syntax for queries
The first one I tried is:
User.limit(8).order('created_at DESC').group('created_at').count
The second, which seems to be far more efficient and powerful:
User.count(:order =>'DATE(created_at) DESC', :group =>["DATE(created_at)"], :limit => 8)
But I don't really understand the use case for both.
I'm sure this is something obvious anyway...
Thanks!
The first one is Rails 3 syntax. Each method used there (limit, order, group) is an ActiveRecord::Relation method. There are various advantages to using the first form. ActiveRecord::Relation is one of the core features of Rails 3, alongside the asset pipeline and others.
Please read this,
http://asciicasts.com/episodes/239-activerecord-relation-walkthrough
Well the second syntax is the deprecated, old-school, syntax. Also known as Hash-Options-Overload. The first, chaining, syntax is the way forward.
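One detail worth noticing is that the two queries in the question do not group the same way: the first groups by the raw created_at timestamp, the second by DATE(created_at). The plain-Ruby sketch below illustrates the difference on a handful of made-up timestamps (no database involved; the sample data and variable names are hypothetical). A grouped count in ActiveRecord similarly returns a hash keyed by the grouping value.

```ruby
require 'date'

# Hypothetical created_at values standing in for rows of the users table.
created_ats = [
  Time.utc(2024, 1, 3, 9, 0),
  Time.utc(2024, 1, 3, 17, 30),
  Time.utc(2024, 1, 2, 8, 15),
  Time.utc(2024, 1, 1, 12, 0),
  Time.utc(2024, 1, 1, 12, 0),
]

# Like GROUP BY created_at: one bucket per distinct timestamp.
by_timestamp = created_ats.group_by(&:itself).transform_values(&:size)

# Like GROUP BY DATE(created_at): one bucket per calendar day.
by_day = created_ats.group_by(&:to_date).transform_values(&:size)

# Like ORDER BY the day descending with LIMIT 8, applied to the buckets.
latest_days = by_day.sort_by { |day, _count| day }.reverse.first(8).to_h

puts by_timestamp.size
puts latest_days.inspect
```

Grouping by the full timestamp rarely produces useful buckets, because two users almost never share the exact same created_at value.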
The second one is the more powerful and efficient, because the first one will take all the rows (and all the columns) and then count them, whereas the second one only counts the rows.
Second one will use the following query.
select count(*) from tablename;
