Propel query with nested statements and empty field value - join

I have quite a complex SQL query which I would like to transform into Propel but I am not sure about the best approach.
The query I need looks like this:
SELECT id_loan
FROM loan loanA
JOIN loan_funding on fk_loan = loanA.id_loan
JOIN `user` userA on loan_funding.fk_user = userA.id_user
WHERE
userA.`acc_internal_account_id` is not null
AND loanA.`state` = 'payment_origination'
AND loanA.id_loan IN (
SELECT id_loan from loan loanB
JOIN loan_funding on fk_loan = id_loan
JOIN `user` userB on loan_funding.fk_user = userB.id_user
WHERE
userB.`acc_internal_account_id` is null
AND loanB.`state` = 'payment_origination'
GROUP BY loanB.id_loan
)
GROUP BY loanA.id_loan
LIMIT 1;
What I would like to have is something completely based on the Generated Query Methods but I do not quite get how to do it.
Performance is not an issue, but as of now it is unclear where and how those queries will be called from. However, it is important to get back an object, as we need to use the getters and setters.
I found this website: http://propelorm.org/blog/2011/02/02/how-can-i-write-this-query-using-an-orm-.html which looks really cool and helpful, however, I am not sure what option fits best here.
I do not expect a complete solution, but maybe some thoughts on how to narrow down the problem...
What confuses me is especially the part where it compares the id_loan and fk_loan before it goes to the user table. How would this relationship be represented by propel? Might it be better to split the whole thing in multiple queries?
Any hints appreciated!

Related

How do I join multiple hive queries?

I am trying to join a simple query with a very ugly query that resolves to a single line. They have a date and a userid in common but nothing else. Alone both queries work but for the life of me I cannot get them to work together. Can someone assist me in how I would do this?
Fixed it. When you union queries in Hive, it looks like each query needs to return the same number of fields.

ActiveRecord .joins breaking other queries

I'm writing a Rails API on top of a legacy database with tons of tables. The search feature gives users the ability to query around 20 separate columns spread across 13 tables. I have a number of queries that check the params to see if they need to return results. They look like this:
results << Company.where('city LIKE ?', "#{params[:city]}").select('id') unless params[:city].blank?
and they work fine. However, I just added another query that looks like this:
results << Company.joins("JOIN Contact ON Contact.company_id = Company.id").where("Contact.first_name LIKE ?", "%#{params[:first_name]}%").select('company_id') unless params[:first_name].blank?
and suddenly my first set of queries started returning null, rather than the list of IDs they had been returning. The query with the join works perfectly well whether the other queries are functional or not. When I comment the join query out, the previous queries start working again. Is there some reason the query with a join would break other queries on the page?
I can't think of a particular reason why the join would be breaking your previous queries; however, I do have some suggestions for your query overall.
Assuming you've modelled these relationships correctly you shouldn't need to define the join manually. On another note, you're not querying against the company at all so you can use an includes instead of a join - this will allow you to access its data without firing another query.
If you wanted to access company data (ie. query.company.name) use an includes like so:
Contact.includes(:company).where('first_name LIKE ?', param).select(:company_id).distinct
However, it appears all you really want is an array of IDs (which exist on the contact model), so you can lighten things up and not include the company at all.
Contact.where('first_name LIKE ?', param).select(:company_id).distinct
Whenever you get stuck, never forget to check out the great resources over at: http://api.rubyonrails.org/ - they are an absolute life saver sometimes!
It turned out that the queries with a join needed to be placed above the queries without a join. I'm not sure why it behaves this way, but hopefully this helps someone else down the line.

RoR/Squeel - How do I use Squeel::Nodes::Join/Predicates?

I just recently inherited a project where the previous developer used Squeel.
I've been studying Squeel for the past day now and know a bit about how to use it from what I could find online. The basic use of it is simple enough.
What I haven't been able to find online (except for on ruby-doc.org, which didn't give me much), is how to use Squeel::Nodes::Join and Squeel::Nodes::Predicate.
The only thing I've been able to find out is that they are nodes representing join associations / predicate expressions, which I had already figured. What I still don't know is how to use them.
Can someone help me out or point me toward a good tutorial/guide?
I might as well answer this since I was able to figure out quite a bit through trial and error and by using ruby-doc as a guide. Nothing I say here is a definitive description of these classes. It's just what I know, which may help someone in the future who is stuck making dynamic queries with Squeel.
Squeel::Nodes::Stub
Let's actually start with Squeel::Nodes::Stub. This is a Squeel object that takes either a symbol or a string and converts it into the name of a table or column. So you can create a Squeel::Nodes::Stub.new("value") or Squeel::Nodes::Stub.new(:value) and use this stub in other Squeel nodes. You'll see examples of it being used below.
Squeel::Nodes::Join
Squeel::Nodes::Join acts just like you might suspect. It is essentially a variable you can pass in to a Squeel joins{} that will then perform the join you want. You give it a stub (with a table name), and you can also give it another variable to change the type of join (I only know how to change it to outer join at the moment). You create one like so:
Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)
The stub is used to let the Join know we want to join the custom_fields table, and the Arel::OuterJoin is just to let the Join know we want to do an outer join. Again, you don't have to put a second parameter into Squeel::Nodes::Join.new(), and I think it will default to performing an inner join. You can then join this to a model:
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields), Arel::OuterJoin)}
Squeel::Nodes::Predicate
Squeel::Nodes::Predicate may seem pretty obvious at this point. It's just a comparison. You give it a stub (with a column name), a method of comparison (you can find them all in the Predicates section on Squeel's github) and a value to compare with, like so:
Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value), :eq, 5)
You can even AND or OR two of them together pretty easily.
AND: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) & Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
OR: Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)
These will return either a Squeel::Nodes::And or a Squeel::Nodes::Or with the nested Squeel::Nodes::Predicates.
Then you can put it all together like this (of course you'd probably have the joins in a variable, a, and the predicates in a variable b, because you are doing this dynamically, otherwise you should probably be using regular Squeel instead of Squeel nodes):
Person.joins{Squeel::Nodes::Join.new(Squeel::Nodes::Stub.new(:custom_fields),
Arel::OuterJoin)}.where{Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value1), :eq, 5) | Squeel::Nodes::Predicate.new(Squeel::Nodes::Stub.new(:value2), :eq, 10)}
I unfortunately could not figure out how to do subqueries though :(

Chain of multiple where Active Record clauses not working

I am fairly new to Ruby on Rails and ActiveRecord. I have a database model named location which describes a point of interest on a map. A location has a location_type field that can have three different location types (business, dispensary or contact). A location also has an owner_id as well which is the user_id of the user who created the location.
In the controller the user requests all of their locations by providing their ID. The dispensary and business locations are public so all users should be able to view them, while the contacts should only be shown to the user who is the owner of them. Therefore I am tasked with creating an ActiveRecord query that returns all dispensaries and businesses in the database and all contacts that were created by that user. I was trying to do this by chaining together where clauses but for some reason this has failed:
@locations = Location.where(:location_type => ["business", "dispensary"]).where(:location_type => "contact", :owner_id => params[:id])
Which generates this PostgreSQL:
SELECT "locations".* FROM "locations" WHERE "locations"."location_type" IN ('business', 'dispensary') AND "locations"."location_type" = 'contact' AND "locations"."owner_id" = 1
I suspect that this failed because the first where returns just the locations of type business and dispensary and the second where queries that returned data which has no locations of type contact within it. How can I query for all dispensaries and businesses combined with a set of filtered contacts?
Chaining where calls like that will result in ANDs at the SQL level. where can take raw SQL for an argument, in which you could explicitly add an OR, but parameterizing it properly is rather messy IMO (although it can be done). So, for this type of query, I think it would probably be best to drop down into using raw SQL with sanitized inputs (to guard against SQL injection).
i.e. something like this:
x = ActiveRecord::Base.connection.raw_connection.prepare(
"SELECT * FROM locations
WHERE location_type IN ('business', 'dispensary') OR
(location_type = 'contact' AND owner_id = ?)")
x.execute(params[:id])
x.close
This will select all items from the locations table where the location_type is either 'business' or 'dispensary', regardless of the owner_id, and all items where the location_type is 'contact' and the owner_id matches the one passed in.
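To see why the chained where calls return nothing while the OR version returns the expected rows, the two filters can be modeled in plain Ruby. This is only a sketch: the hashes stand in for rows of the locations table, the sample data is invented, and the helper names (public_type?, owned_contact?, chained_and, public_or_owned) are hypothetical, not part of ActiveRecord.

```ruby
# Plain-Ruby model of the two filters. The hashes stand in for rows of the
# locations table; the sample data is made up for illustration.
locations = [
  { name: "Shop",   location_type: "business",   owner_id: 2 },
  { name: "Clinic", location_type: "dispensary", owner_id: 3 },
  { name: "Alice",  location_type: "contact",    owner_id: 1 },
  { name: "Bob",    location_type: "contact",    owner_id: 2 }
]

# True for the location types that every user may see.
def public_type?(row)
  %w[business dispensary].include?(row[:location_type])
end

# True for a contact that belongs to the given user.
def owned_contact?(row, owner_id)
  row[:location_type] == "contact" && row[:owner_id] == owner_id
end

# Chained where calls: every condition must hold at once (AND).
# No row can be both a public type and a contact, so nothing matches.
def chained_and(rows, owner_id)
  rows.select { |row| public_type?(row) && owned_contact?(row, owner_id) }
end

# Corrected query: public types OR the requesting user's own contacts.
def public_or_owned(rows, owner_id)
  rows.select { |row| public_type?(row) || owned_contact?(row, owner_id) }
end

and_names = chained_and(locations, 1).map { |row| row[:name] }
or_names  = public_or_owned(locations, 1).map { |row| row[:name] }

puts and_names.inspect  # => []
puts or_names.inspect   # => ["Shop", "Clinic", "Alice"]
```

Within ActiveRecord itself, Rails 5 and later also provide an or method on relations, which may be worth checking against the Rails version in use.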
Edit in response to comment from OP:
I tend to prefer raw SQL whenever possible for more complex queries, as I find it easier to control the behavior. ORMs can sometimes do things that are less than desirable, such as executing the same query 1000 times to get 1000 entries instead of running one SQL query once, resulting in terrible performance. However, if you'd prefer to stay within the bounds of ActiveRecord, you can use the form of where that takes arguments. It'll be somewhat raw SQL, in that you need to specify the where clause yourself, but you won't need to get a raw_connection and explicitly execute; it'll work within the framework of the ActiveRecord query you were doing.
So, that would look something like this:
@locations = Location.where("location_type IN ('business', 'dispensary') OR
(location_type = 'contact' AND owner_id = ?)", params[:id])
See this Active Record guide page for more info, section 2.2.
Edit in response to follow-up question from OP:
Regarding the ? in the SQL, you can think of it as a placeholder: it is not a format specifier, it simply marks where a parameter goes.
The reason it's important is that when a ? is placed in the query and then the actual value you want to use is passed as an argument to where (and certain other functions as well), the underlying SQL driver will interpolate the parameter into the query in such a way that prevents SQL injection, which could allow for all kinds of different problems. If you were to instead do the interpolation yourself directly into the query string, you would still be potentially susceptible to SQL injection. So not only is ? safe from SQLI, it's specifically intended to prevent it.
You can have a bunch of ? in your query, as long as you pass the corresponding number of parameters as arguments after the query string (otherwise the SQL driver should error out).
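The difference between a bound placeholder and manual string interpolation can be illustrated with a toy binder in plain Ruby. This is a sketch only: bind_placeholders and quote_value are hypothetical helpers written for this example, not the Rails or driver implementation, and the toy version does not handle a literal ? inside a quoted SQL string.

```ruby
# Toy stand-in for what a driver or ORM does with `?` placeholders.
# Both helpers are hypothetical and exist only for illustration.

# Render a Ruby value as a SQL literal, doubling single quotes in strings.
def quote_value(value)
  case value
  when Integer, Float then value.to_s
  when nil then "NULL"
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Replace each `?` in order with a quoted value; fail on a count mismatch.
def bind_placeholders(sql, *values)
  expected = sql.count("?")
  unless expected == values.length
    raise ArgumentError, "expected #{expected} bind values, got #{values.length}"
  end
  remaining = values.dup
  sql.gsub("?") { quote_value(remaining.shift) }
end

template = "location_type = 'contact' AND owner_id = ?"
hostile  = "1 OR 1=1" # what an attacker might send as params[:id]

# Bound: the whole input stays inside one quoted literal.
bound = bind_placeholders(template, hostile)
# Interpolated by hand: the input becomes part of the SQL logic.
naive = "location_type = 'contact' AND owner_id = #{hostile}"

puts bound # => location_type = 'contact' AND owner_id = '1 OR 1=1'
puts naive # => location_type = 'contact' AND owner_id = 1 OR 1=1
```

In the bound version the hostile input is compared as a single value, while in the interpolated version the OR 1=1 changes which rows the condition matches.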

Performance of generated T-SQL from Entity Framework

I recently used Entity Framework for a project, despite my DBA's strong disapproval. So one day he came to my office complaining about generated T-SQL that reaches his database.
For instance, when I want to select a product based on the id, I write something like this:
context.Products.FirstOrDefault(p=>p.Id==id);
Which translates to
SELECT ... FROM (SELECT TOP 1 ... FROM PRODUCTS WHERE ID=@id)
So he is shouting, "Why on earth would you write a SELECT * FROM (SELECT TOP 1)"
So I changed my code to
context.Products.Where(p=>p.Id==id).ToList().FirstOrDefault()
and this produces a much cleaner T-SQL:
SELECT ... FROM PRODUCTS WHERE ID=@id
The inner query and the TOP 1 disappeared. Enough mumbling, my question is this: Does the first query really add overhead for SQL Server? Is it harder to parse than the second one? The Id column has a clustered index on it. I want a good answer so I can rub it in his face (or mine).
Thanks,
Themos
Have you tried running the queries manually and comparing the execution plans?
The biggest problem here isn't that the SQL isn't perfectly formed to your DBA's standards (although I'm fairly certain that the query engine will optimize out the extra select). The second version drops the TOP 1, so the database returns every row that matches the Where clause, ToList() loads all of them into memory, and only then does FirstOrDefault() pick one. With a unique Id that is a single row, but in general limiting the result is a task that should be performed by the DB and not the application layer.
In short, he's being a pedant; leave it the way it was.

Resources