join same table several times efficiently with Rails syntax - ruby-on-rails

I have a complex DB schema in my Rails 2.3 app. At some point, I need to make a query that retrieves data from several columns by joining multiple tables. There is one table I always need to go through to retrieve the data I need.
The problem comes when I try to use Rails syntax to make the joins. I found no way to make Rails skip joining the same table again and again.
Let's say I have these table relationships:
A=>B=>C=>D=>E=>F
A=>B=>F=>G
A=>B=>F=>H
A=>I
As you can see, B is a table that I "join" several times to get to the tables I need.
My query looks something like this:
A.all(:select => "SOME DATA FROM F, G, H AND I",
      :joins => [{{{{:B=>:C}=>:D}=>:E}=>:F}, {{:B=>:F}=>:G}, {{:B=>:F}=>:H}, :I],
      :conditions => "SOME CONDITIONS")
You can see that I use the hash syntax to specify the joins.
The problem is that when I look at the query that Rails creates, I see that it joins B three times. I would expect it to be smart enough to join it just once and go from there, but I guess there may be cases where you want to join the same table several times.
My question is: how can I, using Rails 2.3.8 syntax, make the resulting query join B only once? Adding extra relationships of type :belongs_to :through to the model is not an option for me.

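Try nesting every path that goes through B under a single :B key, so that B appears only once in the joins: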
A.all(:joins => [:I, { :B => [{ :F => [:G, :H] }, {:C => { :D => { :E => :F }}}]}])
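The restructuring in this answer can also be done mechanically. What follows is a minimal plain-Ruby sketch, not part of Rails: merge_join_paths and to_joins are hypothetical helper names, and the sketch assumes each join path is given as an array of association names. It merges paths that share a prefix so that each association appears once, then converts the tree into the symbol/hash/array shape used in the answer (the element order differs from the answer, the nesting is the same).

```ruby
# Hypothetical helper: merge join paths that share a prefix into one tree.
# Each path is an array of association names, e.g. [:B, :F, :G].
def merge_join_paths(paths)
  tree = {}
  paths.each do |path|
    node = tree
    # Walk down the tree, creating a branch only if it does not exist yet.
    path.each { |name| node = (node[name] ||= {}) }
  end
  tree
end

# Hypothetical helper: turn the tree into nested symbols, hashes and arrays.
# Leaves become plain symbols; several siblings become an array.
def to_joins(tree)
  parts = tree.map do |name, children|
    children.empty? ? name : { name => to_joins(children) }
  end
  parts.length == 1 ? parts.first : parts
end

paths = [
  [:B, :C, :D, :E, :F],
  [:B, :F, :G],
  [:B, :F, :H],
  [:I]
]

joins = to_joins(merge_join_paths(paths))
p joins
```

The result could then be passed as the :joins option, assuming the association names match those declared on the models.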

Related

Using symbols to map after plucking from joined tables with same column names

Currently, I have joined multiple tables using the join method, and I need to pluck out several columns, which I need to map into something else. Here's what I mean:
A.joins( ... long sql statements involving model B and C...)
.pluck("A.id", "A.name", "B.id", "B.name" ...) # you get the idea
.map { |result|
# Then to use the various attributes, I was using result[0] to access A.id and so on
}
I was wondering whether it is possible to convert the attributes in my pluck to symbols like :A_id or :B_name. The reason I have to use "table_name.attribute" is that the tables have columns with the same name. Ideally I'm looking for something like:
A.joins( ... long sql statements involving model B and C...)
.pluck(A_id, A_name, B_id, B_name ...)
.map { |A_id, A_name, B_id, B_name ...| ... }
Symbols would make this easier: when I map, I would not need indexing to access my attributes. For example, I could use :A_id directly instead of result[0] in the above example.
This would really help with readability, since I'm plucking quite a lot of attributes and my join is pretty big (so there are plenty of columns with the same name), and it definitely looks messy with result[0] to result[10] all over my map function.
Thanks in advance!!
.pluck returns an array, which is a bit difficult to work with in your scenario, but a combination of select and AS (a column alias) does the trick.
A.joins( ... long sql statements involving model B and C...)
.select("A.id AS AID", "A.name AS ANAME", "B.id AS BID", "B.name AS BNAME" ...)
.each { |result| p result.AID ...}
The difference between pluck and select is that select returns A objects that have the aliases defined in select as attributes.
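If you prefer to stay with pluck, plain Ruby can still give the values names. The sketch below does not touch the database: rows is a hand-written stand-in for what a four-column pluck might return, and the lowercase names (a_id, b_name and so on) are chosen for illustration. It shows two options, destructuring each row in the block parameters and zipping each row with a list of keys.

```ruby
# Stand-in for the result of pluck("A.id", "A.name", "B.id", "B.name"):
# one inner array per row, values in the order the columns were listed.
rows = [
  [1, "alpha", 10, "beta"],
  [2, "gamma", 20, "delta"]
]

# Option 1: destructure each row directly in the block parameters.
labels = rows.map do |a_id, a_name, b_id, b_name|
  "#{a_id}:#{a_name} -> #{b_id}:#{b_name}"
end

# Option 2: zip each row with a list of keys to get a hash per row.
keys = [:a_id, :a_name, :b_id, :b_name]
records = rows.map { |row| Hash[keys.zip(row)] }

puts labels
puts records.first[:a_name]
```

Either way the map block reads names instead of result[0] to result[10]; the keys list has to be kept in the same order as the pluck arguments.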

Rails: select unique values from a column

I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects, not plain ratings. And from uniq's point of view, they are all different objects. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of Rails 5.0.0.1, it works only on "top level" queries, like the ones above. It doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query:
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it returns fewer rows and should be slightly faster than returning a number of rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using SQL strings and not instantiating models.
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since Rails 3.2.
Model.select(:rating).distinct
Another way to collect unique column values with SQL:
Model.group(:rating).pluck(:rating)
If I understand correctly:
Your current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
so uniq is applied to an array of objects, and each object has a unique id. uniq is doing its job correctly, because every object in the array is unique.
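The behaviour described here can be reproduced without a database. The sketch below uses a small anonymous class as a stand-in for model instances (it is not ActiveRecord, and row_class is a name chosen for illustration), assuming the objects compare by identity as this answer describes.

```ruby
# Plain Ruby stand-in for model instances. The class defines no value-based
# equality, so two instances with the same rating are still different objects.
row_class = Class.new do
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

rows = [row_class.new(5), row_class.new(5), row_class.new(3)]

# uniq on the objects compares the objects themselves: nothing is removed.
distinct_objects = rows.uniq

# uniq on the extracted values compares integers: duplicates are removed.
distinct_ratings = rows.map(&:rating).uniq

puts distinct_objects.length
p distinct_ratings
```

This is why extracting the values first and then calling uniq gives the expected list.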
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing. The first and second queries find the distinct data in SQL.
On a database that ignores trailing spaces when comparing strings, these queries treat "london" and "london " as the same value, so 'london' appears only once in the query result.
The third and fourth queries find the data in SQL and then apply Ruby's uniq method to get the distinct data.
These queries treat "london" and "london " as different, so both 'london' and 'london ' appear in the query result.
Please refer to the attached image for more detail, and have a look at "Toured / Awaiting RFP".
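The Ruby side of this difference is easy to check in isolation. A minimal sketch with a hand-written array standing in for the fetched values: Array#uniq compares strings exactly, so a trailing space makes a different value, and stripping before uniq is one way to collapse them.

```ruby
# Stand-in for column values fetched from the database.
cities = ["london", "london ", "paris"]

# Exact comparison: "london" and "london " are kept as two values.
raw_unique = cities.uniq

# Strip surrounding whitespace first, then deduplicate.
normalized_unique = cities.map(&:strip).uniq

p raw_unique
p normalized_unique
```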
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
Because first you generate an array of Model objects (smaller, because of the select), and then you extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by creation date and applying offset and limit: basically a combination of ORDER BY and DISTINCT ON.
All you need to do is put DISTINCT ON inside the pluck method, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This returns an array of distinct names.
Model.pluck("DISTINCT column_name")

ActiveRecord find and only return selected columns

edit 2
If you stumble across this, check both answers as I'd now use pluck for this
I have a fairly large custom dataset that I'd like to return and echo out as JSON. One part is:
l=Location.find(row.id)
tmp[row.id]=l
but I'd like to do something like:
l=Location.find(row.id).select("name, website, city")
tmp[row.id]=l
but this doesn't seem to be working. How would I get this to work?
thx
edit 1
alternatively, is there a way that I can pass an array of only the attributes I want included?
pluck(column_name)
This method performs a select on a single column as a direct SQL query. It returns an Array with the values of the specified column. The values have the same data type as the column.
Examples:
Person.pluck(:id) # SELECT people.id FROM people
Person.uniq.pluck(:role) # SELECT DISTINCT role FROM people
Person.where(:confirmed => true).limit(5).pluck(:id)
see http://api.rubyonrails.org/classes/ActiveRecord/Calculations.html#method-i-pluck
It was introduced in Rails 3.2, where it accepts only a single column. In Rails 4, it accepts multiple columns.
In Rails 2
l = Location.find(id, :select => "name, website, city")
...or...
l = Location.find_by_sql(["SELECT name, website, city FROM locations WHERE id = ? LIMIT 1", id]).first
This reference doc gives you the entire list of options you can use with .find, including how to limit by number, id, or any other arbitrary column/constraint.
In Rails 3 w/ActiveRecord Query Interface
l = Location.where(["id = ?", id]).select("name, website, city").first
Ref: Active Record Query Interface
You can also swap the order of these chained calls, doing .select(...).where(...).first - all these calls do is construct the SQL query and then send it off.
My answer comes quite late because I'm a pretty new developer. This is what you can do:
Location.select(:name, :website, :city).find(row.id)
Btw, this is Rails 4
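For the "edit 1" variant of the question (passing an array of only the attributes to include), the filtering can also happen in plain Ruby before the JSON is generated. This is a minimal sketch under stated assumptions: location is a hand-written Hash standing in for a record's attributes, and pick_attributes is a hypothetical helper name. Unlike select or pluck, it does not reduce what the database returns.

```ruby
require 'json'

# Stand-in for one record's attributes; the values are made up.
location = {
  "id"      => 7,
  "name"    => "Cafe",
  "website" => "http://example.com",
  "city"    => "Lyon",
  "notes"   => "internal only"
}

# Hypothetical helper: keep only the attributes whose names are listed.
def pick_attributes(attributes, wanted)
  attributes.select { |key, _value| wanted.include?(key) }
end

wanted = ["name", "website", "city"]

# Same shape as tmp[row.id] = l in the question.
tmp = { location["id"] => pick_attributes(location, wanted) }
json = JSON.generate(tmp)
puts json
```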

Rails 2.3.8: Fetching objects via ActiveRecord without building all the objects

I'm wondering if there's a way to fetch objects from the DB via ActiveRecord, without having Rails build the whole objects (just a few fields).
For example,
I sometimes need to check whether a certain object contains a certain field.
Let's say I have a Student object referencing a Bag object (each student has one bag).
I need to check whether there is a female student whose bag has more than 4 pencils.
In ActiveRecord, I would have to do something like this:
exists = Student.female.find(:all, :conditions => 'bags.pencil_count > 4', :include => :bag).size > 0
The problem is that if there are 1000 students matching this condition,
1000 objects would be built by AR, along with their 1000 bags.
This forces me to use plain SQL for this query (for performance reasons), which bypasses AR.
I would not be using the named scopes, and I would have to remember to update the SQL all around the code
if something in a named scope changes.
This is one example, but there are many more cases where, for performance reasons,
I would have to use SQL instead of letting AR build many objects,
and this breaks encapsulation.
Is there any way to tell AR not to build the objects, or to build just a certain field (also in associations)?
If you're only testing for the existence of a matching record, just use Model.count from ActiveRecord::Calculations, e.g.:
exists = Student.female.count( :conditions => 'bags.pencil_count > 4',
:joins => :bag
) > 0
count (as the name of the class implies) simply does the calculation and doesn't build any objects.
(And for future reference it's good to know the difference between :include and :joins. The former eager-loads the associated model, whereas the latter does not, but still lets you use those fields in your :conditions.)
Jordan gave the best answer here, especially regarding using :joins instead of :include (because a join won't actually create the bag objects).
I'll just add that if you do still need the "Student" objects (just with a small amount of info on them), you can also use the :select option. It works just like SELECT in MySQL, so the DB I/O is reduced to just the columns you put in the select, and you can also add derived fields from the other tables, e.g.:
students = Student.female.all(
:select => 'students.id, students.name, bags.pencil_count AS pencil_count',
:conditions => "students.gender = 'F' AND bags.pencil_count > 4",
:joins => :bag
)
students.each do |student|
p "#{student.name} has #{student.pencil_count} pencils in her bag"
end
would give, e.g.:
Jenny has 5 pencils in her bag
Samantha has 14 pencils in her bag
Jill has 8 pencils in her bag
(Though note that a derived field (e.g. pencil_count) will be a string; you may need to cast it, e.g. with student.pencil_count.to_i.)
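The cast matters as soon as the value is compared, sorted or summed. A minimal plain-Ruby sketch, using hand-written strings as a stand-in for the derived values from the example output above (no database involved): strings sort character by character, so "14" comes before "5" until the values are converted with to_i.

```ruby
# Stand-in for derived pencil_count values that came back as strings.
pencil_counts = ["5", "14", "8"]

# Sorting the raw strings compares them character by character.
as_strings = pencil_counts.sort

# Converting first gives numeric ordering and allows arithmetic.
as_integers = pencil_counts.map(&:to_i).sort
total = pencil_counts.map(&:to_i).inject(0) { |sum, n| sum + n }

p as_strings
p as_integers
puts total
```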

Ruby/Rails Collection to Collection

I have two tables joined with a join table (this is just pseudocode):
Library
Book
LibraryBooks
What I need to do is this: given the id of a library, I want to get all the libraries that contain any of the books this library has.
So if I have Library 1, and Library 1 has books A and B, and books A and B are in Libraries 1, 2, and 3, is there an elegant (one-line) way to do this in Rails?
I was thinking:
l = Library.find(1)
allLibraries = l.books.libraries
But that doesn't seem to work. Suggestions?
l = Library.find(1, :include => :books)
l.books.map { |b| b.library_ids }.flatten.uniq
Note that map(&:library_ids) is slower than map { |b| b.library_ids } in Ruby 1.8.6, and faster in 1.9.0.
I should also mention that if you used :joins instead of :include there, it would find the library and related books all in the same query, speeding up the database time. However, :joins will only work if a library has books.
Perhaps:
l.books.map {|b| b.libraries}
or
l.books.map {|b| b.libraries}.flatten.uniq
if you want it all in a flat array.
Of course, you should really define this as a method on Library, so as to uphold the noble cause of encapsulation.
If you want a one-dimensional array of libraries returned, with duplicates removed:
l.books.map{|b| b.libraries}.flatten.uniq
One problem with
l.books.map{|b| b.libraries}.flatten.uniq
is that it will generate one SQL call for each book in l. A better approach (assuming I understand your schema) might be:
LibraryBook.find(:all, :conditions => ['book_id IN (?)', l.book_ids]).map(&:library_id).uniq
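The logic of this last approach can be followed on plain data. The sketch below uses an array of hashes as a stand-in for the rows of the join table (the ids and the extra library 4 are made up, and libraries_sharing_books is a hypothetical helper name). It first collects the book ids of the given library, then collects the library ids of every row that mentions one of those books.

```ruby
# Stand-in for the rows of the library_books join table.
library_books = [
  { :library_id => 1, :book_id => "A" },
  { :library_id => 1, :book_id => "B" },
  { :library_id => 2, :book_id => "A" },
  { :library_id => 3, :book_id => "B" },
  { :library_id => 4, :book_id => "C" }
]

# Hypothetical helper: ids of all libraries that hold at least one of the
# books held by the given library (the given library itself is included).
def libraries_sharing_books(rows, library_id)
  book_ids = rows.select { |row| row[:library_id] == library_id }
                 .map { |row| row[:book_id] }
                 .uniq
  rows.select { |row| book_ids.include?(row[:book_id]) }
      .map { |row| row[:library_id] }
      .uniq
end

shared = libraries_sharing_books(library_books, 1)
p shared
```

In the database version, both steps are lookups against the join table, rather than one query per book.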
