Does x = User.all create a hash? How do I traverse it?

Let's say I have a User table and a Messages table, with a has_many / belongs_to relationship between them. I want to find the ids of the users whose names are "Bob", then pull the message history for one of those ids.
x = User.where(name: "Bob")
Does that create a hash in variable x, with all the results of users whose names were Bob? The result in the console certainly looks like a hash when I run x. To include the messages tied to all the Bobs, I think I do:
x = User.where(name: "Bob").includes(:messages)
Now that I have x, how do I find the ids of the people whose names are Bob? I don't want to query the db again; I'd like to do it all via the variable. Is that possible?
I then want to get the first message of the first id (the first Bob) in my table. Can that be done via the variable, or do I have to go back to the DB once I have the first id?
Thanks for all the help guys and gals!

Most ActiveRecord queries return a Relation.
You can call x = x.to_a to make Rails perform the actual query (there will be two SQL queries, one for users and one for messages) and then traverse the resulting array.
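Once the records are loaded, traversing them is ordinary Ruby enumeration. The sketch below uses plain Structs as hypothetical stand-ins for loaded User and Message records (no database is involved, and the names LoadedUser and LoadedMessage are made up), so only the traversal calls carry over to a real x.to_a result.

```ruby
# Hypothetical stand-ins for records that includes(:messages) has already loaded.
LoadedMessage = Struct.new(:id, :body)
LoadedUser = Struct.new(:id, :name, :messages)

bobs = [
  LoadedUser.new(3, "Bob", [LoadedMessage.new(10, "hi"), LoadedMessage.new(11, "bye")]),
  LoadedUser.new(7, "Bob", [])
]

bob_ids = bobs.map(&:id)                    # ids of every Bob
first_message = bobs.first.messages.first   # first message of the first Bob

puts bob_ids.inspect
puts first_message.body
```

With real records, the same map and first calls read from the already loaded data as long as the association was eager loaded.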

This will do it, as described in the Rails guides: http://guides.rubyonrails.org/active_record_querying.html section 13.2. Note that the association on Message is the singular user, while the key in the where hash is the table name users.
x = Message.includes(:user).where(users: { name: "Bob" })
To get the first message, just tack .first onto the end of the query.
x = Message.includes(:user).where(users: { name: "Bob" }).first

You need to query from Message, not User. includes is used for eager loading, as in your question; joins (an inner join) is used to query across multiple tables.
Message.joins(:user).where(users: { name: "Bob" })

Related

Rails 5 ActiveRecord optional inclusive where for nested association's attribute

Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although ActiveRecord can technically combine them into a single query in some cases, it is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you, I would create an instance method on the User model for what you are looking for, using a select block instead of where:
def orders_in_timespan(start_time, end_time)
  orders.select { |o| o.created_at.between?(start_time, end_time) }
end
Because ActiveRecord caches the orders loaded by includes on each instance, I believe this will not result in n queries as long as your users query starts with an includes.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with sql but you can call .to_sql on the end of things such as your users variable in order to see the sql that would be generated which might help shed some light on the discrepancies between what you're getting and what you're looking for.
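To illustrate the in-memory filtering idea from this answer, here is a minimal sketch in plain Ruby. OrderRow and orders_in_window are hypothetical names, and the Struct only stands in for order records that are already loaded; the point is that select with between? on the timestamp never touches the database.

```ruby
# Hypothetical stand-in for an Order record that is already loaded in memory.
OrderRow = Struct.new(:id, :created_at)

# Keep only the orders created inside the window; nothing is queried here.
def orders_in_window(orders, start_time, end_time)
  orders.select { |o| o.created_at.between?(start_time, end_time) }
end

orders = [
  OrderRow.new(1, Time.utc(2020, 1, 5)),
  OrderRow.new(2, Time.utc(2020, 2, 10)),
  OrderRow.new(3, Time.utc(2020, 3, 15))
]

in_window = orders_in_window(orders, Time.utc(2020, 2, 1), Time.utc(2020, 2, 28))
puts in_window.map(&:id).inspect

# A user with no orders simply yields an empty list and is never excluded.
puts orders_in_window([], Time.utc(2020, 2, 1), Time.utc(2020, 2, 28)).inspect
```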
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: { user_id: @users.select(:id) })
  .group_by { |order| order.discount_code.user_id }
Now you can use it like this:
@users.each do |user|
  # Users without orders in the range have no key in the hash, so default to [].
  orders = @orders.fetch(user.id, [])
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
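One detail of the group_by approach is worth checking in isolation: the resulting hash has no entry for users without matching orders, so a plain lookup returns nil. The sketch below uses a hypothetical OrderStub Struct and made-up ids in place of real records.

```ruby
# Hypothetical rows standing in for loaded orders, each tagged with its owner.
OrderStub = Struct.new(:id, :user_id)

order_rows = [OrderStub.new(1, 10), OrderStub.new(2, 10), OrderStub.new(3, 30)]
orders_by_user = order_rows.group_by(&:user_id)

# User 20 has no orders, so there is no key for it.
puts orders_by_user[20].inspect

# fetch with a default keeps every user in the output.
counts = [10, 20, 30].map { |user_id| [user_id, orders_by_user.fetch(user_id, []).count] }.to_h
puts counts.inspect
```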
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
users = User.joins(discount_codes: :orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(discount_codes: :orders).group("users.id").having("count(orders.id) = 0")

Ruby on Rails / ActiveRecord: How Can I (Elegantly) Retrieve Data from Multiple Tables?

It's rather trivial to retrieve data from multiple tables that are related through foreign keys using raw SQL. I can do, for example:
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
This would give me a table with two columns: title and domestic_sales, where the data in the first column comes from the table movies and the data in the second column comes from the table boxoffice.
How can I do this in Rails using Ruby code? I can, of course, get the same result if I use raw SQL. So, I could do the following:
ActiveRecord::Base.connection.execute(<<-SQL)
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
SQL
This would give me a PG::Result object with the data I want. But this is super inelegant. I would like to be able to get this information without using raw SQL.
So, the first thing that comes to mind is:
Movie.select(:name, :domestic_sales).joins(:box_office)
The problem, however, is that the aforementioned line of code returns a bunch of Movie objects. Since the Movie class doesn't have the domestic_sales attribute, I don't get access to that information.
The next thing I thought was to use a loop. So, I could do something like:
Movie.joins(:box_office).to_a.map do |m|
{name: m.name, rating: m.box_office.domestic_sales}
end
This gives me exactly the data I want. But it costs n + 1 SQL queries, which is not good. I should be able to get this with just one query...
So: How can I retrieve the data I want without using raw SQL and without using loops that cost multiple queries?
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
translated to ActiveRecord would look like this:
Movie
  .select(:title, :domestic_sales)
  .joins("JOIN boxoffice ON movies.id = boxoffice.movie_id")
When you have proper associations defined in your models, you would be able to write:
Movie
  .select(:title, :domestic_sales)
  .joins(:boxoffices)
And when you do not need instances of ActiveRecord and would be fine with a nested array, you can even write:
Movie
  .joins(:boxoffices)
  .pluck(:title, :domestic_sales)
Try this way.
Movie.joins(:box_office).pluck(:title, :domestic_sales)
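pluck with two columns returns a nested array, one inner array per row. If you still want named fields, the rows can be reshaped in plain Ruby without further queries. The values below are hypothetical placeholders, not real data.

```ruby
# Hypothetical rows, shaped like the nested array that
# pluck(:title, :domestic_sales) returns: one inner array per row.
rows = [["Movie A", 100], ["Movie B", 250]]

# Destructure each inner array into named keys.
movies = rows.map { |title, domestic_sales| { title: title, domestic_sales: domestic_sales } }
puts movies.inspect
```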

How to make ActiveRecord query unique by a column

I have a Company model that has many Disclosures. The Disclosure has columns named title, pdf and pdf_sha256.
class Company < ActiveRecord::Base
has_many :disclosures
end
class Disclosure < ActiveRecord::Base
belongs_to :company
end
I want to make it unique by pdf_sha256 and if pdf_sha256 is nil that should be treated as unique.
If it is an Array, I'll write like this.
companies_with_sha256 = company.disclosures.where.not(pdf_sha256: nil).group_by(&:pdf_sha256).map do |key,values|
values.max_by{|v| v.title.length}
end
companies_without_sha256 = company.disclosures.where(pdf_sha256: nil)
companies = companies_with_sha256 + companies_without_sha256
How can I get the same result by using ActiveRecord query?
It is possible to do it in one query by first getting a different id for each different pdf_sha256 as a subquery, then in the query getting the elements within that set of ids by passing the subquery as follows:
def unique_disclosures_by_pdf_sha256(company)
subquery = company.disclosures.select('MIN(id) as id').group(:pdf_sha256)
company.disclosures.where(id: subquery)
.or(company.disclosures.where(pdf_sha256: nil))
end
The great thing about this is that ActiveRecord is lazy loaded, so the first subquery will not be run and will be merged to the second main query to create a single query in the database. It will then retrieve all the disclosures unique by pdf_sha256 plus all the ones that have pdf_sha256 set to nil.
In case you are curious, given a company, the resulting query will be something like:
SELECT "disclosures".* FROM "disclosures"
WHERE (
"disclosures"."company_id" = $1 AND "disclosures"."id" IN (
SELECT MIN(id) as id FROM "disclosures" WHERE "disclosures"."company_id" = $2 GROUP BY "disclosures"."pdf_sha256"
)
OR "disclosures"."company_id" = $3 AND "disclosures"."pdf_sha256" IS NULL
)
Another advantage of this solution is that the returned value is an ActiveRecord relation, so it won't be loaded until you actually need it. You can also keep chaining queries onto it. For example, you can select only the id instead of the whole model and limit the number of results returned by the database:
unique_disclosures_by_pdf_sha256(company).select(:id).limit(10).each { |d| puts d }
You can achieve this by using the uniq method:
Company.first.disclosures.to_a.uniq(&:pdf_sha256)
This will return the disclosure records unique by the column "pdf_sha256".
Hope this helps you! Cheers
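One caveat when comparing this with the question's requirement: Array#uniq with a block treats nil like any other key, so all records without a pdf_sha256 collapse into one. The sketch below uses a hypothetical DisclosureStub Struct with made-up rows to show the difference, and a partition-based variant that keeps every nil row, mirroring the logic in the question.

```ruby
# Hypothetical stand-in for loaded Disclosure records.
DisclosureStub = Struct.new(:id, :title, :pdf_sha256)

disclosures = [
  DisclosureStub.new(1, "Q1", "abc"),
  DisclosureStub.new(2, "Q1 report", "abc"),
  DisclosureStub.new(3, "Notice", nil),
  DisclosureStub.new(4, "Another notice", nil)
]

# uniq keeps the first record per key, and nil counts as a single key.
collapsed = disclosures.uniq(&:pdf_sha256)
puts collapsed.map(&:id).inspect

# Treat nil as always unique: dedupe only the rows that have a hash,
# keeping the longest title per hash as in the question.
with_sha, without_sha = disclosures.partition(&:pdf_sha256)
kept = with_sha.group_by(&:pdf_sha256).values.map { |group| group.max_by { |d| d.title.length } } + without_sha
puts kept.map(&:id).inspect
```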
Assuming you are using Rails 5 you could chain a .or command to merge both your queries.
pdf_sha256_unique_disclosures = company.disclosures.where(pdf_sha256: nil).or(company.disclosures.where.not(pdf_sha256: nil))
Then you can proceed with your group_by logic.
However, I'm not exactly sure what the objective is in the example above, and I am curious to better understand how you would use the resulting companies variable.
If you wanted to have a hash of unique pdf_sha256 keys including nil, and its resultant unique disclosure document you could try the following:
sorted_disclosures = company.disclosures.group_by(&:pdf_sha256).each_with_object({}) do |entries, hash|
hash[entries[0]] = entries[1].max_by{|v| v.title.length}
end
This should give you a resultant hash like structure similar to the group_by where your keys are all your unique pdf_sha256 and the value would be the longest named disclosure that match that pdf_sha256.
Why not:
ids = Disclosure.select(:id, :pdf_sha256).distinct.map(&:id)
Disclosure.find(ids)
The id will be distinct either way since it's the primary key, so all you have to do is map the ids and find the Disclosures by id.
If you need a relation with distinct pdf_sha256, where you require no explicit conditions, you can use group for that -
scope :unique_pdf_sha256, -> { where.not(pdf_sha256: nil).group(:pdf_sha256) }
scope :nil_pdf_sha256, -> { where(pdf_sha256: nil) }
You could have used or, but the relation passed to it must be structurally compatible. So even though these two scopes return the same type of relation, you cannot combine them with or.
Edit: To make them structurally compatible with each other, see @AlexSantos's answer.
Model.select(:rating)
The result of this is a collection of Model objects, not plain ratings. And from uniq's point of view, they are completely different objects. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)     # older Rails; uniq on relations was removed in Rails 5.1
Model.distinct.pluck(:rating) # Rails 4 and later
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']

Rails: how to correctly modify and save values of records in join table

I would like to understand why in Rails 4 (4.2.0) I see the following behaviour when manipulating data in a join table:
student.student_courses
returns all associated records of courses for a given user;
but the following will save changes
student.student_courses[0].status = "attending"
student.student_courses[0].save
while this will not
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
Why is that? Why do those two work differently, and is the first one the correct way to do it?
student.student_courses[0] and student.student_courses.find(1) are subtly different things.
When you say student.student_courses, you're just building a query in an ActiveRecord::Relation. Once you do something to that query that requires a trip to the database, the data is retrieved. In your case, that something is calling [] or find. When you call []:
student.student_courses[0]
your student will execute the underlying query and stash all the student_courses somewhere. You can see this by looking at:
> student.student_courses[0].object_id
# and again...
> student.student_courses[0].object_id
# same number is printed twice
But if you call find, only one object is retrieved and a new one is retrieved each time:
> student.student_courses.find(1).object_id
# and again...
> student.student_courses.find(1).object_id
# two different numbers are seen
That means that this:
student.student_courses[0].status = "attending"
student.student_courses[0].save
is the same as saying:
c = student.student_courses[0]
c.status = "attending"
c.save
whereas this:
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
is like this:
c1 = student.student_courses.find(1)
c1.status = "attending"
c2 = student.student_courses.find(1)
c2.save
When you use the find version, you're calling status= and save on entirely different objects and since nothing was actually changed in the one that you save, the save doesn't do anything useful.
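The difference can be reproduced without Rails. The sketch below is a toy model, not ActiveRecord's implementation: CourseRow and FakeCourses are hypothetical classes in which [] hands back cached objects while find builds a fresh object on every call.

```ruby
# Hypothetical row class standing in for a StudentCourse record.
class CourseRow
  attr_accessor :id, :status

  def initialize(id, status)
    @id = id
    @status = status
  end
end

# Hypothetical collection that mimics the two access paths.
class FakeCourses
  def initialize(rows)
    @rows = rows   # raw data, playing the role of the database
    @loaded = nil
  end

  # Like []: load once, then hand back the same cached objects.
  def [](index)
    @loaded ||= @rows.map { |id, status| CourseRow.new(id, status) }
    @loaded[index]
  end

  # Like find: build a fresh object on every call.
  def find(id)
    row = @rows.find { |row_id, _| row_id == id }
    CourseRow.new(*row)
  end
end

courses = FakeCourses.new([[1, "pending"]])

courses[0].status = "attending"
puts courses[0].status          # the change is visible on the cached object

courses.find(1).status = "attending"
puts courses.find(1).status     # a new object, still carrying the stored value
```

Holding the result of find in a local variable, assigning to it, and saving that same variable avoids the problem.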
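student_courses is an ActiveRecord::Relation, which behaves like a lazily loaded array of records rather than a key => value store. find works on it as well as on a model, but it runs a fresh query and returns a new object on each call.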

querying active record

I am trying to query my Postgres db from Rails with the following query:
def is_manager(team)
User.where("manager <> 0 AND team_id == :team_id", {:team_id => team.id})
end
This basically checks that the manager is flagged and that team.id is the id passed into the function.
I have the following code in my view:
%td= is_manager(team)
What we are getting in return is:
#<ActiveRecord::Relation:0xa3ae51c>
Any help on where I have gone wrong would be great.
Query methods such as where return an ActiveRecord::Relation. Doing so essentially allows the lazy loading of queries. To understand why this is cool, consider this:
User.where(manager: 0).where(team_id: team_id).first
In this case, we get all users who aren't managers, and then we get all the non-manager users who are on team with id team_id, and then we select the first one. Executing this code will give you a query like:
SELECT * FROM users WHERE manager = 0 AND team_id = X LIMIT 1
As you can see, even though there were multiple where calls in our code, ActiveRecord was able to squish all of that down into one query. This is done through the Relation. As soon as we need the actual object (i.e. when we call first), ActiveRecord will go to the DB to get the records. This prevents unnecessary queries. ActiveRecord is able to do this because the query methods return Relations instead of the queried objects. The best way to think of the Relation class is as a query builder with all the methods of an array: you can call further query methods on a relation, but you can also iterate over it.
Sorry if that isn't clear.
Oh, and to solve your problem: %td= is_manager(team).to_a will convert the Relation object into an array of Users.
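The deferred execution described above can be imitated in a few lines of plain Ruby. This is a toy sketch under stated assumptions, not how ActiveRecord is implemented: LazyQuery is a hypothetical class, the rows are made-up hashes, and a shared log counts how often the simulated database is hit.

```ruby
# A toy, hypothetical query builder: where only records conditions,
# and the filter runs when the query is enumerated (first, to_a, each).
class LazyQuery
  include Enumerable

  def initialize(rows, conditions = {}, log = [])
    @rows = rows
    @conditions = conditions
    @log = log          # shared list recording each simulated database hit
  end

  # Returns a new query object; nothing is filtered yet.
  def where(extra)
    LazyQuery.new(@rows, @conditions.merge(extra), @log)
  end

  # Enumerating is what finally "runs the query".
  def each(&block)
    @log << @conditions
    @rows.select { |row| @conditions.all? { |key, value| row[key] == value } }.each(&block)
  end

  def hits
    @log.length
  end
end

rows = [
  { name: "Ann", manager: 0, team_id: 1 },
  { name: "Raj", manager: 1, team_id: 1 },
  { name: "Lee", manager: 0, team_id: 2 }
]

query = LazyQuery.new(rows).where(manager: 0).where(team_id: 1)
hits_before = query.hits             # chaining alone has not run anything
first_name = query.first[:name]      # first enumerates, so the filter runs now
hits_after = query.hits

puts hits_before
puts first_name
puts hits_after
```

Printing the query object itself, as the view in the question does, shows the builder rather than any rows; calling to_a or first is what produces records.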
Just retrieve the first record with .first; this might help. Note that SQL uses a single = for comparison.
User.where("manager <> 0 AND team_id = :team_id", { :team_id => team.id }).first

Resources