Activerecord specifications with 2 different models - ruby-on-rails

I need to find a way to display all Vacancies from my Vacancy model except the ones that a user already applied for.
I keep the IDs of the vacancies a certain user applied for in a separate model, AppliedVacancies.
I was thinking something along the lines of:
@applied = AppliedVacancies.where(employee_id: current_employee)
@appliedvacancy_id = []
@applied.each do |appliedvacancy|
  @appliedvacancy_id << appliedvacancy.id
end
@notyetappliedvacancies = Vacancy.where("id != ?", @appliedvacancy_id)
But it does not seem to like getting an array of IDs. How would I go about fixing this?
I get following error:
PG::DatatypeMismatch: ERROR: argument of WHERE must be type boolean, not type record
LINE 1: SELECT "vacancies".* FROM "vacancies" WHERE (id != 13,14)
^
: SELECT "vacancies".* FROM "vacancies" WHERE (id != 13,14)

This is purely an SQL problem.
You cannot use != to compare a value to a set of values. You need to use the IN operator.
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", @appliedvacancy_id)
As an aside, you can drastically improve the code you've written so far. You are needlessly instantiating complete ActiveRecord models for every record found in your applied_vacancies table, when all you need are the IDs.
A first pass at improvement would be to use pluck to skip the entire process and go straight to the list of IDs:
ids = AppliedVacancies.where(employee_id: current_employee).pluck(:id)
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", ids)
Next, you can go a step further and eliminate the first query altogether (or rather, combine it with the last query as a sub-query) by leaving it as an unexecuted relation, which can be substituted into the second query directly:
ids = AppliedVacancies.select(:id).where(employee_id: current_employee)
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", ids)
This will generate a single query:
select * from vacancies where id not in (select id from applied_vacancies where employee_id = <value>)
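To make the semantics of NOT IN concrete, here is a minimal plain-Ruby sketch of what the query computes. It uses in-memory stand-ins rather than ActiveRecord; the VacancyRow struct, the titles and the ids 12 and 15 are assumptions for illustration, and only 13 and 14 come from the error message in the question.

```ruby
# Hypothetical in-memory stand-in for rows of the vacancies table.
VacancyRow = Struct.new(:id, :title)

SAMPLE_VACANCIES = [
  VacancyRow.new(12, "Designer"),
  VacancyRow.new(13, "Developer"),
  VacancyRow.new(14, "Tester"),
  VacancyRow.new(15, "Manager")
].freeze

# In-memory equivalent of: WHERE id NOT IN (<applied ids>)
def not_yet_applied(vacancies, applied_ids)
  vacancies.reject { |vacancy| applied_ids.include?(vacancy.id) }
end

# The employee already applied for vacancies 13 and 14.
puts not_yet_applied(SAMPLE_VACANCIES, [13, 14]).map(&:id).inspect
# prints [12, 15]
```

A single != comparison cannot express this, because it compares one value against one value, whereas the exclusion has to be checked against every id in the list.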

Same answer as @meagar's, but the Rails 4 way:
@notyetappliedvacancies = Vacancy.where.not(id: @appliedvacancy_id)

Related

Can I force the execution of an active record query chain?

I have an edge case where I want to use .first only after my SQL query has been executed.
My case is the following:
User.select("sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count")
.first
.yield_self { |r| r.bar_count / r.foo_count.to_f }
However, this would throw an SQL error saying that I should include my user_id in the GROUP BY clause. I've already found a hacky solution using to_a, but I really wonder if there is a proper way to force execution before my call to .first.
The error is because first uses an order by statement to order by id.
"Find the first record (or first N records if a parameter is supplied). If no order is defined it will order by primary key."
Instead try take
"Gives a record (or N records if a parameter is supplied) without any implied order. The order will depend on the database implementation. If an order is supplied it will be respected."
So
User.select("sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count")
.take
.yield_self { |r| r.bar_count / r.foo_count.to_f }
should work appropriately; however, as stated above, the order is indeterminate.
You may want to use pluck which retrieves only the data instead of select which just alters which fields get loaded into models:
User.pluck(
"sum((type = 'foo')::int) as foo_count",
"sum((type = 'bar')::int) as bar_count"
).map do |foo_count, bar_count|
bar_count / foo_count.to_f
end
You can probably do the division in the query as well if necessary.
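As a rough sketch of the shape of that pluck result: it is an array of rows, and each row is an array of values in the order the expressions were requested. The hard-coded rows below are made-up stand-ins for a database result, and the zero guard is an addition that is not in the answer.

```ruby
# Stand-in for the result of User.pluck(...): one row of [foo_count, bar_count].
# The numbers are invented for illustration.
SAMPLE_ROWS = [[4, 2]].freeze

# Ratio of bar to foo for each row. Returns nil when foo_count is zero or nil
# (SUM over no rows is NULL) instead of yielding Infinity or NaN.
def bar_to_foo_ratios(rows)
  rows.map do |foo_count, bar_count|
    foo = foo_count.to_i
    foo.zero? ? nil : bar_count.to_i / foo.to_f
  end
end

puts bar_to_foo_ratios(SAMPLE_ROWS).inspect
# prints [0.5]
```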

How to make ActiveRecord query unique by a column

I have a Company model that has many Disclosures. The Disclosure has columns named title, pdf and pdf_sha256.
class Company < ActiveRecord::Base
has_many :disclosures
end
class Disclosure < ActiveRecord::Base
belongs_to :company
end
I want to make it unique by pdf_sha256 and if pdf_sha256 is nil that should be treated as unique.
If it were an Array, I would write it like this:
companies_with_sha256 = company.disclosures.where.not(pdf_sha256: nil).group_by(&:pdf_sha256).map do |key,values|
values.max_by{|v| v.title.length}
end
companies_without_sha256 = company.disclosures.where(pdf_sha256: nil)
companies = companies_with_sha256 + companies_without_sha256
How can I get the same result with an ActiveRecord query?
It is possible to do it in one query by first getting a different id for each different pdf_sha256 as a subquery, then in the query getting the elements within that set of ids by passing the subquery as follows:
def unique_disclosures_by_pdf_sha256(company)
subquery = company.disclosures.select('MIN(id) as id').group(:pdf_sha256)
company.disclosures.where(id: subquery)
.or(company.disclosures.where(pdf_sha256: nil))
end
The great thing about this is that ActiveRecord relations are lazily loaded, so the subquery is not run on its own; it is merged into the main query to produce a single database query. That query then retrieves one disclosure per distinct pdf_sha256 plus all the ones that have pdf_sha256 set to nil.
In case you are curious, given a company, the resulting query will be something like:
SELECT "disclosures".* FROM "disclosures"
WHERE (
"disclosures"."company_id" = $1 AND "disclosures"."id" IN (
SELECT MIN(id) as id FROM "disclosures" WHERE "disclosures"."company_id" = $2 GROUP BY "disclosures"."pdf_sha256"
)
OR "disclosures"."company_id" = $3 AND "disclosures"."pdf_sha256" IS NULL
)
The great thing about this solution is that the returned value is an ActiveRecord relation, so it won't be loaded until you actually need it. You can also keep chaining queries onto it. For example, you can select only the id instead of the whole model and limit the number of results returned by the database:
unique_disclosures_by_pdf_sha256(company).select(:id).limit(10).each { |d| puts d }
You can achieve this by using the uniq method:
Company.first.disclosures.to_a.uniq(&:pdf_sha256)
This will return the disclosure records unique by the column "pdf_sha256".
Hope this helps you! Cheers
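One caveat worth checking before relying on uniq alone: Array#uniq with a block treats nil like any other key, so all disclosures without a pdf_sha256 collapse into a single record, whereas the question wants each of them kept. The sketch below shows the difference with a hypothetical DisclosureRow struct and invented sample data in place of real models.

```ruby
# Hypothetical in-memory stand-in for Disclosure records.
DisclosureRow = Struct.new(:id, :title, :pdf_sha256)

# Keeps every record without a hash, plus one record (the longest title)
# per distinct hash, mirroring the Array-based code in the question.
def unique_by_sha256(disclosures)
  without_hash, with_hash = disclosures.partition { |d| d.pdf_sha256.nil? }
  deduplicated = with_hash.group_by(&:pdf_sha256).map do |_sha, group|
    group.max_by { |d| d.title.length }
  end
  deduplicated + without_hash
end

SAMPLE_DISCLOSURES = [
  DisclosureRow.new(1, "Q1", "abc"),
  DisclosureRow.new(2, "Q1 report", "abc"),
  DisclosureRow.new(3, "Notice", nil),
  DisclosureRow.new(4, "Another notice", nil)
].freeze

# uniq alone merges the two nil records into one.
puts SAMPLE_DISCLOSURES.uniq(&:pdf_sha256).map(&:id).inspect   # prints [1, 3]
# The partition-based version keeps both of them.
puts unique_by_sha256(SAMPLE_DISCLOSURES).map(&:id).inspect    # prints [2, 3, 4]
```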
Assuming you are using Rails 5 you could chain a .or command to merge both your queries.
pdf_sha256_unique_disclosures = company.disclosures.where(pdf_sha256: nil).or(company.disclosures.where.not(pdf_sha256: nil))
Then you can proceed with your group_by logic.
However, I'm not exactly sure what the objective is in the example above, and I am curious to better understand how you would use the resulting companies variable.
If you wanted a hash of unique pdf_sha256 keys, including nil, each mapped to its unique disclosure document, you could try the following:
sorted_disclosures = company.disclosures.group_by(&:pdf_sha256).each_with_object({}) do |entries, hash|
hash[entries[0]] = entries[1].max_by{|v| v.title.length}
end
This should give you a hash similar to the group_by result, where the keys are all your unique pdf_sha256 values and each value is the disclosure with the longest title that matches that pdf_sha256.
Why not:
ids = Disclosure.select(:id, :pdf_sha256).distinct.map(&:id)
Disclosure.find(ids)
The id will be distinct either way since it's the primary key, so all you have to do is map the ids and find the Disclosures by id.
If you need a relation with distinct pdf_sha256, where you require no explicit conditions, you can use group for that -
scope :unique_pdf_sha256, -> { where.not(pdf_sha256: nil).group(:pdf_sha256) }
scope :nil_pdf_sha256, -> { where(pdf_sha256: nil) }
You could have used or, but the relation passed to it must be structurally compatible. So even though both scopes return the same type of relation, you cannot combine them with or.
Edit: To make them structurally compatible with each other, see @AlexSantos's answer.
Model.select(:rating)
Result of this is an array of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient)
Model.uniq.pluck(:rating)
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
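The difference between deduplicating objects and deduplicating plain values can be sketched in plain Ruby. FakeModel below is a hypothetical class, not an ActiveRecord model; it only illustrates that Array#uniq relies on hash and eql?, which default to object identity.

```ruby
# Hypothetical plain-Ruby class standing in for a model instance.
class FakeModel
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

def sample_models
  [FakeModel.new(5), FakeModel.new(5), FakeModel.new(3)]
end

# Three distinct objects, even though two share a rating.
puts sample_models.uniq.size                    # prints 3
# Plain values deduplicate as expected.
puts sample_models.map(&:rating).uniq.inspect   # prints [5, 3]
```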

Active Record - Chain Queries with OR

Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I dug into fuzzy_search and saw that it is translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
That means that if you don't care about preserving the order, it just filters the result by WHERE (("people"."first_name" % 'abc')).
Knowing that, you can simply write a query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
If you want ordering, please specify what order you would like after combining the two conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of the quotation marks in the SQL query when referring to tables and fields because it was giving me error messages when I kept them there, even if I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
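The ordering trick works in PostgreSQL because each people.id='…' expression is a boolean, and sorting a boolean descending puts the matching row first. Below is a small sketch of building that fragment, plus an alternative that reorders already-loaded records in Ruby. PersonRow, order_clause_for and in_id_order are hypothetical names for illustration, and the Integer() check is an addition that is not in the answer.

```ruby
# Builds the ORDER BY fragment from the answer for a list of ids.
# Integer() raises for anything that is not a whole number, so only
# numeric values are ever interpolated into the SQL string.
def order_clause_for(ids)
  ids.map { |id| "people.id='#{Integer(id)}' DESC" }.join(", ")
end

# Hypothetical in-memory stand-in for People records.
PersonRow = Struct.new(:id, :first_name)

# Alternative: reorder already-loaded records in Ruby to match the id list.
# Ids with no matching record are skipped.
def in_id_order(records, ids)
  by_id = records.each_with_object({}) { |record, hash| hash[record.id] = record }
  ids.map { |id| by_id[id] }.compact
end

puts order_clause_for([1, 12, 3])
# prints people.id='1' DESC, people.id='12' DESC, people.id='3' DESC

loaded = [PersonRow.new(3, "Cy"), PersonRow.new(1, "Al"), PersonRow.new(12, "Bo")]
puts in_id_order(loaded, [1, 12, 3]).map(&:id).inspect
# prints [1, 12, 3]
```

Reordering in Ruby returns an Array rather than a relation, so it only fits when no further query methods need to be chained.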

How can I query a Ruby array, like I do with Rails ActiveRecord?

I have a table, let's call it Widget.
I do some complex processing to get various type of Widgets. These end up in two different variables.
To keep things simple, let's say we have...
widgetsA = Widget.where("blah blah blah")
widgetsB = Widget.where("blah blah blah blah")
We can still perform ActiveRecord functions like .where on widgetsA and widgetsB.
Now, after retrieving the sets for A and B, I need to union them, and then perform additional ActiveRecord functions on them.
I want to do something like this...
widgetsAll = widgetsA | widgetsB
widgetsAll = widgetsAll.order("RANDOM()")
widgetsAll = widgetsAll.where(answers_count: 0).limit(10) + widgetsAll.where("answers_count > 0").limit(10)
This will take all the widgets (union) found in A & B, randomize them, and select 10 with answers and 10 without answers.
The problem is that I cannot use .order, because widgetsAll is no longer an ActiveRecord object but an Array, due to the widgetsAll = widgetsA | widgetsB line. How do I either
A) Union/Intersect two ActiveRecord sets into an ActiveRecord set, or
B) Order and perform a 'where' style query on an Array?
Either will solve the issue. I assume B is a bit better for performance, so I suppose that would be the better answer.
Any ideas?
Lastly, let's say the Widget table has columns id, name, description. In the end we want an ActiveRecord or Array (likely preferred) of everything.
EDIT: (Attempting to combine via SQL UNION... but not working)
w1 = Widget.where("id = 1 OR id = 2")
w2 = Widget.where("id = 2 OR id = 3")
w3 = Widget.from("(#{w1.to_sql} UNION #{w2.to_sql})")
PG::SyntaxError: ERROR: subquery in FROM must have an alias
LINE 1: SELECT "widgets".* FROM (SELECT "widgets".* FROM "widge...
I see two options:
1) Do the union in SQL: Instead of widgetsA | widgetsB, which returns an array, you can do a union in the database, so that the result is still a relation object:
Widget.from("(#{widgetsA.to_sql} UNION #{widgetsB.to_sql}) AS widgets")
2) Use normal array methods. Your example:
widgetsAll = widgetsAll.order("RANDOM()")
widgetsAll = widgetsAll.where(answers_count: 0).limit(10) + widgetsAll.where("answers_count > 0").limit(10)
would translate to something like this:
widgetsAll = widgetsAll.shuffle
widgetsAll = widgetsAll.select { |widget| widget.answers_count == 0 }.take(10) +
widgetsAll.select { |widget| widget.answers_count > 0 }.take(10)
Read more about Ruby arrays.
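The array-based option can be sketched end to end in plain Ruby. WidgetRow, union_by_id and sample_widgets are hypothetical names, and the sample data is invented; with real records, widgetsA | widgetsB would play the role of union_by_id.

```ruby
# Hypothetical in-memory stand-in for Widget records.
WidgetRow = Struct.new(:id, :answers_count)

# Union of two already-loaded arrays, deduplicated by id.
def union_by_id(first, second)
  (first + second).uniq(&:id)
end

# Shuffles the widgets, then returns up to `per_group` unanswered widgets
# followed by up to `per_group` answered ones.
def sample_widgets(widgets, per_group: 10, random: Random.new)
  shuffled = widgets.shuffle(random: random)
  unanswered = shuffled.select { |w| w.answers_count == 0 }.take(per_group)
  answered = shuffled.select { |w| w.answers_count > 0 }.take(per_group)
  unanswered + answered
end

group_a = [WidgetRow.new(1, 0), WidgetRow.new(2, 3)]
group_b = [WidgetRow.new(2, 3), WidgetRow.new(3, 0), WidgetRow.new(4, 1)]

all_widgets = union_by_id(group_a, group_b)
puts all_widgets.map(&:id).inspect   # prints [1, 2, 3, 4]
puts sample_widgets(all_widgets, per_group: 1).size   # prints 2
```

Everything here happens in memory, so both sets are loaded in full before any filtering, which may matter for large tables.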
Using the any_of gem you could do:
widgetsAll = Widget.where.any_of(widgetsA, widgetsB)
One way would be to do as follows:
widgetsAllActual = Widget.where(id: widgetsAll)
This is creating a new Widget::ActiveRecord_Relation collection containing all the elements in widgetsA and widgetsB, and allows for making further active record scoping.
Ref: https://stackoverflow.com/a/24448317/429758

Rails: how to correctly modify and save values of records in join table

I would like to understand why in Rails 4 (4.2.0) I see the following behaviour when manipulating data in a join table:
student.student_courses
returns all associated records of courses for a given user;
but the following will save changes
student.student_courses[0].status = "attending"
student.student_courses[0].save
while this will not
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
Why is that? Why do those two work differently, and is the first one the correct way to do it?
student.student_courses[0] and student.student_courses.find(1) are subtly different things.
When you say student.student_courses, you're just building a query in an ActiveRecord::Relation. Once you do something to that query that requires a trip to the database, the data is retrieved. In your case, that something is calling [] or find. When you call []:
student.student_courses[0]
your student will execute the underlying query and stash all the student_courses somewhere. You can see this by looking at:
> student.student_courses[0].object_id
# and again...
> student.student_courses[0].object_id
# same number is printed twice
But if you call find, only one object is retrieved and a new one is retrieved each time:
> student.student_courses.find(1).object_id
# and again...
> student.student_courses.find(1).object_id
# two different numbers are seen
That means that this:
student.student_courses[0].status = "attending"
student.student_courses[0].save
is the same as saying:
c = student.student_courses[0]
c.status = "attending"
c.save
whereas this:
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
is like this:
c1 = student.student_courses.find(1)
c1.status = "attending"
c2 = student.student_courses.find(1)
c2.save
When you use the find version, you're calling status= and save on entirely different objects and since nothing was actually changed in the one that you save, the save doesn't do anything useful.
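The same effect can be reproduced without Rails. In this sketch a hash plays the database table, FakeCourse plays a model instance, and FakeAssociation mimics the two behaviours described above; all three names are hypothetical stand-ins, not ActiveRecord classes.

```ruby
# A hash stands in for the database table: id => status.
FAKE_TABLE = { 1 => "enrolled" }

# Stand-in for a model instance that writes itself back on save.
class FakeCourse
  attr_accessor :status

  def initialize(id, status)
    @id = id
    @status = status
  end

  def save
    FAKE_TABLE[@id] = @status
  end
end

class FakeAssociation
  # Like [] on a loaded association: rows are loaded once and the
  # same objects are handed out on every call.
  def [](index)
    @loaded ||= FAKE_TABLE.map { |id, status| FakeCourse.new(id, status) }
    @loaded[index]
  end

  # Like find: builds a fresh object from the table on every call.
  def find(id)
    FakeCourse.new(id, FAKE_TABLE.fetch(id))
  end
end

courses = FakeAssociation.new

# The assignment and the save hit two different objects, so nothing changes.
courses.find(1).status = "attending"
courses.find(1).save
puts FAKE_TABLE[1]   # prints enrolled

# Both calls return the same cached object, so the change is saved.
courses[0].status = "attending"
courses[0].save
puts FAKE_TABLE[1]   # prints attending
```

With find, the usual fix is to hold on to the object in a variable, set the attribute on it, and call save on that same variable.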
student_courses is an ActiveRecord::Relation, basically a key => value store. The find method would only work on a model

Resources