Converting SQL query into Custom Relations Query in Rails - ruby-on-rails

I am trying to build a simple thesaurus app in Rails, in which a word in a table of words would be in a has-many, self-joined relationship to other words in the table, through a joiner table of synonym-pairs.
My SynonymPair class is built as follows:
class SynonymPair < ActiveRecord::Base
belongs_to :word1, class_name: :Word
belongs_to :word2, class_name: :Word
end
A crucial aspect of this thesaurus program is that it should not matter whether a word is in the word1 or word2 column; word1 is a synonym of word2, and vice versa.
In order for my Words class to return the SynonymPairs and Synonyms of a given word, I wrote a SQL query:
class Word < ActiveRecord::Base
def synonym_pairs
#joins :synonym_pairs and :words where either word1_id OR word2_id matches word.id.
sql = <<-SQL
SELECT synonym_pairs.id, synonym_pairs.word1_id, synonym_pairs.word2_id, words.word FROM synonym_pairs
JOIN words ON synonym_pairs.word1_id = words.id WHERE words.word = ?
UNION SELECT synonym_pairs.id, synonym_pairs.word1_id, synonym_pairs.word2_id, words.word FROM synonym_pairs
JOIN words ON synonym_pairs.word2_id = words.id WHERE words.word = ?
SQL
#returns synonym_pair objects from the result of sql query
DB[:conn].execute(sql,self.word,self.word).map do |element|
SynonymPair.find(element[0])
end
end
def synonyms
self.synonym_pairs.map do |element|
if element.word1 == self
element.word2
else
element.word1
end
end
end
end
This code works as intended. However, it does not take advantage of ActiveRecord associations. So I was wondering whether it would be possible to write a has_many :synonym_pairs / has_many :synonyms, through: :synonym_pairs custom relation query in the Word class, rather than writing out an entire SQL query as I did above. In other words, is it possible to convert my SQL query into a Rails custom relation query?
Note, I tried the following custom relations query:
class Word < ActiveRecord::Base
has_many :synonym_pairs, ->(word) { where("word1_id = ? OR word2_id = ?", word.id, word.id) }
has_many :synonyms, through: :synonym_pairs
end
But after creating a few Word/SynonymPair seeds, it returned an 'ActiveRecord::Associations::CollectionProxy' when I called word#synonym_pairs, and the following error when I called word#synonyms:
[17] pry(main)> w2 = Word.create(word: "w2")
=> #<Word:0x00007ffd522190b0 id: 7, word: "w2">
[18] pry(main)> sp1 = SynonymPair.create(word1:w1, word2:w2)
=> #<SynonymPair:0x00007ffd4fea2230 id: 6, word1_id: 6, word2_id: 7>
[19] pry(main)> w1.synonym_pairs
=> #<SynonymPair::ActiveRecord_Associations_CollectionProxy:0x3ffea7f783e4>
[20] pry(main)> w1.synonyms
ActiveRecord::HasManyThroughSourceAssociationNotFoundError: Could not find the source association(s) "synonym" or :synonyms in model SynonymPair. Try 'has_many :synonyms, :through => :synonym_pairs, :source => <name>'. Is it one of word1 or word2?
Any other ideas for getting a custom relation query, or any sort of self-join model working here?

Instead of a table of synonym pairs you can just create a standard M2M join table:
class Word
has_many :synonymities
has_many :synonyms, through: :synonymities
end
class Synonymity
belongs_to :word
belongs_to :synonym, class_name: 'Word'
end
class CreateSynonymities < ActiveRecord::Migration[6.0]
def change
create_table :synonymities do |t|
t.belongs_to :word, null: false, foreign_key: true
t.belongs_to :synonym, null: false, foreign_key: { to_table: :words }
end
end
end
While this solution requires twice as many rows in the join table, it might be well worth the tradeoff, as dealing with associations where the foreign keys are not fixed is a nightmare in ActiveRecord. This just works.
AR does not really let you provide the join SQL when using .eager_load and .includes. Loading records with a custom query and getting AR to make sense of the results, so that it treats the associations as loaded and avoids n+1 query issues, can be extremely hacky and time consuming. Sometimes you just have to build your schema around AR rather than trying to beat it into submission.
You would set up a synonym relationship between two words with:
happy = Word.create!(text: 'Happy')
jolly = Word.create!(text: 'Jolly')
# wrapping this in a single transaction is slightly faster than two transactions
Synonymity.transaction do
happy.synonyms << jolly
jolly.synonyms << happy
end
irb(main):019:0> happy.synonyms
Word Load (0.3ms) SELECT "words".* FROM "words" INNER JOIN "synonymities" ON "words"."id" = "synonymities"."synomym_id" WHERE "synonymities"."word_id" = $1 LIMIT $2 [["word_id", 1], ["LIMIT", 11]]
=> #<ActiveRecord::Associations::CollectionProxy [#<Word id: 2, text: "Jolly", created_at: "2020-07-06 09:00:43", updated_at: "2020-07-06 09:00:43">]>
irb(main):020:0> jolly.synonyms
Word Load (0.3ms) SELECT "words".* FROM "words" INNER JOIN "synonymities" ON "words"."id" = "synonymities"."synomym_id" WHERE "synonymities"."word_id" = $1 LIMIT $2 [["word_id", 2], ["LIMIT", 11]]
=> #<ActiveRecord::Associations::CollectionProxy [#<Word id: 1, text: "Happy", created_at: "2020-07-06 09:00:32", updated_at: "2020-07-06 09:00:32">]>
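To see why storing both directions keeps the lookup simple, here is a minimal plain-Ruby sketch with no ActiveRecord or database involved. The array of rows stands in for the synonymities table, and the names SynonymityRow, add_synonym_pair and synonym_ids_for are hypothetical, introduced only for this illustration.

```ruby
# In-memory stand-in for the synonymities join table.
# Each row mirrors the two columns from the migration: word_id and synonym_id.
SynonymityRow = Struct.new(:word_id, :synonym_id)

# Hypothetical helper: store the relationship in both directions,
# which is why the table needs two rows per pair.
def add_synonym_pair(rows, a_id, b_id)
  rows << SynonymityRow.new(a_id, b_id)
  rows << SynonymityRow.new(b_id, a_id)
  rows
end

# Hypothetical helper: the lookup only ever filters on word_id,
# so no "either column" (OR) condition is needed.
def synonym_ids_for(rows, word_id)
  rows.select { |row| row.word_id == word_id }.map(&:synonym_id)
end

rows = []
add_synonym_pair(rows, 1, 2) # e.g. happy (1) <-> jolly (2)
add_synonym_pair(rows, 1, 3)

p synonym_ids_for(rows, 1) # => [2, 3]
p synonym_ids_for(rows, 2) # => [1]
```

The cost is the doubled row count; the benefit is that every lookup uses one fixed foreign key, which is the shape a standard has_many :through expects.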

If you really want to set up associations where the record can be in either column on the join table, you need one has_many association and one indirect association for each potential foreign key.
Bear with me here as this gets really crazy:
class Word < ActiveRecord::Base
has_many :synonym_pairs_as_word_1,
class_name: 'SynonymPair',
foreign_key: 'word1_id'
has_many :synonym_pairs_as_word_2,
class_name: 'SynonymPair',
foreign_key: 'word2_id'
# words found in the word2 column of pairs where this word is word1
has_many :word_1_synonyms,
through: :synonym_pairs_as_word_1,
class_name: 'Word',
source: :word2
# words found in the word1 column of pairs where this word is word2
has_many :word_2_synonyms,
through: :synonym_pairs_as_word_2,
class_name: 'Word',
source: :word1
def synonyms
# .or takes another relation, not a hash of conditions
self.class.where(id: word_1_synonyms).or(self.class.where(id: word_2_synonyms))
end
end
Since synonyms here still is not really an association you still have a potential n+1 query issue if you are loading a list of words and their synonyms.
While you can eager load word_1_synonyms and word_2_synonyms and combine them (by casting into arrays) this poses a problem if you need to order the records.
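As a rough illustration of that last point, the sketch below combines two already-loaded collections as plain arrays and sorts them in Ruby. A Struct stands in for the Word model, and the names LoadedWord and combined_synonyms are assumptions made for this example. Because the combined result is no longer a single query, any ordering has to happen in memory.

```ruby
# Stand-in for Word records that were already eager loaded.
LoadedWord = Struct.new(:id, :word)

# Hypothetical helper: merge both sides, drop duplicates by id,
# and order by the word text in Ruby rather than in SQL.
def combined_synonyms(word_1_synonyms, word_2_synonyms)
  (word_1_synonyms.to_a + word_2_synonyms.to_a)
    .uniq(&:id)
    .sort_by(&:word)
end

as_word_1 = [LoadedWord.new(2, "jolly"), LoadedWord.new(3, "cheerful")]
as_word_2 = [LoadedWord.new(4, "glad"), LoadedWord.new(3, "cheerful")]

p combined_synonyms(as_word_1, as_word_2).map(&:word)
# => ["cheerful", "glad", "jolly"]
```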

You are probably looking for the scope ActiveRecord class method:
class SynonymPair < ActiveRecord::Base
belongs_to :word1, class_name: :Word
belongs_to :word2, class_name: :Word
scope :with_word, -> (word) { where(word1: word).or(where(word2: word)) }
end
class Word < ActiveRecord::Base
scope :synonyms_for, -> (word) do
pairs = SynonymPair.with_word(word)
where(id: pairs.select(:word1_id)).where.not(id: word.id).or(
where(id: pairs.select(:word2_id)).where.not(id: word.id))
end
def synonyms
Word.synonyms_for(self)
end
end

Related

Query deeply nested relations in rails

We have a lot of through relations in a model. Rails correctly joins the relations, but I am struggling to figure out how to apply a where condition to the joined table using Active Record.
For instance:
class Model
has_one :relation1
has_one :relation2, through: :relation1
has_one :relation3, through: :relation2
end
If all the relations are different models, we can easily query using where. The issue arises when Rails starts aliasing the tables.
For instance, Model.joins(:relation3).where(relation3: {name: "Hello"}) won't work, as no table is aliased relation3.
Is it possible using active record, or would I have to achieve it using arel or sql?
I am using rails 6.0.4.
In a simple query where a table is only referenced once there is no alias and the table name is just used:
irb(main):023:0> puts City.joins(:country).where(countries: { name: 'Portugal'})
City Load (0.7ms) SELECT "cities".* FROM "cities" INNER JOIN "regions" ON "regions"."id" = "cities"."region_id" INNER JOIN "countries" ON "countries"."id" = "regions"."country_id" WHERE "countries"."name" = $1 [["name", "Portugal"]]
In a more complex scenario where a table is referenced more than once, the scheme seems to be association_name_table_name and association_name_table_name_join.
class Pet < ApplicationRecord
has_many :parenthoods_as_parent,
class_name: 'Parenthood',
foreign_key: :parent_id
has_many :parenthoods_as_child,
class_name: 'Parenthood',
foreign_key: :child_id
has_many :parents, through: :parenthoods_as_child
has_many :children, through: :parenthoods_as_child
end
class Parenthood < ApplicationRecord
belongs_to :parent, class_name: 'Pet'
belongs_to :child, class_name: 'Pet'
end
irb(main):014:0> puts Pet.joins(:parents, :children).to_sql
# auto-formatted and edited for readability
SELECT "pets".*
FROM "pets"
INNER JOIN "parenthoods"
ON "parenthoods"."child_id" = "pets"."id"
INNER JOIN "pets" "parents_pets"
ON "parents_pets"."id" = "parenthoods"."parent_id"
INNER JOIN "parenthoods" "parenthoods_as_children_pets_join"
ON "parenthoods_as_children_pets_join"."child_id" = "pets"."id"
INNER JOIN "pets" "children_pets"
ON "children_pets"."id" =
"parenthoods_as_children_pets_join"."child_id"
For more advanced queries you often need to write your own joins with Arel or strings if you need to reliably know the aliases used.

ActiveRecord custom has_one relations

I'm using Rails 5.0.0.1 at the moment and I've come across an issue with ActiveRecord relations while reducing the number of my DB requests.
Right now I have:
Model A (let's say 'Orders'), Model B ('OrderDispatches'), Model C ('Person') and Model D ('PersonVersion').
Table 'people' consists only of 'id' and 'hidden' flag, rest of the people data sits in 'person_versions' ('name', 'surname' and some things that can change over time, like scientific title).
Every Order has 'receiving_person_id' as for the person which recorded order in DB and every OrderDispatch has 'dispatching_person_id' for the person, which delivered order. Also Order and OrderDispatch have creation time.
One Order has many dispatches.
The straightforward relation is thus:
has_many :receiving_person, through: :person, foreign_key: "receiving_person_id", class_name: 'PersonVersion'
But when I list my orders with their dispatches I have to deal with an N+1 situation, because to find the accurate PersonVersion (according to the creation date of the Order/OrderDispatch) for every receiving_person_id and dispatching_person_id I'm making additional requests.
SELECT *
FROM person_versions
WHERE effective_date_from <= ? AND person_id = ?
ORDER BY effective_date_from
LIMIT 1
First '?' is Order/OrderDispatch creation date and second '?' is receiving/ordering person id.
Using this query I'm getting accurate person data for the time of Order/OrderDispatch creation.
It's fairly easy to write query with subquery (or subqueries, as Order comes with OrderDispatches on one list) in raw SQL, but I have no idea how to do that using ActiveRecord.
I tried to write a custom has_one relation, and this is as far as I've come:
has_one :receiving_person, -> {
where("person_versions.id = (
SELECT id
FROM person_versions sub_pv1
WHERE sub_pv1.date_from <= orders.receive_date
AND sub_pv1.person_id = orders.receiving_person_id
LIMIT 1)")},
through: :person, class_name: "PersonVersion", primary_key: "person_id", source: :person_version
It works if I use this only for receiving or dispatching person. When I try to eager_load this for joined orders and order_dispatches tables then one of 'person_versions' has to be aliased and in my custom where clause it isn't (no way to predict if it's gonna be aliased or not, it's used both ways).
A different approach would be this:
has_one :receiving_person, -> {
where(:id => PersonVersion.where("
person_versions.date_from <= orders.receive_date
AND person_versions.person_id = orders.receiving_person_id").order(date_from: :desc).limit(1)) },
through: :person, class_name: "PersonVersion", primary_key: "person_id", source: :person_version
Raw 'person_versions' in the where is OK because it's in a subquery, and using the symbol ':id' makes the generated SQL use the correct aliases for the person_versions table joined to orders and order_dispatches. But I get 'IN' instead of 'equals' for the person_versions.id subquery comparison, and MySQL can't do LIMIT in subqueries that are used with IN/ANY/ALL, so I just get a random person_version.
So TL;DR: I need to transform 'has_many through' to 'has_one' using a custom 'where' clause which looks for the newest record amongst those whose date is earlier than the originating record's creation.
EDIT: Another TL;DR for simplification
def receiving_person
receiving_person_id = self.receiving_person_id
receive_date = self.receive_date
PersonVersion.where(:person_id => receiving_person_id, :hidden => 0).where.has{date_from <= receive_date}.order(date_from: :desc, id: :desc).first
end
I need this method converted to a 'has_one' relation so that I can 'eager_load' it.
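The selection rule in the method above can be sketched in plain Ruby over in-memory rows, independent of ActiveRecord. The names VersionRow and version_as_of are hypothetical and exist only for this illustration; the fields mirror the ones the method filters and orders on.

```ruby
require "date"

# Stand-in for a person_versions row; only the fields used by the lookup.
VersionRow = Struct.new(:id, :person_id, :date_from, :hidden)

# Hypothetical helper mirroring the method above: among the non-hidden
# versions of one person that started on or before the given date,
# pick the newest one (ties broken by the higher id).
def version_as_of(versions, person_id, date)
  versions
    .select { |v| v.person_id == person_id && v.hidden == 0 && v.date_from <= date }
    .max_by { |v| [v.date_from, v.id] }
end

versions = [
  VersionRow.new(1, 10, Date.new(2016, 1, 1), 0),
  VersionRow.new(2, 10, Date.new(2016, 6, 1), 0),
  VersionRow.new(3, 10, Date.new(2017, 1, 1), 0),
  VersionRow.new(4, 11, Date.new(2016, 3, 1), 0)
]

match = version_as_of(versions, 10, Date.new(2016, 9, 15))
puts match.id # => 2
```

Whatever form the has_one takes, it has to reproduce this "latest row not later than the parent's date" rule per order or dispatch, which is what makes it hard to express as a plain scoped association.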
I would change your schema, as it conflicts with your business domain; restructuring it would alleviate your n+1 problem:
class Person < ActiveRecord::Base
has_many :versions, class_name: 'PersonVersion', dependent: :destroy
has_one :current_version, class_name: 'PersonVersion'
end
class PersonVersion < ActiveRecord::Base
belongs_to :person, inverse_of: :versions
# newest version first, so has_one :current_version picks the latest row
default_scope -> {
order("person_versions.id desc")
}
end
class Order < ActiveRecord::Base
has_many :order_dispatches, dependent: :destroy
end
class OrderDispatch < ActiveRecord::Base
belongs_to :order
belongs_to :receiving_person_version, class_name: 'PersonVersion'
# source is needed because PersonVersion has no :receiving_person association
has_one :receiving_person, through: :receiving_person_version, source: :person
end

Polymorphic Association On UUID and Integer Fields

Given tables with integer and uuid primary keys what is the best way to integrate a polymorphic join (has_many)? For example:
class Interest < ActiveRecord::Base
# id is an integer
has_many :likes, as: :likeable
end
class Post < ActiveRecord::Base
# id is a UUID
has_many :likes, as: :likeable
end
class User < ActiveRecord::Base
has_many :likes
has_many :posts, through: :likes, source: :likeable, source_type: "Post"
has_many :interests, through: :likes, source: :likeable, source_type: "Interest"
end
class Like < ActiveRecord::Base
# likeable_id and likeable_type are strings
belongs_to :likeable, polymorphic: true
belongs_to :user
end
Many queries work:
interest.likes
post.likes
user.likes
However:
user.interests
Gives:
PG::UndefinedFunction: ERROR: operator does not exist: integer = character varying
LINE 1: ...interests" INNER JOIN "likes" ON "interests"."id" = "likes"....
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
: SELECT "interests".* FROM "interests" INNER JOIN "likes" ON "interests"."id" = "likes"."likeable_id" WHERE "likes"."user_id" = $1 AND "likes"."likeable_type" = $2
What's the best way to ensure the proper casting happens?
This is an old question, but here's my recommendation.
This is more of an architecture problem. Don't combine UUID ids and integer ids; it gets messy real fast. If you can, migrate the integer ids to UUIDs or revert the UUIDs to integer ids.
My experience has been that the best solution is probably to make use of the rather nice Friendly ID gem: https://github.com/norman/friendly_id
In the off chance this link is broken in the future: it is basically a slug generation/management tool. The slug would use this kind of route path: posts/this-is-a-potential-slug instead of posts/1, but nothing prevents you from using posts/<UUID here> or posts/<alphanumeric string here>.
Typically if you are using UUIDs it's because you don't want to show the sequential integers. Friendly ID works well to avoid that issue.
There's no means to specify the necessary cast using Rails. Instead, add a generated column with the cast, and declare an extra belongs_to association to use it. For example, with this in a migration:
add_column :interests, :_id_s, 'TEXT GENERATED ALWAYS AS (id::text) STORED'
add_index :interests, :_id_s
and this in your models:
class Like < ActiveRecord::Base
belongs_to :_likeable_cast, polymorphic: true, primary_key: :_id_s, foreign_key: :likeable_id, foreign_type: :likeable_type
end
class User < ActiveRecord::Base
has_many :interests, through: :likes, source: :_likeable_cast, source_type: "Interest"
end
then user.interests joins through the alternative association, i.e. using the generated column with the cast.
I suggest using a column type of text rather than varchar for the likeable_id column, to avoid unnecessary conversions during the join and ensure the index is used.
Can you describe your likes table? I suppose that it contains
user_id as integer,
likeable_id as integer,
likeable_type as integer
any third-party fields
So, technically, you cannot create the same polymorphic association for both a UUID string and an integer id using the two fields likeable_id and likeable_type.
As a solution, you can simply add an integer id as the primary key of the posts table instead of the uuid. If you do not want to show the post id in the URL, or for other security reasons, you can still use the uuid there as before.
You might be able to define your own method to retrieve likes in your Interest model.
def likes
Like.where("likeable_type = ? AND likeable_id = ?::text", self.class.name, id)
end
The problem with this solution is that you're not defining the association, so something like 'has_many through' won't work, you'd have to define those methods/queries yourself as well.
Have you considered something like playing around with typecasting the foreign- or primary-key in the association macro? E.g. has_many :likes, foreign_key: "id::UUID" or something similar.
Tested on Rails 6.1.4
Having a likeable_id as string works well and rails takes care of the casting of IDs.
Here is an example of my code
Migration for adding polymorphic "owner" to timeline_event model
class AddOwnerToTimelineEvent < ActiveRecord::Migration[6.1]
def change
add_column :timeline_events, :owner_type, :string, null: true
add_column :timeline_events, :owner_id, :string, null: true
end
end
Polymorphic model
class TimelineEvent < ApplicationRecord
belongs_to :owner, polymorphic: true
end
Now we have 2 owners: Contact, which has a bigint id, and Company, which has a uuid id. You can see in the SQL that Rails has already cast them to strings:
contact.timeline_events
TimelineEvent Load (5.8ms) SELECT "timeline_events"."id", "timeline_events"."at_time",
"timeline_events"."created_at", "timeline_events"."updated_at",
"timeline_events"."owner_type", "timeline_events"."owner_id" FROM
"timeline_events" WHERE "timeline_events"."owner_id" = $1 AND
"timeline_events"."owner_type" = $2 [["owner_id", "1"],
["owner_type", "Contact"]]
company.timeline_events
TimelineEvent Load (1.3ms) SELECT "timeline_events"."id", "timeline_events"."action",
"timeline_events"."at_time", "timeline_events"."created_at",
"timeline_events"."updated_at", "timeline_events"."owner_type",
"timeline_events"."owner_id" FROM "timeline_events" WHERE
"timeline_events"."owner_id" = $1 AND "timeline_events"."owner_type" =
$2 [["owner_id", "0b967b7c-8b15-4560-adac-17a6970a4274"],
["owner_type", "Company"]]
There is a caveat, though: when you load timeline_events for a particular owner type and Rails cannot do the type casting for you, you have to do the casting yourself. For example, loading timeline events where the owner is a Company:
TimelineEvent.where(
"(owner_type = 'Company' AND uuid(owner_id) in (:companies))",
companies: Company.select(:id)
)
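The idea behind the string column can be sketched in plain Ruby, without Rails. The rows below stand in for timeline_events, and the names EventRow and events_for are hypothetical. The point is only that matching happens on the string form of the owner's id, so integer and UUID keys can share one column.

```ruby
# Stand-in for timeline_events rows, where owner_id is stored as a string.
EventRow = Struct.new(:id, :owner_type, :owner_id)

# Hypothetical helper: compare on the string form of the owner's id so that
# integer and UUID primary keys can share one owner_id column.
def events_for(events, owner_type, owner_id)
  events.select { |e| e.owner_type == owner_type && e.owner_id == owner_id.to_s }
end

events = [
  EventRow.new(1, "Contact", "1"),
  EventRow.new(2, "Company", "0b967b7c-8b15-4560-adac-17a6970a4274"),
  EventRow.new(3, "Contact", "2")
]

p events_for(events, "Contact", 1).map(&:id)
# => [1]
p events_for(events, "Company", "0b967b7c-8b15-4560-adac-17a6970a4274").map(&:id)
# => [2]
```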
I'm not good with ActiveRecord, and this is definitely not the answer you're looking for, but if you need a temporary *ugly workaround till you can find a solution, you could override the getter :
class User
def interests
self.likes.select { |like| like.likeable_type == 'Interest' }.map(&:likeable)
end
end
*Very ugly because it will load all the user's likes and then filter them in Ruby
EDIT I found this interesting article :
self.likes.inject([]) do |result, like|
result << like.likeable if like.likeable_type == 'Interest'
result
end

Rails Associations has_one Latest Record

I have the following model:
class Section < ActiveRecord::Base
belongs_to :page
has_many :revisions, :class_name => 'SectionRevision', :foreign_key => 'section_id'
has_many :references
delegate :position, to: :current_revision
def current_revision
self.revisions.order('created_at DESC').first
end
end
Where current_revision is the most recently created revision. Is it possible to turn current_revision into an association so I can perform query like Section.where("current_revision.parent_section_id = '1'")? Or should I add a current_revision column to my database instead of trying to create it virtually or through associations?
To get the last on a has_many, you would want to do something similar to @jvnill, except add a scope with an ordering to the association:
has_one :current_revision, -> { order created_at: :desc },
class_name: 'SectionRevision', foreign_key: :section_id
This will ensure you get the most recent revision from the database.
You can change it to an association, but ordering on a has_one or belongs_to association is usually interpreted wrongly when used in queries. In your question, when you turn that into an association, that would be
has_one :current_revision, class_name: 'SectionRevision', foreign_key: :section_id, order: 'created_at DESC'
The problem with this is that when you try to combine this with other queries, it will normally give you the wrong record.
>> record.current_revision
# gives you the last revision
>> Section.joins(:current_revision).where(section_revisions: { id: 1 })
# searches for the revision where the id is 1 ordered by created_at DESC
So I suggest adding a current_revision_id column instead.
As @jvnill mentions, solutions using order stop working when making bigger queries, because order's scope is the full query and not just the association.
The solution here requires accurate SQL:
has_one :current_revision, -> { where("NOT EXISTS (select 1 from section_revisions sr where sr.id > section_revisions.id and sr.section_id = section_revisions.section_id LIMIT 1)") }, class_name: 'SectionRevision', foreign_key: :section_id
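What the NOT EXISTS condition selects can be sketched over in-memory rows in plain Ruby. The names RevisionRow and current_revisions are hypothetical and only illustrate the rule. Like the SQL above, the sketch treats the highest id within a section as the latest revision, which assumes ids grow with creation time.

```ruby
# Stand-in for section_revisions rows.
RevisionRow = Struct.new(:id, :section_id)

# Hypothetical helper: keep a revision only if no other revision of the
# same section has a higher id, which is what the NOT EXISTS subquery asks.
def current_revisions(revisions)
  revisions.reject do |rev|
    revisions.any? { |other| other.section_id == rev.section_id && other.id > rev.id }
  end
end

revisions = [
  RevisionRow.new(1, 1),
  RevisionRow.new(2, 1),
  RevisionRow.new(3, 2),
  RevisionRow.new(4, 1),
  RevisionRow.new(5, 2)
]

p current_revisions(revisions).map(&:id) # => [4, 5]
```

Because the condition filters rows instead of ordering them, it keeps selecting one row per section even when the association is joined into a larger query.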
I understand you want to get the sections where the last revision of each section has a parent_section_id = 1;
I have a similar situation. First, this is the SQL (please think of categories as your sections, posts as revisions, and user_id as parent_section_id; sorry that I haven't adapted the code to your schema, but I have to go):
SELECT categories.*, MAX(posts.id) as M
FROM `categories`
INNER JOIN `posts`
ON `posts`.`category_id` = `categories`.`id`
WHERE `posts`.`user_id` = 1
GROUP BY posts.user_id
having M = (select id from posts where category_id=categories.id order by id desc limit 1)
And this is the query in Rails:
Category.select("categories.*, MAX(posts.id) as M").joins(:posts).where(:posts => {:user_id => 1}).group("posts.user_id").having("M = (select id from posts where category_id=categories.id order by id desc limit 1)")
This works, but it is ugly. I think the best way is to "cut" the query, but if you have too many sections that would be a problem while looping through them. You can also place this query in a static method. Your first idea, having a revision_id in your sections table, would help optimize the query, but it drops normalization (sometimes that is needed), and you will have to update this field whenever a new revision is created for that section (so if you are going to be making a lot of revisions in a huge database, it may be a bad idea on a slow server).
UPDATE
I'm back hehe, I was making some tests, and check this out:
def last_revision
revisions.last
end
def self.last_sections_for(parent_section_id)
ids = Section.includes(:revisions).collect{ |c| c.last_revision.id rescue nil }.delete_if {|x| x == nil}
Section.select("sections.*, MAX(revisions.id) as M")
.joins(:revisions)
.where(:revisions => {:parent_section_id => parent_section_id})
.group("revisions.parent_section_id")
.having("M IN (?)", ids)
end
I made this query and it worked with my tables (I hope I named the params well; it is the same Rails query as before, but I changed the subquery in the having clause for optimization). Watch out for the group clause. The includes makes it more efficient on large datasets. Sorry, I couldn't find a way to express this as a has_one relation. I would go with this, but also reconsider the column that you mention at the beginning.
If your database supports DISTINCT ON
class Section < ApplicationRecord
has_one :current_revision, -> { merge(SectionRevision.latest_by_section) }, class_name: "SectionRevision", inverse_of: :section
end
class SectionRevision < ApplicationRecord
belongs_to :section
scope :latest_by_section, -> do
query = arel_table
.project(Arel.star)
.distinct_on(arel_table[:section_id])
.order(arel_table[:section_id].asc, arel_table[:created_at].desc)
revisions = Arel::Nodes::TableAlias.new(
Arel.sql(format("(%s)", query.to_sql)), arel_table.name
)
from(revisions)
end
end
It works with preloading
Section.includes(:current_revision)

ActiveRecord::Relation cannot use named association in where clause of join

How do I use a named association in the where clause associated with a join?
class Pet < ActiveRecord::Base
belongs_to :owner
end
class Owner < ActiveRecord::Base
has_many :dogs, :class_name => 'Pet', :foreign_key => :owner_id
end
Owner.joins(:dogs).where(:dogs => {:name => 'fido'}).to_sql
generates:
"SELECT `owners`.* FROM `owners` INNER JOIN `pets` ON `pets`.`owner_id` = `owners`.`id` WHERE (`dogs`.`name` = 'fido')"
Note that the WHERE clause is looking in the dogs table instead of the pets table
For reference:
http://guides.rubyonrails.org/active_record_querying.html#specifying-conditions-on-the-joined-tables
It appears this is the expected behavior: you need to specify the table name in the hash, not the association name, i.e. where(:pets => {:name => 'fido'}). This is a little unfortunate, because I think it'd be useful to construct queries based more on the model definition and less on the schema they sit in front of.

Resources