I started using pg_search in my Rails project and I'm confused about building indexes.
I have a model with two attributes, where I do a full-text search: title and description. Currently, I have this setup:
class Argumentation < ApplicationRecord
include PgSearch
pg_search_scope :searchfor, :against => [:title, :description]
end
This works, but the query is slow. It needs to be faster, and my first thought was to add an index.
There is a site dedicated to building indexes for full-text search: https://github.com/Casecommons/pg_search/wiki/Building-indexes
I want to follow the guide, but some things confuse me:
1. The guide tells me to add a "tsvector" column to the table:
add_column :posts, :tsv_body, :tsvector
But isn't this line of code adding two columns: tsv_body and tsvector? Or is tsv_body a placeholder, for example description or title in my case?
After that, the index should be added to tsv_body like this:
add_index(:posts, :tsv_body, using: 'gin')
Then, the guide talks about triggers. Why are triggers needed? If you add an index to a string attribute, triggers aren't needed (if I'm not mistaken).
In the trigger itself, there is "pg_catalog.english".
create_trigger(compatibility: 1).on(:posts).before(:insert, :update) do
"new.tsv_body := to_tsvector('pg_catalog.english', coalesce(new.body,''));"
end
Since there are many languages in my attributes, not just English, I wonder: is it even possible to add an index if there are multiple languages?
0) You should ask separate questions as separate questions ;)
1) add_column :posts, :tsv_body, :tsvector
isn't this line of code adding two columns: tsv_body and tsvector?
No - add_column takes several arguments to add a single column. The first argument is the table name, the second is the name of the new column, and the third is the column type (e.g. string, integer, or in this case tsvector).
2) Why are triggers needed?
A trigger watches for something to happen in your database, then does something in response. In this case, the trigger fires before each insert or update on the table and recomputes tsv_body from the current body, so the stored tsvector (and the index built on it) stays in sync with the text. Unlike an index on a plain string column, tsv_body is a separate, derived column, so something has to keep it up to date. This is better than periodically rebuilding the column and index over the entire table, which is potentially huge, takes a long time, and would be "updating" rows that didn't actually change.
3) Is it even possible to add an index, if there are multiple languages?
Yes. I'm not very familiar with the details, but the fact that you specify the language name when building the index suggests that you can have an index for other languages too.
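To make points 2 and 3 a bit more concrete, here is a rough sketch (not taken from the pg_search guide) of a plain-Ruby helper that builds the trigger SQL using PostgreSQL's built-in tsvector_update_trigger and tsvector_update_trigger_column functions. The second one reads the text search configuration from a column on each row, which is one way to handle rows in different languages. The table name argumentations is assumed from the model name, tsv_body is borrowed from the guide, and tsv_config is a hypothetical column name (it would need to be of type regconfig).

```ruby
# Builds a CREATE TRIGGER statement that keeps a tsvector column current.
# With config_column, the text search configuration is read per row from
# that column; otherwise a fixed configuration name is used for every row.
def tsvector_trigger_sql(table:, tsv_column:, source_columns:,
                         config: "pg_catalog.english", config_column: nil)
  function, config_arg =
    if config_column
      ["tsvector_update_trigger_column", config_column]
    else
      ["tsvector_update_trigger", "'#{config}'"]
    end
  args = [tsv_column, config_arg, *source_columns].join(", ")

  <<~SQL
    CREATE TRIGGER #{table}_#{tsv_column}_update
    BEFORE INSERT OR UPDATE ON #{table}
    FOR EACH ROW EXECUTE PROCEDURE #{function}(#{args});
  SQL
end

# One fixed language for every row (table and column names are assumptions).
FIXED_LANGUAGE_SQL = tsvector_trigger_sql(
  table: "argumentations", tsv_column: "tsv_body",
  source_columns: %w[title description]
)

# Language chosen per row from a hypothetical regconfig column "tsv_config".
PER_ROW_LANGUAGE_SQL = tsvector_trigger_sql(
  table: "argumentations", tsv_column: "tsv_body",
  source_columns: %w[title description], config_column: "tsv_config"
)

puts FIXED_LANGUAGE_SQL
puts PER_ROW_LANGUAGE_SQL
```

The resulting strings could be run from a migration with execute. Either way the GIN index stays on the single tsv_body column; only the way that column is filled changes.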
I'm currently working on a product where, for various reasons, we decided to drop the email column from a specific table and delegate it to an associated table using the Active Record delegate method:
https://apidock.com/rails/Module/delegate
My question is: what approach should I follow to make sure that all the where clauses that use the email column are delegated as well? It takes a lot of time to check every place where we used table.where(dropped_column: value).
Example
class Station < ActiveRecord::Base
has_one :contact
delegate :email, to: :contact
end
class Contact < ActiveRecord::Base
belongs_to :station
end
I know that it is possible to switch all the queries from Station.where(email: value) to Station.joins(:contact).where('contacts.email': value), but this approach will take very long. A where clause can also be written in many different ways, so searching through the source code and updating each one is not reliable enough to cover all the cases.
If anyone has faced a similar situation and managed to solve it in a way that saves time and avoids bugs, I would be very glad to hear which approaches you followed.
Thanks.
Rails version: '5.2.3'
what is the approach to follow in order to make sure that all the where clauses that use the email column are also delegated as well
You cannot "delegate" SQL commands. You need to update them all.
I know that it is possible to switch all the queries from Station.where(email: value) to Station.joins(:contact).where('contacts.email': value) but this approach will take very long
Yep, that's what you'll need to do, sorry!
If anyone has faced a similar situation and managed to solve it in a way that saves time and avoids bugs, I would be very glad to hear which approaches you followed.
Before dropping the column, you can first do this:
class Station < ActiveRecord::Base
self.ignored_columns = ['email']
has_one :contact
delegate :email, to: :contact
# ...
end
This way, you can stop the application from being able to access the column, without actually deleting it from the database (like a 'soft delete'), which makes it easy to revert.
Do this on a branch, and make all the specs green. (If you have good test coverage, you're done! But if your coverage is poor, there may be errors after deploying...)
If it goes wrong after deploying, you could revert the whole PR, or just comment out the ignored_columns again, temporarily, while you push fixes.
Then finally, once your application is running smoothly without errors, drop the column.
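If test coverage is patchy, a crude text scan can complement (not replace) the ignored_columns step by listing candidate call sites to review. The following is only a rough sketch in plain Ruby: the method name column_references and the list of query methods are my own choices, the match is line-based and deliberately over-inclusive (it will also flag lines you have already migrated to contacts.email), and it will miss queries built dynamically or split across lines.

```ruby
# Query-style methods worth inspecting; extend this list as needed.
QUERY_METHODS = %w[where find_by order pluck select group].freeze

# Returns [line_number, stripped_line] pairs for lines that call one of the
# query methods and also mention the column name later on the same line.
def column_references(source, column)
  methods = QUERY_METHODS.join("|")
  pattern = /\b(?:#{methods})\b.*\b#{Regexp.escape(column)}\b/
  source.each_line.with_index(1)
        .select { |line, _number| line.match?(pattern) }
        .map { |line, number| [number, line.strip] }
end

# A small made-up snippet standing in for a real source file.
SAMPLE = <<~'RUBY'
  Station.where(email: value)
  Station.where("email LIKE ?", pattern)
  station.name
  Station.order(:email)
RUBY

column_references(SAMPLE, "email").each do |number, line|
  puts "#{number}: #{line}"
end
```

To scan a real project you could feed it File.read(path) for each file under app/ and lib/.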
I'm having trouble opening the ActiveAdmin (AA) edit page for a model that has a lot of associations.
The page had around 50 selects, all rendered at once, and it turned out to be extremely slow.
After reading "ActiveAdmin: How to handle large associations" I considered using select2 instead of the usual select, but things got even worse.
That was because Rails spent most of the time generating views, not querying the database. So with the fancier select2 it unsurprisingly spends even more time in views.
With that in mind, I decided not to have select inputs on that page at all. So I'll edit the "main" object on that slow page, while the objects connected through has_and_belongs_to_many will be edited separately.
But after that decision I ran into a problem: how should I edit tables with a composite primary key, that is, not just id but :person_id and :organization_id?
By default AA generates URLs like this: /admin/person_organizations/:id/edit, but I need something like this: /admin/person_organizations/:person_id/:organization_id/edit
Any ideas?
ActiveAdmin should be able to handle custom primary keys by default. Just be sure that you add the definition to your model like this:
class Person < ActiveRecord::Base
self.primary_key = 'person_id'
end
After a while I decided that I don't even need multiple keys here, since Rails generates an artificial id field for habtm tables. And since my goal was just to edit this table, I ended up doing it the standard way.
I'm wanting to use UUIDs in an app I'm building and am running into a bit of a problem. Due to UUIDs (v4) not being sortable because they're randomly generated, I'm trying to override ActiveRecord::Base#first, but Rails isn't too pleased with that. It yells at me saying ArgumentError: You tried to define a scope named "first" on the model "Item", but Active Record already defined a class method with the same name. Do I have to use a different method if I want to sort and have it sort correctly?
Here's the sauce:
# lib/sortable_uuid.rb
module SortableUUID
def self.included(base)
base.class_eval do
scope :first, -> { order("created_at").first }
scope :last, -> { order("created_at DESC").first }
end
end
end
# app/models/item.rb
class Item < ActiveRecord::Base
include SortableUUID
end
Rails 4.2, Ruby 2.2.2
Reference:
http://blog.nakonieczny.it/posts/rails-support-for-uuid/
http://linhmtran168.github.io/blog/2014/03/17/postgres-uuid-in-rails/ ( Drawbacks section )
Rails 6 (currently in version 6.0.0rc1) comes to the rescue with implicit_order_column!
Ordering by created_at and making .first, .last, .second, etc. respect it is as simple as:
class ApplicationRecord < ActiveRecord::Base
self.implicit_order_column = :created_at
end
First of all, first and last aren't as simple as you seem to think they are: you're completely neglecting the limit argument that both of those methods support.
Secondly, scope is little more than a fancy way of adding class methods that are intended to return queries. Your scopes are abusing scope because they return single model instances rather than queries. You don't want to use scope at all, you're just trying to replace the first and last class methods so why don't you just override them? You'd need to override them properly though and that will require reading and understanding the Rails source so that you properly mimic what find_nth_with_limit does. You'd want to override second, third, ... and the rest of those silly methods while you're at it.
If you don't feel right about replacing first and last (a good thing IMO), then you could add a default scope to order things as desired:
default_scope -> { order(:created_at) }
Of course, default scopes come with their own set of problems and sneaking things into the ORDER BY like this will probably force you into calling reorder any time you actually want to specify the ORDER BY; remember that multiple calls to order add new ordering conditions, they don't replace one that's already there.
Alternatively, if you're using Rails 6+, you can use Markus's implicit_order_column solution to avoid all the problems that default scopes can cause.
I think you're going about this all wrong. Any time I see M.first I assume that something has been forgotten. Ordering things by id is pretty much useless so you should always manually specify the order you want before using methods like first and last.
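To illustrate why the explicit order matters with random UUIDs, here is a small plain-Ruby sketch with no Rails involved. The Item struct is only a stand-in for the model, and the oldest / newest helpers are hypothetical names that mimic what Item.order(:created_at).first and .last would return.

```ruby
require "securerandom"

# Stand-in for the ActiveRecord model: a v4 UUID id plus a timestamp.
Item = Struct.new(:id, :created_at)

# Five items created one second apart, each with a random UUID.
base_time = Time.at(0)
ITEMS = Array.new(5) { |i| Item.new(SecureRandom.uuid, base_time + i) }

# Roughly what Item.order(:created_at).first would give you.
def oldest(items)
  items.min_by(&:created_at)
end

# Roughly what Item.order(:created_at).last would give you.
def newest(items)
  items.max_by(&:created_at)
end

# What an implicit "ORDER BY id" amounts to: an arbitrary shuffle,
# because v4 UUIDs carry no creation-time information.
BY_ID = ITEMS.sort_by(&:id)

puts "creation order: #{ITEMS.map(&:id).inspect}"
puts "id order:       #{BY_ID.map(&:id).inspect}"
puts "oldest:         #{oldest(ITEMS).id}"
puts "newest:         #{newest(ITEMS).id}"
```

The id order will usually differ from the creation order from run to run, while oldest and newest always pick the same records.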
After replacing id with uuid, I experienced some weirdness in the way associations were allocating foreign keys. It wasn't because of .last and .first, but because I simply forgot to add default: 'gen_random_uuid()' to one of the tables using a uuid. Once I fixed that, the problem was solved.
create_table :appointments, id: :uuid, default: 'gen_random_uuid()' do |t|
Using Rails 2.3.5.
In acts_as_audited, the schema definition defines an index:
add_index :audits, [:auditable_id, :auditable_type], :name => 'auditable_index'
It seems to me that the index should be:
add_index :audits, [:auditable_type, :auditable_id], :name => 'auditable_index'
In general, in a polymorphic association, we might sometimes want to search by the type only, but hardly ever search by the ID without the type?
Or is this a lazy way to allow a search by auditable_id when you are only using the plugin to audit one table?
Or is there another reason to do the indexing this way?
The answer just occurred to me.
Some databases don't support multi-field indexing, and those databases will index only the first field. If you go with my alternate indexing, then you'd get data clustered by class name, on which the database will have to do a sequential scan to find a particular ID. That's bound to be slower than searching for IDs and then checking the class name.
Another reason, I've found, is that SQL optimizers apparently tend not to use an index when the first field in the index is not specific enough.
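As a rough mental model of how column order affects which lookups a composite index can serve, here is a toy sketch in plain Ruby. It treats the index as a sorted array of key tuples and is not how any particular database implements indexes; the Audit struct and both helper methods are made up for illustration.

```ruby
# Stand-in for a row in the audits table.
Audit = Struct.new(:auditable_id, :auditable_type)

ROWS = [
  Audit.new(2, "Post"), Audit.new(1, "User"),
  Audit.new(3, "User"), Audit.new(1, "Post")
].freeze

# A toy composite index: key tuples sorted by the columns in the given order.
def build_index(rows, *columns)
  rows.map { |row| columns.map { |column| row[column] } }.sort
end

# Binary-search on the leading column, then read the contiguous matching run.
# This only works for the first column, because only that one is fully sorted.
def lookup_by_leading_column(index, value)
  start = index.bsearch_index { |key| key.first >= value }
  return [] if start.nil?

  index.drop(start).take_while { |key| key.first == value }
end

ID_FIRST = build_index(ROWS, :auditable_id, :auditable_type)
TYPE_FIRST = build_index(ROWS, :auditable_type, :auditable_id)

p lookup_by_leading_column(ID_FIRST, 1)        # all audits for id 1
p lookup_by_leading_column(TYPE_FIRST, "User") # all audits for type "User"
```

With the id first, entries for one id sit next to each other and the run to check is short; with the type first, a lookup by type alone becomes cheap, but each type covers a much larger slice of the index.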
Let's say you have a model like the following:
class Stock < ActiveRecord::Base
# Positions
BUY = 1
SELL = 2
end
And that class has an integer attribute called 'position' that can hold any of the above values. What is the Rails best practice for converting those integer values into human-readable strings?
a) Use a helper method, but then you're forced to make sure that you keep the helper method and model in sync
def stock_position_to_s(position)
  case position
  when Stock::BUY
    'buy'
  when Stock::SELL
    'sell'
  else
    '' # unknown position
  end
end
b) Create a method in the model, which sort of breaks a clean MVC approach.
class Stock < ActiveRecord::Base
def position_as_string
...snip
end
end
c) A newer way using the new I18N stuff in Rails 2.2?
Just curious what other people are doing when they have an integer column in the database that needs to be output as a user friendly string.
Thanks,
Kenny
Sounds to me like something that belongs in the views as it is a presentation issue.
If it is used widely, then in a helper method for DRY purposes, and use I18N if you need it.
Try out something like this
class Stock < ActiveRecord::Base
  # Label => stored integer value
  @@positions = {"Buy" => 1, "Sell" => 2}
  cattr_reader :positions
  validates_inclusion_of :position, :in => positions.values
end
It lets you save position as an integer, as well as use select helpers easily.
Of course, views are still a problem. You might want to either use helpers or create a position_name method for this purpose:
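class Stock < ActiveRecord::Base
  # Label => stored integer value
  @@positions = {"Buy" => 1, "Sell" => 2}
  cattr_reader :positions
  validates_inclusion_of :position, :in => positions.values

  # Reverse lookup: label for the stored integer
  # (Hash#key; on Ruby 1.8 this was Hash#index)
  def position_name
    positions.key(position)
  end
end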
Is there a good reason for the app to be converting the integer to the human-readable string programmatically?
I would make the positions objects which have a position integer attribute and a name attribute.
Then you can just do
stock.position.name
@HermanD: I think it's a lot better to store the values in an integer column rather than a string column for numerous reasons.
It saves database space.
Easier/faster to index on an integer than a string.
You're not hard-coding a human-readable string as values in a database. (What happens if the client says that "Buy" should become "Purchase"? Now the UI shows "Purchase" everywhere but you need to keep setting "Buy" in the database.)
So, if you store certain values in the database as integers, then at some point you're going to need to show them to the user as strings, and I think the only way you can do that is programmatically.
You could move this info into another object but, IMHO, I'd say this is overkill. You'd then have to add another database table, add another 'admin' section for adding, removing and renaming these values, and so on. Not to mention that if you had several columns in different models that needed this behavior, you'd either have to create lots of these objects (ex: stock_positions, stock_actions, transaction_kinds, etc...) or you'd have to design it generically enough to use polymorphic associations. Finally, if the position name is hard-coded, then you lose the ability to easily localize it at a later date.
@frankodwyer: I'd have to agree that using a helper method is probably the best way to go. I was hoping there might be a "slicker" way to do this, but it doesn't look like it. For now, I think the best method is to create a new helper module, maybe something like StringsHelper, and stuff a bunch of methods in there for converting model constants to strings. That way I can use all the I18N stuff in the helper to pull out the localized string if I need to in the future. The annoying part is that if someone needs to add a new value to the model's column, then they will also have to add a check for that in the helper. Not 100% DRY, but I guess "close enough"...
Thanks to both of you for the input.
Kenny
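For what it's worth, the "keep the helper and model in sync" annoyance can be softened by having the model expose only a value-to-key map and letting the helper look the label up from it. This is just a sketch in plain Ruby under a few assumptions: Stock here is a bare class rather than the ActiveRecord model, POSITION_KEYS is a name I made up, and the TRANSLATIONS hash is a stand-in for locale files (in a Rails app the lookup would go through I18n.t instead).

```ruby
class Stock
  # Positions
  BUY = 1
  SELL = 2

  # Integer value => translation key. No display strings live in the model.
  POSITION_KEYS = { BUY => :buy, SELL => :sell }.freeze
end

# Hypothetical stand-in for locale files.
TRANSLATIONS = {
  en: { buy: "Buy", sell: "Sell" },
  de: { buy: "Kaufen", sell: "Verkaufen" }
}.freeze

# Helper: converts a stored integer into a localized label.
# Returns an empty string for values the model does not know about.
def stock_position_to_s(position, locale = :en)
  key = Stock::POSITION_KEYS[position]
  return "" if key.nil?

  TRANSLATIONS.fetch(locale).fetch(key)
end

puts stock_position_to_s(Stock::BUY)
puts stock_position_to_s(Stock::SELL, :de)
```

Adding a new position then means one new constant, one new entry in the map, and one translation per locale; the helper itself does not change.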
Why not use the properties of a native data structure? example:
class Stock < ActiveRecord::Base
ACTIONS = [nil,'buy','sell']
end
Then you could grab them using Stock::ACTIONS[1] #=> 'buy' or Stock::ACTIONS[2] #=> 'sell'
or, you could use a hash {:buy => 1, :sell => 2} and access it as Stock::ACTIONS[:buy] #=> 1
you get the idea.
@Derek P. That's the implementation I first went with, and while it definitely works, it sort of breaks the MVC metaphor because the model now has view-related info defined in its class. Strings in controllers are one thing, but strings in models (in my opinion) are definitely against the spirit of clean MVC.
It also doesn't really work if you want to start localizing, so while it was the method I originally used, I don't think it's the method for future development (and definitely not in an I18N world.)
Thanks for the input though.
Sincerely,
Kenny
I wrote a plugin that may help a while ago. See this. It lets you define lists and gives you nice methods ending in _str for display purposes.