Rails acts_as_audited - why does it index by ID first?

Using Rails 2.3.5.
In acts_as_audited, the schema definition defines an index:
add_index :audits, [:auditable_id, :auditable_type], :name => 'auditable_index'
It seems to me that the index should be:
add_index :audits, [:auditable_type, :auditable_id], :name => 'auditable_index'
In general, with a polymorphic association, we might sometimes want to search by the type alone, but we would hardly ever search by the ID without the type.
Or is this a lazy way to allow a search by auditable_id when you are only using the plugin to audit one table?
Or is there another reason to do the indexing this way?

The answer just occurred to me.
Some databases don't support multi-column indexes, and those databases will index only the first column. With my alternate ordering, the data would be clustered by class name, and the database would have to do a sequential scan within that class to find a particular ID. That's bound to be slower than searching for the ID and then checking the class name.
Another reason I've found is that SQL optimizers tend not to use an index when its first column is not selective enough.
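To make the leading-column argument concrete, here is a minimal pure-Ruby sketch. It is not how any particular database implements indexes: it simply models a composite index as an array of keys sorted in column order, and the sample rows and the helper names build_composite_index and prefix_lookup are made up for illustration.

```ruby
# Model a composite index as the list of keys, sorted in column order.
# `order` is :id_first for [auditable_id, auditable_type],
# anything else for [auditable_type, auditable_id].
def build_composite_index(rows, order)
  rows.map { |type, id| order == :id_first ? [id, type] : [type, id] }.sort
end

# Binary-search the sorted keys for the first entry whose leading column
# is >= value, then collect the run of entries that match it exactly.
def prefix_lookup(index, value)
  start = index.bsearch_index { |key| (key.first <=> value) >= 0 }
  return [] if start.nil?
  index.drop(start).take_while { |key| key.first == value }
end

# Hypothetical audited records as [auditable_type, auditable_id] pairs.
rows = [["Post", 1], ["Post", 2], ["User", 1], ["User", 2], ["Comment", 2]]

id_first   = build_composite_index(rows, :id_first)
type_first = build_composite_index(rows, :type_first)

p prefix_lookup(id_first, 2)        # every audited record with ID 2
p prefix_lookup(type_first, "Post") # every audited Post
```

In this model only the leading column can be binary-searched on its own. With the ID first, a lookup by ID narrows to a handful of keys that are then checked by type; with the type first, a lookup by type returns every record of that class, and a lookup by ID alone would have to scan the whole array.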

Related

Rails and Postgresql, confused about building indexes

I started to use pg_search in my rails project and I'm confused about building indexes.
I have a model with two attributes, where I do a full-text search: title and description. Currently, I have this setup:
class Argumentation < ApplicationRecord
  include PgSearch
  pg_search_scope :searchfor, :against => [:title, :description]
end
This works, but the query is slow. It needs to be faster, and my first thought was to add an index.
There is a wiki page dedicated to building indexes for full-text search: https://github.com/Casecommons/pg_search/wiki/Building-indexes
I want to follow the guide, but some things confuse me:
1. The guide tells me to add a "tsvector" column to the table:
add_column :posts, :tsv_body, :tsvector
But isn't this line of code adding two columns: tsv_body and tsvector? Or is tsv_body a placeholder, for example description or title in my case?
After that, the index should be added to tsv_body like this:
add_index(:posts, :tsv_body, using: 'gin')
Then, the guide talks about triggers. Why are triggers needed? If you add an index to a string attribute, triggers aren't needed (if I'm not mistaken).
In the trigger itself, there is "pg_catalog.english".
create_trigger(compatibility: 1).on(:posts).before(:insert, :update) do
  "new.tsv_body := to_tsvector('pg_catalog.english', coalesce(new.body,''));"
end
Since my attributes contain many languages, not just English, I wonder: is it even possible to add an index if there are multiple languages?
0) You should ask separate questions as separate questions ;)
1) add_column :posts, :tsv_body, :tsvector
isn't this line of code adding two columns: tsv_body and tsvector?
No. add_column takes several arguments and adds a single column: the first argument is the table name, the second is the name of the new column, and the third is the column type (e.g. string, integer, or in this case tsvector).
2) Why are triggers needed?
A trigger watches for something to happen in your database and then does something in response. Here the tsv_body column is derived from body, so something has to recompute it whenever body changes. The trigger does that before every insert and update, and the database then updates the GIN index on tsv_body automatically. This is better than periodically rebuilding the column over the entire table, which is potentially huge, takes a long time, and "updates" rows that didn't actually change. An index on a plain string column needs no trigger because there is no derived column to keep in sync.
3) Is it even possible to add an index, if there are multiple languages?
Yes. I'm not highly familiar with the details, but the fact that you pass the language name to to_tsvector indicates that you can build a tsvector, and so an index, for other languages too.
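The job the trigger does in point 2 can be sketched in plain Ruby. This is only an analogy, assuming a post is a simple hash: fake_tsvector is a crude, made-up stand-in for PostgreSQL's to_tsvector (no stemming, no stop words, no language configuration), and save_post plays the role of the BEFORE INSERT OR UPDATE trigger.

```ruby
require "set"

# Crude stand-in for to_tsvector: lowercase the text, split it into
# words, and keep each distinct word once. nil is treated as "",
# like coalesce(new.body, '') in the trigger.
def fake_tsvector(text)
  (text || "").downcase.scan(/[a-z0-9]+/).to_set
end

# Plays the role of the trigger: recompute the derived :tsv_body value
# from :body just before the row is written to the table.
def save_post(table, post)
  post[:tsv_body] = fake_tsvector(post[:body])
  table << post unless table.any? { |row| row.equal?(post) }
  post
end

posts = []
post = save_post(posts, { body: "Rails and PostgreSQL" })
p post[:tsv_body]

post[:body] = "Building indexes"
save_post(posts, post)
p post[:tsv_body]
```

The point of the sketch is that the derived value is refreshed on every write of that one row, so it never has to be rebuilt for the whole table.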

Why does rails use id as primary key and should I index over other columns?

I understand Ruby on Rails uses id as the primary key attribute for all its tables. I understand it's convenient since the id is always distinct, auto increments, and is easy to index on (I think), but I'm wondering if I should build an index over other attributes, and if that is possible.
I have a web app where users can upload images into portfolios, so granted I have User, Portfolio, and Image, and the Image table has two columns that are:
:user_id
:portfolio_id
So sometimes I may want to pick all images belonging to a certain user, all images in a certain portfolio, all images in a range of portfolios, etc.
Should I make Rails build an index over attributes I search on frequently? If so, is there a way to do that, and are there any drawbacks?
I remember reading some time ago that there are gems to make Rails include other fields as part of the primary key, but what if I just want to build an index over them without including them in the primary key?
Yes, making indexes on columns you use for search frequently is a good idea and encouraged.
In your working directory:
rails generate migration add_indexes_to_images
This will generate a migration file, which you can then edit to add your indexes:
class AddIndexesToImages < ActiveRecord::Migration
  def change
    add_index :images, :user_id
    add_index :images, :portfolio_id
  end
end
Read more about migrations and all the things they can do.
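On the question of drawbacks, the usual trade-off is that every index speeds up reads on its column but adds work to every write. The following pure-Ruby sketch illustrates that under heavy simplification: each index is modelled as a hash from column value to rows, and the helper names and sample images are hypothetical, not anything Rails or the database provides.

```ruby
# Hash-based stand-in for a single-column index: value => matching rows.
def build_column_index(rows, column)
  rows.group_by { |row| row[column] }
end

# Without an index: inspect every row (a sequential scan).
def scan(rows, column, value)
  rows.select { |row| row[column] == value }
end

# With an index: jump straight to the matching rows.
def lookup(index, value)
  index.fetch(value, [])
end

# Inserting a row must also update every index on the table.
# Returns the number of index entries written for this one row.
def insert_row(rows, indexes, row)
  rows << row
  indexes.each { |column, index| (index[row[column]] ||= []) << row }
  indexes.size
end

images = [
  { id: 1, user_id: 1, portfolio_id: 10 },
  { id: 2, user_id: 2, portfolio_id: 10 },
  { id: 3, user_id: 1, portfolio_id: 11 }
]
indexes = {
  user_id: build_column_index(images, :user_id),
  portfolio_id: build_column_index(images, :portfolio_id)
}

p lookup(indexes[:user_id], 1).map { |row| row[:id] }
p insert_row(images, indexes, { id: 4, user_id: 2, portfolio_id: 11 })
```

With two indexes, each insert writes two extra index entries, and each index also takes up space. That cost is normally well worth paying for columns you filter on often, such as foreign keys.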

What's the correct way to use Thinking Sphinx scope

I am trying to create a TS scope as follows:
include ThinkingSphinx::Scopes
sphinx_scope(:status_approved) {
  {:conditions => {:status => "approved"}}
}
default_sphinx_scope :status_approved
My index file is:
indexes name, status
has user_id, created_at
Two questions:
Does the status field need to be defined as an index field for the conditions string filter to work? I get a "no field in schema" query error if I don't do that.
If it must be defined as part of the index, then whenever someone updates the status field, the change will not show up in results until the next reindex. Is this the only way to use a string-filtering TS scope, or is there a better way?
I'm using Rails 3.2.16 and TS 3.0.6
Sphinx scopes are for your Sphinx searches, so yes, you'll need status as a field in your index definition if you're using it in a Sphinx scope.
If you want your Sphinx index data up-to-date as quickly as possible, you may want to investigate using delta indices (perhaps using Delayed Job or Sidekiq, to remove it from your web processes). Or, you could use Sphinx's real-time indices instead.
To answer my own questions after further research:
I searched the documentation but did not find any specific reference to this, though I may have missed it. By trial and error, however, I found that to use the string filter method I have to add the field to the index. If I don't, it complains that the field is not found in the schema.
There is a 'new' feature in TS called real-time indices which apparently addresses this issue. It was mentioned on the author's blog, http://freelancing-gods.com/, but I have not tried it. In any case, I took a different route: I did away with the default scope and filter in the index definition instead. My index file now has:
indexes name, status
has user_id, created_at
where "status = 'approved'"
And I no longer need to define the default scope in my model as such. This will still require periodic reindexing in any case.

Placing the foreign key in rails model belongs_to association

I am a beginner with the Rails framework and have a fundamental question.
I am trying to define some models and their associations, following the popular Rails guides. My associations look like this:
class Person < ActiveRecord::Base
  has_one :head
end
class Head < ActiveRecord::Base
  belongs_to :person
end
Here, I need to add the foreign key (Person's primary key) to the 'head' table.
Now, if I need to get the 'head' of a 'person', Rails needs to scan through the head table and match the person_id.
The straightforward way, I would think, is to add the foreign key to the 'person' table. Then I could refer directly to the 'head' from the 'person' by its ID.
It appears that rails convention is not performance friendly. Am I missing something here?
When you create the migration to add the column containing the foreign key, it is highly recommended to add an index on this column. This way, the database can find the Head from the person_id efficiently (as efficiently as a search by its id).
add_index :heads, :person_id
If it's a one-to-one association, you can even add the unique option (unless your application accepts conjoined twins :-) ):
add_index :heads, :person_id, :unique => true
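What the unique index buys you can be illustrated with a small pure-Ruby analogy. It assumes a hash keyed by person_id stands in for the index; add_head and head_for are made-up helper names, and a real database enforces this inside the storage engine rather than in application code.

```ruby
# Stand-in for a unique index on heads.person_id: maps person_id => head row.
# Adding a second head for the same person is rejected, as the database would.
def add_head(unique_index, head)
  person_id = head[:person_id]
  if unique_index.key?(person_id)
    raise ArgumentError, "person #{person_id} already has a head"
  end
  unique_index[person_id] = head
end

# Finding a person's head is then a single keyed lookup, not a table scan.
def head_for(unique_index, person_id)
  unique_index[person_id]
end

heads_by_person = {}
add_head(heads_by_person, { id: 10, person_id: 1 })
add_head(heads_by_person, { id: 11, person_id: 2 })
p head_for(heads_by_person, 2)
```

So the foreign key can stay on the heads table, as the Rails convention has it, without forcing a scan of that table for each lookup.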
I suggest you have a look at these two articles on where to use indexes:
http://tomafro.net/2009/08/using-indexes-in-rails-index-your-associations
http://tomafro.net/2009/08/using-indexes-in-rails-choosing-additional-indexes
Re: It appears that rails convention is not performance friendly. Am I missing something here?
Yes. Rails is designed for production sites, so it includes many performance features. As @Baldrick says, the answer to your specific concern is to always add an index for foreign key fields.
Adding an index for each foreign key field is needed, for performance reasons, for any SQL dbms application, no matter the language. Note that the index is added to the database, not to the MVC layers (Rails).
Rails itself includes additional performance features, including SQL result caching, optional fragment and page caching, and more.
Rails 3.2 adds slow-query reporting: it can automatically run EXPLAIN on queries that take longer than a configurable threshold (config.active_record.auto_explain_threshold_in_seconds), so you can see which queries are slow and focus on fixing them as appropriate.

What does a db table created by the Rails framework look like?

I don't have a Rails environment set up and this is actually quite hard to find a quick answer for, so I'll ask the experts.
When Rails creates a table based on your "model" that you have set up, does Rails create a table that mirrors this model exactly, or does it add in more fields to the table to help it work its magic? If so, what other fields does it add and why? Perhaps you could cut and paste the table structure, or simply point me to a doc or tutorial section that addresses this.
If you're building a completely new application, including a new database, then you can build the whole back end with migrations. Running
ruby script/generate model User name:string
produces both a user.rb file for the model and a migration:
class CreateUsers < ActiveRecord::Migration
  def self.up
    create_table :users do |t|
      t.string :name
      t.timestamps
    end
  end
  def self.down
    drop_table :users
  end
end
You can see that by default the generate script adds "timestamps" (created-at and last-updated columns), and they're managed automatically as long as they remain present.
Not visible, but important, is that an extra column, "id", is created to be the single primary key. It's not compulsory, though: you can specify your own primary key in the model, which is useful if you're working with a legacy schema. Assuming you retain id as the key, Rails will use whatever RDBMS-specific features are available for generating new key values.
In ActiveRecord, models are created from database tables, not the other way around.
You may also want to look into Migrations, which is a way of describing and creating the database from Ruby code. However, the migration is not related to the model; the model is still created at runtime based on the shape of the database.
There are screencasts related to ActiveRecord and Migrations on the Rails site: http://www.rubyonrails.org/screencasts
Here's the official documentation for ActiveRecord. It agrees with Brad. You might have seen either a different access method or a migration (which alters the tables and thus the model)
I have had a little experience moving legacy databases into Rails and accessing Rails databases from outside scripts. That sounds like what you're trying to do. My experience is in Rails databases built on top of MySQL, so your mileage may vary.
The one hidden field is the most obvious --- the "id" field (an integer) that Rails uses as its default primary key. Unless you specify otherwise, each model in Rails has an "id" field that is a unique, incremented integer primary key. This "id" field will appear automatically in any model generated within Rails through a migration, unless you tell Rails not to do so (by specifying a different field to be the primary key). If you work with Rails databases from outside Rails itself, you should be careful about this value.
The "id" field is a key part of the Rails magic because it is used to define Rails' associations. Say you relate two tables together --- Group and Person. The Group model will have an "id" field, and the Person model should have both its own "id" field and a "group_id" field for the relationship. The value in "group_id" will refer back to the unique id of the associated Group. If you have built your models in a way that follows those conventions of Rails, you can take advantage of Rails' associations by saying that the Group model "has_many :people" and that the Person model "belongs_to :group".
Rails migrations also, by default, want to add "created_at" and "updated_at" fields (the so-called "timestamps"), which are datetime fields. These are filled in by ActiveRecord itself, not by the database, whenever a record is created or modified through Rails. So if you write to these tables from outside Rails, you will need to set those two columns yourself.
