Rails Indexing has_many_through, how to do it? - ruby-on-rails

My app will have quite large join tables. From googling it seems that indexing is the answer to my prayers, though I haven't really found anything that explains HOW to technically do it for rails.
For simplicity I'm using Parent, ParentChild (join) and Child. The ParentChild table is the one that is growing quite big, and searching through the whole of it is starting to take some time.
Which of my migration files do I need to edit in order to get the indexing done, how to do it for this example? AND is editing the migration file the only thing I'll have to do?

You use add_index from any migration you want. So, for example, if you want to add an index on both parent_id and child_id, you'd do this:
add_index :parent_child, [:parent_id, :child_id]
Which columns you index and whether they're in the same index or separate ones depend on your queries.

Related

Does normal indexing also work by creating unique index?

Hi guys, I would like to know: if I create a unique index on two columns in PostgreSQL, does that index also work as a normal index for each of the two columns, or do I have to create one unique index plus two more indexes, one per column, as shown in the code? I want a unique index on talent_id, job_id, and both columns should also be indexed separately. I have read many resources but did not find a clear answer.
add_index :talent_actions, [:talent_id, :job_id], unique: true
Does the code above also cover the indexes below, or do I have to add them separately?
add_index :talent_actions, :talent_id
add_index :talent_actions, :job_id
Thank you.
An index is an object in the database, which can be used to look up data faster, if the query planner decides it will be appropriate. So the trivial answer to your question is "no", creating one index will not result in the same structures in the database as creating three different indexes.
I think what you actually want to know is this:
Do I need all three indexes, or will the unique index already optimise all queries?
This, as with any database optimisation, depends on the queries you run, and the data you have.
Here are some considerations:
The order of columns in a multi-column index matters. If you have an index of people sorted by surname then first name, then you can use it to search for everybody with the same surname; but you probably can't use it to search for somebody when you only know their first name.
Data distribution matters. If everyone in your list has either the surname "Smith" or the surname "Jones", then you can use a surname-first index to search for a first name fairly easily (just look up under Jones, then under Smith).
Index size matters. The fewer columns an index has, the more of it fits in memory at once, so the faster it will be to use.
Often, there are multiple indexes the query planner could use, and its job is to estimate the cost of the above factors for the query you've written.
Usually, it doesn't hurt to create multiple indexes which you think might help, but it does use up disk space, and occasionally can cause the query planner to pick a worse plan. So the best approach is always to populate a database with some real data, and look at the query plans for some real queries.
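To make the column-order consideration above concrete, here is a minimal plain-Ruby sketch. It is not how PostgreSQL stores an index; it only models a two-column index as an array sorted by surname, then first name. The names Person, PEOPLE, NAME_INDEX, by_surname and by_first_name are made up for this illustration.

```ruby
# Toy model of a multi-column index: keys sorted by (surname, first_name).
# This illustrates ordering only; a real database index is a B-tree on disk.
Person = Struct.new(:surname, :first_name)

PEOPLE = [
  Person.new("Smith", "Alice"),
  Person.new("Jones", "Bob"),
  Person.new("Smith", "Carol"),
  Person.new("Jones", "Alice"),
  Person.new("Brown", "Dave")
].freeze

# The "index" on [surname, first_name]: a sorted array of key pairs.
NAME_INDEX = PEOPLE.map { |p| [p.surname, p.first_name] }.sort.freeze

# Leading column known: binary-search to the first match, then read forward.
def by_surname(surname)
  start = NAME_INDEX.bsearch_index { |(s, _)| s >= surname }
  return [] if start.nil?

  NAME_INDEX.drop(start).take_while { |(s, _)| s == surname }
end

# Only the second column known: the sort order does not help,
# so every entry has to be checked.
def by_first_name(first_name)
  NAME_INDEX.select { |(_, f)| f == first_name }
end
```

Calling by_surname("Jones") can stop as soon as the surname changes, while by_first_name("Alice") has to walk the whole array. That is roughly why a query on talent_id alone can use an index on [talent_id, job_id], but a query on job_id alone may not benefit from it.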

Rails: Foreign Keys and when to index

Three questions:
When you create a model, is a foreign_key automatically created as well?
I'm thinking I should add_index when a column is unique to the table or unique in general, or when the column will relate to other databases, Amirite?
What will an index look like? Will it just use the contents of the cell?
1) Do you mean when using a generator? Generally speaking, you should generate migrations, rather than use a generator for the whole model/scaffolding. And then, no, a foreign key is not automatically created, only if you specify it.
2) add_index is going to come in handy for columns on big tables that need to be accessed quickly by your database. Let's say you've got a users table with an email column that must be unique, but isn't indexed. And your service grows, now you have millions of users, and you need to go User.find_by_email "someone@example.com". Without an index, that's going to take you a while. With an index, it'll be quick. That's when an index comes in handy.
3) Really depends on your database engine afaik. Not something that will affect your day-to-day imho (though if you have a specific database engine in mind, you can certainly find out). Here's the info on MySQL, straight from the source: https://dev.mysql.com/doc/refman/5.5/en/column-indexes.html

Rails migrations indexing

What is the difference between these two methods of adding indexes:
add_index :juice_ingredients, %i(juice_id ingredient_id)
and:
add_index :juice_ingredients, :juice_id
add_index :juice_ingredients, :ingredient_id
Moreover, do I need to explicitly create join table or just add_reference is enough?
The first will create a single index on two columns. The second will create two indexes, each on their own column. Which is better depends on your application, the database, and the queries you run. To figure out which you need to read up on "query optimization" for your database.
The difference is that the first statement creates a multi-column index (also called composite index), the second creates two single-column indexes.
Both versions result in the columns :juice_id and :ingredient_id being indexed at the database level. However, they behave a little bit differently.
In order to better understand how, you need to have some knowledge of what is a database index and what you can use it for.
Composite indexes are recommended if you are likely to query the database using both columns in the same query. Moreover, you can use an index with a unique constraint to make sure that no duplicate records are created for a specific key, or combination of keys.
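As a rough illustration of the difference, the following plain-Ruby sketch models both setups with hashes. It is only an analogy: real databases typically use B-trees, and the planner may combine or ignore indexes in ways this toy does not show. JuiceIngredient, JUICE_ROWS and the lookup helpers are hypothetical names.

```ruby
JuiceIngredient = Struct.new(:juice_id, :ingredient_id)

JUICE_ROWS = [
  JuiceIngredient.new(1, 10),
  JuiceIngredient.new(1, 11),
  JuiceIngredient.new(2, 10),
  JuiceIngredient.new(2, 12)
].freeze

# Two single-column "indexes": each maps one column value to its rows.
BY_JUICE      = JUICE_ROWS.group_by(&:juice_id).freeze
BY_INGREDIENT = JUICE_ROWS.group_by(&:ingredient_id).freeze

# One composite "index": keyed on both columns together.
BY_PAIR = JUICE_ROWS.group_by { |r| [r.juice_id, r.ingredient_id] }.freeze

# Single-column indexes: fetch candidates from each, then intersect them.
def find_with_single_indexes(juice_id, ingredient_id)
  BY_JUICE.fetch(juice_id, []) & BY_INGREDIENT.fetch(ingredient_id, [])
end

# Composite index: a single lookup on the combined key.
def find_with_composite_index(juice_id, ingredient_id)
  BY_PAIR.fetch([juice_id, ingredient_id], [])
end
```

Both helpers return the same rows, but the composite lookup does less work when the query always supplies both columns.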
Here are some additional articles:
http://www.postgresql.org/docs/8.2/static/indexes-multicolumn.html
https://devcenter.heroku.com/articles/postgresql-indexes#multi-column-indexes

What's add_index used for in this code and more generally?

I am trying to get my head around this code... it's from the Rails Tutorial Book and is part of the process of making a twitter like application.
class CreateRelationships < ActiveRecord::Migration
  def change
    create_table :relationships do |t|
      t.integer :follower_id
      t.integer :followed_id
      t.timestamps
    end
    add_index :relationships, :follower_id
    add_index :relationships, :followed_id
    add_index :relationships, [:follower_id, :followed_id], unique: true
  end
end
Since there are only 2 columns (follower_id and followed_id), why
would there be a need for an index?
Does the index sort them in some way? It just seems a bit strange to
me to add an index to a table with 2 columns.
What does the index do to the rows?
Is indexing optional? If so why/why not use it? Is it a good idea to use it in the code above?
Please answer all the questions if you can. I'm just trying to get my head around this concept and after reading about it I have these questions.
Since there are only 2 columns (follower_id and followed_id) why would
there be a need for an index?
The need for indexing doesn't depend on the number of columns. An index is used for speeding up lookups. Even if your table has only one column, verifying whether a particular value is present in that column requires scanning the whole table in the worst case. With an index, the database can usually answer without reading every row.
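A small plain-Ruby sketch of that idea, using the relationships table from the question. A Hash stands in for the index here; that is an assumption made for brevity, since most databases default to B-tree indexes. Relationship, RELATIONSHIP_ROWS and the two lookup methods are illustrative names only.

```ruby
Relationship = Struct.new(:follower_id, :followed_id)

RELATIONSHIP_ROWS = [
  Relationship.new(1, 2),
  Relationship.new(1, 3),
  Relationship.new(2, 3),
  Relationship.new(3, 1)
].freeze

# Without an index: inspect every row (a full table scan).
def followed_ids_by_scan(follower_id)
  RELATIONSHIP_ROWS.select { |r| r.follower_id == follower_id }.map(&:followed_id)
end

# Stand-in for an index on follower_id: built once, maps a value to its rows.
FOLLOWER_INDEX = RELATIONSHIP_ROWS.group_by(&:follower_id).freeze

# With the index: go straight to the matching rows.
def followed_ids_by_index(follower_id)
  FOLLOWER_INDEX.fetch(follower_id, []).map(&:followed_id)
end
```

Both methods return the same ids; the difference is that the scan touches all rows on every call, which is what starts to hurt once the table is large.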
Does the index sort them in some way? It just seems a bit strange to
me to add an index to a table with 2 columns?
No, in general indexes don't sort the data in the table in any way. I say "in general" because clustered indices do sort the data. See this question for more details.
What does the index do to the rows?
Again, nothing in general. Different DBMSes use different mechanisms to associate a row in the table to the index. Indexing is one of the most important tasks in a DBA's work. It'd be great if you have basic ideas about it. Read the wikipedia article to get the basics.
Is indexing optional? If so why/why not use it?
Yes, indexing is optional. You should use indexes when you see your query performance go down. Again, different DBMSes provide different mechanisms for you to monitor your query performance and you should have monitoring in place to alert you when performance degrades beyond a threshold. With experience, you'll reach a point where most of the indexing needs of an application will be clear to you from the beginning.
Is it a good idea to use it in the code above?
Can't comment on that. Indexing needs of each application are different. You should be aware of downsides of over-indexing as well. If you have a lot of indexes, your updates, inserts and deletes will become slower with time since they will also need to update your indexes.
An index on a column or set of columns speeds lookups on that column or set of columns. Its usefulness has nothing to do with the number of columns in the table since its purpose is to locate the row(s) associated with the column values.
No, the index doesn't sort the table.
The index doesn't "do" anything to the rows, although if it's a "unique" index it would prevent the creation/update of rows which duplicate the column(s) in question.
Indexing is optional. It speeds up lookups, but takes additional time for write operations. Whether or not it is a good idea depends entirely on the application.
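The behaviour of the unique index on [follower_id, followed_id] can be sketched in plain Ruby as follows. This is a toy model under stated assumptions, not Rails or database code: RelationshipTable and DuplicateKeyError are hypothetical names, and a Set stands in for the index.

```ruby
require "set"

# Hypothetical error class for this sketch.
class DuplicateKeyError < StandardError; end

# Toy table with a unique index on [follower_id, followed_id].
class RelationshipTable
  attr_reader :rows

  def initialize
    @rows = []
    @unique_index = Set.new # holds [follower_id, followed_id] pairs
  end

  def insert(follower_id, followed_id)
    key = [follower_id, followed_id]
    # Set#add? returns nil when the key is already present,
    # so a duplicate pair is rejected before the row is stored.
    raise DuplicateKeyError, "duplicate pair #{key.inspect}" unless @unique_index.add?(key)

    @rows << key
    self
  end
end
```

Note that every insert also has to update the index, which is the extra write cost mentioned above.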

Improve performance of Rails application by adding indexes?

In my Rails application I have an index view that lists all of my projects.
This list can be sorted by clicking on any of the table column headers, e.g. Date, Name, updated_at etc. This happens by appending a &sort= GET parameter to the URL.
My question is: From a performance point-of-view, would it be advisable to add indexes to these columns in my database?
This is what a migration might look like:
class AddMoreIndexes < ActiveRecord::Migration
  def change
    add_index :projects, :date
    add_index :projects, :name
    add_index :projects, :updated_at
  end
end
Will I get any performance gains from this?
Indexes can be used to speed an order-by, but if you were identifying a subset of rows to display then an index that is helpful for that is likely to be chosen in preference. You'd need composite indexes in such a situation.
There're a couple of other problems.
Firstly, ordering on an indexed string value may require a linguistically sorted index, not the regular ASCII/Binary sort, so multilingual applications may not be helped at all.
Secondly, it can discourage normalisation of the database because you really need the display values to be in the table you're selecting.
You might like to look at using another method for the sort. I've been very happy with using Google visualisation tables, which come with JQuery sorting built in.
Depending on how you query your database, then yes, it will give you performance gains. For example, whenever I add a foreign key to a table, I immediately index by it. Why? I know queries will be running through it in my application. If not, I wouldn't have put a foreign key. In this way, especially when you accumulate a large amount of data in your database, it will definitely give performance gains (sometimes, by an incredible amount). If you plan to query your database by date, name, or updated at, then yes, it could potentially be a performance gain depending on your query. Otherwise, there really is no point.
Note, you wouldn't want to add an index for every column. Having necessary indices will help you, but if you have an index for every column, then you run the risk of confusing the SQL Query Optimizer and actually hindering your performance.
My suggestion: Add an index for every foreign key you have in your table, but if you're also running some heavy queries with other columns, then add an index there too.
