Tags:
Programming
Ruby on Rails
Here’s another article about performance in Ruby on Rails apps. This one has information that is brutally obvious to anyone with database experience, but if you’re more of a web developer than a database admin, then this article may be insightful for you. Check out my first article, Using :select in Rails for Better Performance, for other tips.
I have looked at a lot of other people’s Rails code as I was learning, including their database migrations, and have rarely seen indexes being created with the tables. For those who aren’t using database migrations, take the time to learn about them and you’ll never go back to writing CREATE TABLE statements yourself again! Seriously, migrations might be my favorite feature in Rails! Anyway, a table without an index is not much of a table at all. Querying a table without an index is a performance problem waiting to happen. What’s An Index? In a database, you have a few different kinds of objects. You’ve got tables, which are just big containers in which you put your data. The data is not sorted, so if you want to find something in a table, you need to look through everything until you find what you want. A Cheeky Example Here’s the best example: a phone book. The phone book contains a lot of data, but we know how to find phone numbers very quickly because there is an index on the phone numbers. The phone book is indexed by last name, first name, and address, in that order. So when you want to find Cheeky McMonkey’s phone number, you flip right to the part of the phone book that has last names starting with M, then find Mc, and so on. Now imagine if the phone book did not have an index. How would you find Cheeky McMonkey’s phone number? You would just start on the first page and look at all the names until you found Cheeky, one row at a time, one page at a time. A few days later, you might have found his phone number. Her phone number? Cheeky? No, it’s a him. His phone number. I should call him more often. Cheeky in Rails Let’s do the Rails database migration for a phone book with its index. We’ll create it with this command: ruby script/generate migration PhoneBook And here’s what the database migration could look like: class PhoneBook < ActiveRecord::Migration Note that the order of the column names in the add_index statement is very important! This index will be used to find rows in the phone_book table when you do PhoneBook.find and your :conditions option specifies the last_name option (and possibly more). The last_name column is the most important column in the index. If a query (find) does not include last_name in the :conditions, then the index will not be used! The entire table will be searched instead. The index is sorted, so the database can look at it just like we look at a phone book. It flips to the correct pages of the index just like we do when we flip open a phone book. When it finds the terms that match your :conditions, it can then look up the data (the phone_number) in the phone_book table. Here are some find calls that will use the index and perform very quickly no matter how many millions of rows are in the phone_book table: PhoneBook.find(:all, Even if your :conditions don’t include the entire column list of the index, the database can still use the index. If you limit your search with last_name, then the index can be used. Here are queries that cannot use the index, and will perform very poorly: PhoneBook.find(:all, Going back to our phone book analogy, it’s easy to understand why these queries would perform horribly. If you knew a phone number but wanted to know who it belonged to, the phone book won’t be able to help you very well unless you have a lot of time on your hands. One Index Or Two? So, what if you wanted to do reverse look-ups like that? If your web site needs to find people’s names based on their phone numbers, then you can just create another index on the table: add_index :phone_book, :phone_number Done. You can create as many indexes as you need on a table. It’s like having multiple copies of the same phone book, but each of them sorted differently. Note however that there is no advantage in having indexes with redundant column lists. For example, these indexes are redundant: add_index :phone_book, [:last_name, :first_name, :address] Only the first index is needed. The last two are subsets of the first index, so they serve no purpose. The database might give you an error or warning if you try to create them, I’m not sure. It depends on which database product you are using. What Kind of “Good Performance” Does This Give Me? By this point, you should imagine if you had multiple copies of the same phone book in your house. It’s great to have phone numbers indexed in different ways, who wouldn't enjoy that? But they sure take up a lot of space on the coffee table. And when phone numbers get added, changed, and removed, you’ve got to replace all your phone books! This is true in a database too. Having indexes improves query performance, but can hurt other kinds of database operations: inserts, updates, and deletes. If your app changes the data in your database frequently, then you need to consider the performance penalties of indexes on those operations. When table data changes, all the indexes need to be updated. You’ll have to find a balance between query performance and transaction performance. Some testing and measurements may be needed. Once you create a fifth or sixth index on a table, you should really stop and think about it. What kinds of :conditions are you using in your queries? Can they be changed to use existing indexes? If your database is almost entirely used for queries, not transactions, then have no fear. Indexes are what you want. In practice, indexes will have negligible impact on create, read, and delete operation performance. But I would be remiss if I didn't mention this point. How Do I Know Which Indexes I Need? To figure out which indexes you should create, I suggest you look at all the find calls that you make in your Rails code. The :conditions you use tell you which indexes you need. Here’s an example of two separate queries: @article_list = Article.find(:all, Both of those queries will benefit from an index on created_at (first) and comments_count (second): add_index :articles, [:created_at, :comments_count] It’s that easy! This is why developers and database admins need to talk to each other. Database admins won't know which indexes to create if they don't know what queries the developers are using. Final Cheeky Thoughts If your Rails app is suffering from performance problems, the solution might be as easy as adding a missing index. With the right index, any query should run in a fraction of a second. If you measure your database performance in seconds rather than milliseconds, then it might be time to look at all your queries and indexes to see if you’re missing something. Also, the on-disk size of an index depends on the columns of that index. So, if you’ve got a string column that has a length measured in hundreds or thousands, than you should probably not include it in an index. Your VARCHAR(32000) columns are not the best candidates for indexes. Use them only if you have a very good reason. Well, that was a long one. I hope this helps some Rails folks develop the fastest web sites on the Internet! Please let me know if you have any questions or additions for this article. If you found this post useful, why not bookmark my Ruby On Rails content listing? That page will be updated whenever I tag an article with "Ruby on Rails". I've got another article about database transactions in mind, which is another infrequently used feature of Rails that I think everyone should be using. Anyway, don't get me started. I'll stop now. |
About Me
![]()
![]() ![]() |
Why are you saying that this query:
PhoneBook.find(:all, :conditions => [’first_name = ? AND address = ?’,
and this
PhoneBook.find(:all, :conditions => [’address = ?’, ‘421 Monkey Banana Crescent’])
will perform very poorly? If the address is indexed even it's the third element it shouldn't be so slow right?
Btw great article, like you said a table without indexes is incomplete.
Thanks,
Peter.
Thanks for the question. The order of the columns in an index is significant. If the first (left-most) column of the index is not in your :conditions, then the index cannot be used. The index is primarily sorted on the first column of the index (for example, the last name of everyone in the phone book). All other columns of the index are used only if necessary.
In those two queries, the index can't be used at all. This index would help those two queries:
(address, first_name)
I hope this helps! Here's a website that explains indexes in more detail: http://www.websitedatabases.com/database-index.html
Post a comment: