On Why id is Faster
November 14, 2011 § 1 Comment
Database performance is always something that has fascinated me. Not on a admin level, but from the point of view of a developer. Even with a stock database setup there are things that a developer can do to help optimize access. If you do any middle or backend web development you will be familiar with running queries against a database. You will also be familiar with how adding indexes on the commonly queried columns of your tables can increase performance. But it turns out, that not all indexes are created the same.
I have been working with the InnoDB storage engine of MySQL in a Ruby on Rails web development for a while now. Over the years I began to noticed something but was never able to figure out what was causing it . When ever I would queries tables using the id column, The queries would return much faster than when I queried against some other indexed column, even if that column was an indexed integer. For those not familiar with Rails conventions, every table is created with an integer column, id, that has the table’s primary index placed on it. It is auto-incrementing so every row has a unique value, a requirement of it being a primary key. Now the InnoDB constructs the primary key in a special way, using a clustered index. What this means is that the primary key index points directly to the page the row data resides on either on disk or in memory. Non-clustered, or secondary indexes, point to the primary key instead of the row data. This little bit of indirection cause the secondary indexes to be slower than a look up on the primary key.
It’s always satisfying figuring out the reasons behind behavior you do not initially understand.
For some more information you can check out the MySQL Reference Manual. This investigation was prompted by an answer to a question I posted on Stackoverflow about partial indexes and the group by operation.