by Ariejan de Vroom
Ruby on Rails: UUID as your ActiveRecord primary key
Sometimes, using the good old ‘auto increment’ from your database just isn’t good enough. If you really require that all your objects have a unique ID, even across systems and different databases, there’s only one way to go: UUID, or Universally Unique IDentifier.
A UUID is generated in such a way that every generated UUID in the world is unique. For example: 12f186e6-687e-11ad-843e-001b632783f1. This string is generated from several factors (for a time-based UUID like this one, a timestamp and the network address of the generating machine) that guarantee its uniqueness.
Anyway, you want to replace the default integer-based primary keys in your model with a UUID. This is quite easy, but there are some caveats.
First off, you should have a column in your database table that holds the UUID. You may be tempted to just change the column definition for id from integer to string and be done with it. This may appear to work fine in development, and maybe even in production, but you might be in for some surprises.
The best example of such a surprise is RSpec. RSpec uses ‘rake db:schema:dump’ to create a schema dump to quickly load the test database with. However, ‘schema:dump’ does not look at the id column in your database, but instead adds the default (integer) primary key definition from the ActiveRecord adapter.
The solution is to disable the id column and create a primary key column named uuid instead.
    create_table :posts, :id => false do |t|
      t.string :uuid, :limit => 36, :primary => true
    end
In your Post model you should then set the name of this new primary key column.
    class Post < ActiveRecord::Base
      # On Rails 3.2 and later, use: self.primary_key = "uuid"
      set_primary_key "uuid"
    end
The next step is to create the UUID itself. We’ll have to do this in the Rails app, because most databases don’t support UUID out of the box.
First install the uuidtools gem
sudo gem install uuidtools
Create a file like lib/uuid_helper.rb and add the following content.
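    require 'rubygems'
    require 'uuidtools'

    module UUIDHelper
      # Assign a time-based UUID just before the record is first saved.
      # Note: uuidtools 2.x namespaces the class as UUIDTools::UUID.
      def before_create()
        self.uuid = UUID.timestamp_create().to_s
      end
    end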
Then, include this module in all UUID-enabled models, like Post in this example.
    class Post < ActiveRecord::Base
      set_primary_key "uuid"
      include UUIDHelper
    end
Now, when you save a new Post object, the uuid field is automatically filled with a Universally Unique Identifier. What else could you wish for?
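If you would rather not depend on the uuidtools gem, the same idea can be sketched with Ruby's standard library. This is only an illustration under a few assumptions: it uses SecureRandom.uuid, which produces random (version 4) UUIDs rather than the time-based ones above, and the names RandomUUIDHelper, assign_uuid and FakePost are made up for this example. FakePost is a plain Ruby stand-in, not an ActiveRecord model.

```ruby
require 'securerandom'

# Hypothetical alternative to the UUIDHelper module: uses the standard
# library's SecureRandom.uuid (random, version 4) instead of uuidtools.
module RandomUUIDHelper
  # Assign a UUID only if the object does not have one yet, so calling
  # this twice never changes an existing identifier.
  def assign_uuid
    self.uuid ||= SecureRandom.uuid
  end
end

# Plain Ruby object standing in for an ActiveRecord model with a uuid column.
FakePost = Struct.new(:uuid) do
  include RandomUUIDHelper
end

post = FakePost.new
post.assign_uuid
puts post.uuid
```

In a real model you would call something like assign_uuid from a before_create callback, exactly where UUIDHelper sets the uuid.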
Besides the obvious benefit of being unique, UUIDs are also a good way to obfuscate linear (and thus predictable) auto increment sequences.
The uniqueness of UUIDs is *practically* guaranteed. If you check http://en.wikipedia.org/wiki/UUID#Random_UUID_probability_of_duplicates you’ll see that the probability of two UUIDs being the same is very slim.
If you generate 70,368,744,177,664 UUIDs a year (yes, 70 trillion), the chance of hitting one duplicate is about 0.0000000004 (4 x 10^-10). To quote Wikipedia: “after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.”
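To get a feel for where such numbers come from, here is a rough sketch of the usual birthday-bound approximation. It assumes purely random (version 4) UUIDs with 122 random bits, which is the case the Wikipedia section discusses and not the time-based UUIDs generated above. The method name collision_probability is made up for this example.

```ruby
# Approximate probability of at least one duplicate among n random UUIDs,
# using the birthday bound: p ~= 1 - exp(-n^2 / (2 * N)),
# where N is the number of possible values.
# Assumption: version 4 UUIDs, which carry 122 random bits.
def collision_probability(n, random_bits = 122)
  space = 2.0**random_bits
  1.0 - Math.exp(-(n.to_f**2) / (2.0 * space))
end

per_year = 2**46 # 70,368,744,177,664 UUIDs
puts collision_probability(per_year)
```

For 2^46 UUIDs this works out to roughly 4.7 x 10^-10, the same order of magnitude as the figure quoted above.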
Ariejan
Hey, thanks for the post. I’m working on a distributed service that uses Rails and requires UUIDs. I decided not to go the route that you did (setting the primary key to UUID) because of what I had read about InnoDB performance depending on integer primary keys. See http://kccoder.com/mysql/uuid-vs-int-insert-performance/ and http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/
So instead I’m using UUID as just a normal column rather than a primary key, and doing find_by_uuid where necessary. This has some drawbacks of course: some of the ActiveRecord association methods, such as create, don’t work via the UUID, and I have to use find_by_uuid instead of just find. ActiveResource also needs some arm twisting to work with UUIDs.
I’m interested in your perspective on this. Are there other DBs (PostgreSQL?) that behave better using UUIDs in the way that you recommend?
Thanks much!
logan
@logan: I also thought about this. I ran a few benchmarks and with all the ‘default’ values (MySQL, InnoDB), there’s not really a noticeable difference in performance between integer-based ids and UUIDs. I ran tests with tables containing up to 100k rows.
It’s very likely that with bigger tables you’ll see a performance decrease. So, it all really depends on how much data you want to store.
For me, UUIDs are used to obfuscate the id from the end user. This makes guessing, for example, a user id less likely. For that reason I’m going to stick with the solution you propose: just keep using integer-based ids, and add a special ‘secret’ column for usage in URLs (did you think of overriding the to_param method on your AR class?).
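As a rough illustration of the to_param idea: Rails URL helpers call to_param on an object to build the id segment of a path, so returning the uuid there keeps the integer id out of URLs. The sketch below uses a plain Ruby class as a hypothetical stand-in for an ActiveRecord model, and SecureRandom.uuid from the standard library in place of uuidtools; both are assumptions made for the example.

```ruby
require 'securerandom'

# Hypothetical plain-Ruby stand-in for a model that keeps an integer id as
# its primary key and a separate uuid column for use in URLs.
class User
  attr_reader :id, :uuid

  def initialize(id)
    @id = id
    @uuid = SecureRandom.uuid
  end

  # Rails URL helpers use to_param for the :id segment of a path,
  # so the public URL carries the uuid rather than the integer id.
  def to_param
    uuid
  end
end

user = User.new(42)
puts "/users/#{user.to_param}"
```

In the controller you would then look the record up with find_by_uuid(params[:id]) instead of find.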
If your system contains fewer than 100k rows, there shouldn’t be much of a problem; just make sure the uuid column is indexed properly.
Your question about DBs that support UUID out of the box is quite a valid one. I know that the (old) Lotus Notes databases only used UUIDs as sequences. PostgreSQL now has a ‘uuid’ column type which can be used for exactly this. It’s supposed to work quite well, but I’m not sure the ActiveRecord adapter supports it yet. Maybe this is worth a look ;-)
Interesting, though as you say more useful for sharing across systems without having to deal with conflicts.
Having said that, even though UUIDs are supposed to be universally unique, I don’t buy that there is no possibility whatsoever that two separate systems will generate the same id.
Though then again, if you control the systems then I guess that is not so much of an issue. :)
On another note, it would be interesting to see any performance hits one might encounter from having a 36 character string as an ID.