12 Aug 2008, 11:30pm

by Ariejan de Vroom

Ruby on Rails: UUID as your ActiveRecord primary key

Sometimes, using the good old ‘auto increment’ from your database just isn’t good enough. If you really require that all your objects have a unique ID, even across systems and different databases, there’s only one way to go: UUID, or Universally Unique IDentifier.

A UUID is generated in such a way that, for all practical purposes, every generated UUID in the world is unique. For example: 12f186e6-687e-11ad-843e-001b632783f1. This string is generated from several factors (for a time-based UUID like this one, a timestamp and the machine’s MAC address) that make a duplicate extremely unlikely.
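As a small illustration of that layout: the version of a UUID is stored in the first hex digit of its third group. The sketch below reads it from the example above. The helper name uuid_version is made up for this illustration, and SecureRandom.uuid (from Ruby’s standard library, which produces random version 4 UUIDs) is only there for comparison; it is not the generator used in the rest of this post.

```ruby
require 'securerandom'

# Hypothetical helper: returns the version number encoded in a UUID string.
# The version is the first hex digit of the third dash-separated group.
def uuid_version(uuid)
  uuid.split('-')[2][0, 1].to_i(16)
end

example = "12f186e6-687e-11ad-843e-001b632783f1"
puts uuid_version(example)            # 1, a time-based UUID
puts uuid_version(SecureRandom.uuid)  # 4, a random UUID
```

Both kinds share the same 36-character format, so they fit in the same column.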

Anyway, you want to replace the default integer-based primary keys in your model with a UUID. This is quite easy, but there are some caveats.

First off, you need a column in your database table that holds the UUID. You may be tempted to just change the column definition for id from integer to string and be done with it, but this won’t work as expected. On your development system, and maybe even in production, this may seem to work fine, but you might be in for some surprises.

The best example of such a surprise is RSpec. RSpec uses ‘rake db:schema:dump’ to create a schema dump to quickly load the test database with. However, ‘schema:dump’ does not look at the type of the id column in your database, but instead adds the default primary key definition from the ActiveRecord adapter.

The solution is to disable the id column and create a primary key column named uuid instead.

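create_table :posts, :id => false do |t|
  # the column must always be set and be unique, like a primary key
  t.string :uuid, :limit => 36, :null => false
end
add_index :posts, :uuid, :unique => true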

In your Post model you should then set the name of this new primary key column.

class Post < ActiveRecord::Base
  set_primary_key "uuid"
end

The next step is to create the UUID itself. We’ll have to do this in the Rails app, because most databases don’t support UUIDs out of the box.

First install the uuidtools gem

sudo gem install uuidtools

Create a file like lib/uuid_helper.rb and add the following content.

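require 'rubygems'
require 'uuidtools'

module UUIDHelper
  # Runs once, just before a new record is inserted, and assigns its key.
  # Note: uuidtools 2.x and later namespace the class as UUIDTools::UUID.
  def before_create
    self.uuid = UUID.timestamp_create.to_s
  end
end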

Then, include this module in all UUID-enabled models, like Post in this example.

class Post < ActiveRecord::Base
  set_primary_key "uuid"
  include UUIDHelper
end

Now, when you save a new Post object, the uuid field is automatically filled with a Universally Unique Identifier. What else could you wish for?
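To see what the hook does without setting up a database, here is a minimal sketch in plain Ruby. Everything in it is a stand-in: FakePost only mimics the small part of an ActiveRecord model that matters here, UUIDHelperSketch mirrors the module above, and SecureRandom.uuid (random version 4 UUIDs from the standard library) replaces uuidtools so that no extra gem is needed.

```ruby
require 'securerandom'

# Stand-in for the UUIDHelper module from this post.
# Assumption: SecureRandom.uuid is used instead of uuidtools.
module UUIDHelperSketch
  def before_create
    self.uuid = SecureRandom.uuid
  end
end

# Hypothetical, database-free stand-in for an ActiveRecord model.
class FakePost
  include UUIDHelperSketch
  attr_accessor :uuid, :title

  def initialize(title)
    @title = title
    @persisted = false
  end

  # Mimics the relevant part of save: before_create runs on the first save only.
  def save
    before_create unless @persisted
    @persisted = true
  end
end

post = FakePost.new("Hello")
post.save
first_uuid = post.uuid
post.save                       # a second save must not change the key
puts post.uuid == first_uuid    # true
puts post.uuid.length           # 36
```

The point of hooking into creation rather than every save is that the key is assigned exactly once and then stays stable.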

13 Aug 2008, 6:50pm
by James Urquhart


Interesting, though as you say more useful for sharing across systems without having to deal with conflicts.

Having said that, even though UUIDs are supposed to be universally unique, I don’t buy that there is no possibility whatsoever that two separate systems will generate the same ID.

Though then again, if you control the systems then I guess that is not so much of an issue. :)

On another note, it would be interesting to see any performance hits one might encounter from having a 36-character string as an ID.

14 Aug 2008, 8:07am
by Ariejan de Vroom


Besides the obvious benefit of being unique, UUIDs are also a good way to obfuscate linear (and thus predictable) auto increment sequences.

The uniqueness of UUIDs is *practically* guaranteed. If you check http://en.wikipedia.org/wiki/UUID#Random_UUID_probability_of_duplicates you’ll see that the probability of two UUIDs being the same is very slim.

If you generate 70,368,744,177,664 UUIDs a year (yes, 70 trillion), the chance of hitting one duplicate is about 0.0000000004 (4 x 10^-10). To quote Wikipedia: “after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.”
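The 70 trillion figure can be checked with the usual birthday approximation, p ≈ 1 - exp(-n² / 2N), where N is the number of possible values. This is only a sketch under one assumption: it models random (version 4) UUIDs, which have 122 random bits, and not the timestamp-based ones from the post. The function name collision_probability is made up for this example.

```ruby
# Birthday approximation for the chance of at least one duplicate
# among n randomly generated values drawn from 2**random_bits possibilities.
# Assumption: 122 random bits, as in a random (version 4) UUID.
def collision_probability(n, random_bits = 122)
  1.0 - Math.exp(-(n.to_f**2) / (2.0 * 2**random_bits))
end

n = 2**46   # 70,368,744,177,664
puts format("%.2e", collision_probability(n))   # prints 4.66e-10
```

That is the same order of magnitude as the rounded 4 x 10^-10 above.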

15 Aug 2008, 1:23am
by logan


Ariejan

Hey, thanks for the post. I’m working on a distributed service that uses Rails and requires UUIDs. I decided not to go the route that you did (setting the primary key to UUID) because of what I had read about InnoDB performance depending on integer primary keys. See http://kccoder.com/mysql/uuid-vs-int-insert-performance/ and http://www.mysqlperformanceblog.com/2007/03/13/to-uuid-or-not-to-uuid/

So instead I’m using UUID as just a normal column rather than a primary key, and doing find_by_uuid where necessary. This has some drawbacks of course: some of the ActiveRecord association methods, such as create, don’t work via the UUID, and you have to use find_by_uuid instead of just find. ActiveResource also needs some arm twisting to work with UUIDs.

I’m interested in your perspective on this. Are there other DBs (PostgreSQL?) that behave better using UUIDs in the way that you recommend?

Thanks much!

logan

15 Aug 2008, 8:44am
by Ariejan de Vroom


@logan: I also thought about this. I ran a few benchmarks and with all the ‘default’ values (MySQL, InnoDB), there’s not really a noticeable difference in performance between integer-based ids and UUIDs. I ran tests with tables containing up to 100k rows.

It’s very likely that with bigger tables you’ll see a performance decrease. So, it all really depends on how much data you want to store.

For me, UUIDs are used to obfuscate the id to the end user. This makes guessing, for example, a user id less likely. For that reason I’m going to stick with the solution you propose: just keep using integer-based ids, plus a special ‘secret’ column for use in URLs (did you think of overriding the to_param method on your AR class?).
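A rough sketch of that to_param idea, in plain Ruby so it runs without Rails. SketchUser is a hypothetical stand-in for an ActiveRecord model with an integer id and an extra uuid column, and SecureRandom.uuid is just a convenient generator for the example.

```ruby
require 'securerandom'

# Hypothetical, database-free stand-in for a model that keeps its
# integer id and adds a separate uuid column for use in URLs.
class SketchUser
  attr_reader :id, :uuid

  def initialize(id)
    @id = id
    @uuid = SecureRandom.uuid   # assumption: any UUID generator would do
  end

  # Rails calls to_param when it builds a URL for a record; returning
  # the uuid keeps the sequential id out of the URL.
  def to_param
    uuid
  end
end

user = SketchUser.new(42)
puts "/users/#{user.to_param}"
```

In a real controller the lookup would then go through find_by_uuid(params[:id]) rather than find, as logan describes.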

If your system contains < 100k rows, there shouldn’t be much of a problem; just make sure the uuid column is indexed properly.

Your question about DBs that support UUID out of the box is quite a valid one. I know that the (old) Lotus Notes databases only used UUIDs as sequences. PostgreSQL now has a ‘uuid’ column type which can be used for exactly this. It’s supposed to work quite well, but I’m not sure the ActiveRecord adapter supports it yet. Maybe this is worth a look ;-)
