I have seen people discouraging the use of UUIDs as primary keys for the database. I understand that it affects performance because of the randomness of the UUID.
I was wondering what to use as an alternative method? If I am using UUID on the user interface, then I have to query the database by the UUID anyway, even if I am using an auto-increment integer as a primary key. Isn’t that affecting the performance the same way?
Let’s assume a model named Book with the default .
id = models.AutoField(primary_key=True)
name = CharField(...)
reference = UUIDField(...)
The user should have a UUID that is associated with a book that she/he owns. So the UI will always present the
Book.reference (UUID) of the book. Let’s assume that I have a form that allows editing the name of the book at
/books/<reference:UUID>/edit. When the user submits the form, a query must be done to the database based on the
reference, that is the UUID.
Therefore my question is, how to use the more efficient
AutoField as private key, if what I am using to query the database is the UUID.
Thank you for reading.
If you want to do some reading on this topic, I found this post that aggregates some other posts as well as providing information of it’s own: UUIDs are Popular, but Bad for Performance — Let’s Discuss - Percona Database Performance Blog
It appears that the performance issues primarily concern doing inserts in the index for UUIDs. Since the UUIDs are (effectively) randomly ordered, your database is going to experience a lot more overhead in adding a new UUID rather than just adding the new sequential key on at the end.
What you’d want to consider is whether or not this difference is going to matter depending upon the volume of transactions occurring on the table using UUIDs as an indexed field. My guess is you could probably get a feel for the magnitude of this effect by running a fairly simple test.
Thank you for the reply and the reference.
I have seen that article. I am not questioning the performance hit on the database; I accept the research of the experts.
Taking this as a given, my question is how do it differently, if UUID is used on the user interface, as the example I gave on the first post of this thread.
I’m not following then what you’re asking.
If you’re creating an index on a UUID, you’re incurring the performance hit at the time the row is being inserted. There’s no effect on queries.
It was not clear to me that only the write is affected, and not the read. Thank you.