To make sure no one jumps the gun on trying to “solve” this problem I’ll start this thread by stating that I don’t expect any actions to be taken as part of this discussion but I figured it was worth sharing this finding nonetheless to document the state of things
As some of you may know ORM transforms allow for specialized lookups to be registered against them. This can be quite handy when you want to define specialized lookups that override the ones that are registered on your transform’s output_field.
We do make use of this lookup specialization pattern within core with the ExtractYear transform which powers the date(time)?_field__year lookup by having __exact, __gte, and friends result in SQL BETWEEN lookups to allow users to take advantage of a potential b-tree index on the left-hand-side of the lookup.
If the above statement sounds cryptic to you what I mean is that doing something along the lines of
class Article(models.Model):
published_at = models.DateField()
Article.objects.filter(published_at__year=2027)
results in
WHERE published_at BETWEEN '2027-01-01' AND '2027-12-31'
and not
WHERE EXTRACT(YEAR FROM "published_at") = 2027
which is beneficial if you have an index covering published_at at the first version allows for it to be used while the second doesn’t.
I think this is a good default and documented I wanted to bring attention that the relatively recent addition of support for functional indexes might encourage users to index the __year transform directly expecting it, understandably, to cover lookups against it.
In other words, defining an index on published_at__year like so
class Article(models.Model):
published_at = models.DateField()
class Meta:
indexes = [
models.Index(
"published_at__year",
name="publication_year_idx",
)
]
Would not cover accesses of the form published_at__year=2027 as the later is designed to take advantage of a potential index on published_at.
Given the ORM allows for lookups and transforms to be defined on a per-field instance basis there is an escape hatch here (Article._meta.get_field("published_at").register_lookup) but it can be hard to identify.
I’m bringing this up as an example because I suspect the interest for adding UUIDv7 support to the ORM with likely result in considering the addition of uuid_field__timestamp transform and I think that implementing it via uuid_extract_timestamp (and other vendor equivalent) will bring a similar debate as to whether we should implement __exact, __lte and friends the same way we did with ExtractYear by having uuid_field__timestamp__exact generate SQL of the form
WHERE uuid_field BETWEEN (
{seconds}-{ms}-7000-0000-000000000000
AND {seconds}-{ms}-7ffff-ffff-ffffffffffff
)
where timestamp_random_0 is the UUID for the specified timestamp with 0 filled for random bytes and timestamp_random_max is the analogous with F filled for random bytes.
instead of the naive
WHERE uuid_extract_timestamp(uuid_field) = ?timestamp
Given UUIDv7 are likely picked for their indexability I would argue that even if the __transform gotcha exemplified with __year exists I do believe that we should consider specializing the lookups of a potential UUIDTimestamp(Transform) like we did with ExtractYear.