order_by doesn't seem to order by UTF-8

I’m using a Postgres database and I’m having issues sorting the columns by UTF-8.
It looks like it sorts correctly from A-Z but in the database I’ve got a name starting with the letter Å, and this is comes first in the returned query set.

I’m not quite sure how to fix this. Anyone know what I might be missing in my settings?

Admittedly, I’ve never had to deal with this type of situation - I don’t even know how it should be sorted. But…

Postgres gives you the ability to apply a collation to a database. See the lc_collate option for CREATE DATABASE. Also see Locale Support, and Collation Support.

I tried creating a database and table with applying the “C” collation, and I can see where it does (on my system) sort Å after Z, so something is happening there.

Ken

Thanks for the quick response @KenWhitesell!

Firstly is should be sorted A-Z then Æ,Ø and Å.
I’m sort of a beginner in this world, so how do I change the collation of my existing postgres database?
I tried figuring out if it would be possible to change the settings within my settings.py (Django settings), but that didn’t seem to work.

With possibly a few exceptions, the ORM works by translating querysets into SQL that gets submitted to the database. So an order_by clause gets converted to an ORDER BY clause within that SQL - which is processed by the database.

Anyway, that long-winded statement is my attempt to explain why there wouldn’t be a setting in Django to do this - it’s really out of Django’s hands.

To answer your other question, no, I’m not aware of any “live” mechanism to change lc_collate. I believe that when your database is created, that collation specification is wired into your database at a low level - at a minimum, it’s going to affect how indexes are created and managed, along with the possibility of several other internal structures as well.

Since I’m not familiar with the different collation options and how they affect sorting, my approach would be to use pg_dump to get a copy of my database in an “insertable” format. Then I would create a test database with a particular setting and import my data, followed by running a couple queries to see if I got my desired results.
If it’s not right, drop the database and try with a different setting.

I’m sure there are plenty of people around who would have a more direct answer, I’m just not one of them. You might find some more specific and directed assistance in a Postgresql-related forum or mailing list.

(Note, setting lc_collate to C on my system sorts them as A-Z, Å, Æ, and then Ø.)

I’m currently looking around for a way to have Ä appear the correct way in German. Someone is telling me that it should come right after A and not at the end.

Currently, neither SQLite nor Postgres is doing what I want for German.

So I got interested in whether Django could potentially add the locale to the query. But it cannot really do that in a smart way. SQLite seems to need to be compiled with some ICU support, and Postgres seems to need whole databases initialized with a special locale. Or that was my findings after a bit of trying.

So currently, I don’t think that there is a simple database-agnostic way of ordering querysets by the currently active locale. But I’d be happy to be proven wrong.

The Django ORM supports ordering on expressions. You can use the Collate() database function to order using a collation of your choice.

This is how to set the default ordering to use the German collation:

from django.db import models
from django.db.models.function import Collate

class Author:
    name = models.CharField("Name"), max_length=255, db_index=True)

    class Meta:
        ordering = [Collate("name", "de-x-icu")]
>>> Author.objects.all()
<QuerySet [<Author: Otto>, <Author: Ötto>, <Author: Veronika]>
1 Like