How can I securely encrypt spatial fields (GeoDjango / PostGIS) in Django?

I’m working on a Django project with GeoDjango models that store user location data (e.g., PointField, LineStringField). Because location data is highly sensitive, I want to make sure it’s secured (ideally encrypted) at rest in the database.

The challenge is that most Django field encryption libraries (like django-cryptography) work well for a standard CharField or TextField, but don’t appear to support spatial fields directly.

Ideally, I’d like to keep using Django’s model field API so that saving/retrieving encrypted geometries feels natural.

Has anyone implemented a secure way to encrypt GeoDjango fields?

The problem with this question is your use of the phrase “at rest”, as that typically means “when the database server isn’t running” rather than “when the data is not currently in transit”.

Encrypting a database at rest isn’t all that difficult really. If you’re using a cloud provider like AWS RDS for example, this often comes for free by checking a box in the console. When the server is not running, the database is stored on an encrypted disk and is therefore “encrypted at rest”. If you’re self-hosting, then it’s just a process of using an encrypted partition for your database storage. If someone breaks into your rack and steals the hard drive, the data is inaccessible.

However, I get the feeling that you’re using “at rest” to refer to the other definition: you want the data to be accessible only to those with the key(s) while the server is running. An example of this might be other PII like a passport number in the database. The user might provide the key that your application uses to decrypt this before doing something with that data, or your application may function without having to know that data, and it’s instead pulled out of the database in its encrypted form and passed to the client, which does the decryption on-device. You see this in systems like Bitwarden for example, where the server doesn’t want to know what’s in there. In these cases, the client is doing all the encryption/decryption and the server is just storing encrypted blobs. If the server is ever compromised, even when it’s running, a data leak isn’t a problem because the data’s all encrypted.

The trouble is with querying. It’s easy enough to store a blob for a user, but if you want to query for something like “the number of points within 100m of this other point”, your database needs to know what that data is, and therefore it must be available unencrypted to the running database server. There’s not really a way around that unless you’re doing something like pgcrypto, but I don’t think that supports GIS fields… you may want to look into it yourself if you’re curious.

I suppose if you were really determined, you might use pgcrypto to encrypt the text representation of a PointField, for example, and then use its crypto functions to decrypt the values on-the-fly. To do this in Djangoland, you’d probably want to subclass the PointField class and apply the above in .contribute_to_class() (I think). This would allow you to continue to do GIS-based queries, albeit slowly, as it’d have to decrypt every value before doing anything, and a spatial index can’t help with encrypted values.
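To make the pgcrypto idea a bit more concrete, here’s a rough sketch of what the raw SQL might look like. This is untested against a real database, and the table name (tracker_location), the bytea column (geom_enc) and the helper name are all made up for illustration. It assumes both the pgcrypto and postgis extensions are installed, and that values were written with something like pgp_sym_encrypt(ST_AsEWKT(geom), key).

```python
# Hypothetical table/column names; only the pgcrypto and PostGIS
# function names (pgp_sym_decrypt, ST_GeomFromEWKT, ST_DWithin,
# ST_MakePoint, ST_SetSRID) are real.
COUNT_NEARBY_SQL = """
SELECT count(*)
FROM tracker_location
WHERE ST_DWithin(
    -- decrypt every row's EWKT text, then rebuild the geometry
    ST_GeomFromEWKT(pgp_sym_decrypt(geom_enc, %s))::geography,
    ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography,
    %s  -- radius in metres (geography units)
)
"""


def nearby_count_query(key, lon, lat, radius_m):
    """Return (sql, params) for counting rows within radius_m of a point.

    The key is passed as a query parameter, so it still travels to the
    database server and is visible to it while the query runs.
    """
    return COUNT_NEARBY_SQL, [key, lon, lat, radius_m]
```

You’d run that with a plain cursor, e.g. cursor.execute(*nearby_count_query(key, -0.1278, 51.5074, 100)). Note that every row gets decrypted for every query, which is exactly the slowness mentioned above.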

The other problem, of course, is that the decryption key would have to be passed to the query manager somehow, and unless you just have one key for the whole server, you’d be storing the same data multiple times, once for each key that’s permitted to decrypt it. If you have a single key for the whole server, that’s not much different from not encrypting the field data at all, since compromising the running server would necessarily leak the key.

Given the above, you can imagine why this isn’t something people usually do. It can make sense for data that’s only needed by the user, but if you’re going to store it so it can be queried later (typical for GIS data), then encryption is usually limited to at rest (when the server isn’t running) and in transit (sending the query result over the network from the db to Django and from Django to the client).

So, if you’re storing this data only to be made accessible to the user and not be queryable by the server, then you don’t actually need GeoDjango. Just store the field as a string and encrypt it client-side. That would eliminate the risk of a server compromise exposing it. If however you need to query against this data, then you may have to settle for having it unencrypted when the server is running and instead consider other useful tricks, like storing only what is expressly needed by your application and/or fuzzing the data (reducing resolution, moving data points by small but random values) to obfuscate the exact position, if indeed knowing the exact position is unnecessary.
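As a rough idea of what that fuzzing could look like, here’s a minimal standard-library sketch. The function names, the 3-decimal default and the 100 m default are my own choices for illustration, and the metres-per-degree figure is an approximation that’s fine for small offsets away from the poles.

```python
import math
import random

# Approximate length of one degree of latitude in metres.
METERS_PER_DEGREE_LAT = 111_320.0


def reduce_resolution(lon, lat, decimals=3):
    """Round coordinates; 3 decimals is roughly 100 m of latitude."""
    return round(lon, decimals), round(lat, decimals)


def jitter(lon, lat, max_meters=100.0, rng=None):
    """Move a point by a random offset of at most max_meters.

    Breaks down near the poles, where cos(lat) approaches zero.
    """
    rng = rng or random.Random()
    # sqrt() spreads the offsets evenly over the disc instead of
    # clustering them around the original point.
    distance = max_meters * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    d_lat = distance * math.cos(bearing) / METERS_PER_DEGREE_LAT
    d_lon = distance * math.sin(bearing) / (
        METERS_PER_DEGREE_LAT * math.cos(math.radians(lat))
    )
    return lon + d_lon, lat + d_lat
```

The jitter needs to be applied once, before the value is stored. If you re-fuzz the same true position on every read or every save, someone can average the results and recover something close to the original.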


Hi there, thanks for such a detailed response. It is largely the same conclusion that I had come to, and I think my question originated from a slight misunderstanding of the concept of ‘encrypted at rest’ and the fact that my cloud database is actually already encrypted at rest. I was essentially concerned that I could see location data in my admin site (currently no active users, just test data), but of course those are just fields being queried by the ORM, so that doesn’t mean the data isn’t encrypted at rest per se.

My plan is to limit the exposure of location fields in the admin site, sample the data so it’s less granular, and continue with the PostGIS setup as I have it currently. I did play around with changing LineString fields to array fields and encrypting these using django-cryptography, but it became messy very quickly! I also believe there is an open issue on django-cryptography for spatial field encryption, so it will be interesting to see if that goes anywhere… although I would imagine you would lose all of PostGIS’ capacity for native querying as soon as you did that, so it’s probably limited in scope.
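For the sampling part, something along these lines is what I have in mind. It’s only a sketch: the function name and the step value are placeholders, and it works on a plain list of (lon, lat) tuples rather than a GeoDjango LineString.

```python
def downsample(coords, step=5):
    """Keep every step-th vertex of a track, always keeping the last one."""
    if step < 1:
        raise ValueError("step must be at least 1")
    if len(coords) <= 2:
        # Nothing sensible to drop from a one- or two-point track.
        return list(coords)
    sampled = list(coords[::step])
    # Slicing drops the final vertex unless its index is a multiple of
    # step, so add it back to preserve the track's end point.
    if (len(coords) - 1) % step != 0:
        sampled.append(coords[-1])
    return sampled
```

The thinned list can then be turned back into a LineString before saving.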