Broken-down-models - a model refactoring library

There has been a series of presentations and discussions on the topic of “what to do when you have a 10K-line Django project”. On behalf of Matific, I would like to suggest a partial answer to some of the issues in that direction.

Break a large model down, transparently

In a Django project that goes on for several years, models tend to grow and accumulate fields. If you aren’t very disciplined about this, you wake up one day, and find that one of your central tables, one with millions of rows, has 43 columns, including some TextFields. Most of them are not required most of the time, but the default (and common) use is to fetch all of them; also, since this table is queried a lot, the mere fact that it has so many columns makes some of the access slower.

When you realize that, you want to break it into components, such that only a few, most-important columns will participate in the large searches, while further details will be searched and fetched only when needed.
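To make the idea concrete at the database level, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (user_core, user_details, biography and so on) are made up for illustration; they are not taken from any real project or from the library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical narrow "core" table: the few columns most queries need.
    CREATE TABLE user_core (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        is_active INTEGER NOT NULL
    );
    -- Hypothetical "details" table: bulky, rarely needed columns,
    -- keyed by the same id as the core row.
    CREATE TABLE user_details (
        user_id INTEGER PRIMARY KEY REFERENCES user_core(id),
        biography TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO user_core VALUES (1, 'Dana', 1)")
conn.execute("INSERT INTO user_core VALUES (2, 'Eli', 0)")
conn.execute("INSERT INTO user_details VALUES (1, 'A long biography')")
conn.execute("INSERT INTO user_details VALUES (2, 'Another long text')")

# The large searches touch only the narrow table.
active_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user_core WHERE is_active = 1 ORDER BY id"
    )
]

# Further details are joined in only when they are actually needed.
biography = conn.execute(
    "SELECT d.biography FROM user_core c "
    "JOIN user_details d ON d.user_id = c.id WHERE c.id = ?",
    (1,),
).fetchone()[0]
```

The trade-off is that reading a detail column now costs a join (or a second query), in exchange for smaller rows in the table that is searched most often.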

But that is a scary proposition – it might involve subtle code changes, break not just field access but also ORM queries… and this is a central model. The change imagined is open-heart surgery on a large project. Maybe, if we look the other way, it won’t bother us too much…

Broken-Down-Models is here to help you. This is a library which can help you refactor your large model into a set of smaller ones, each with its own database table, while most of your project code remains unchanged.

https://broken-down-models.readthedocs.io/en/stable/


This project was born out of a DBA’s comment that fetching all the fields of a large table was maybe not a great idea. The documentation includes some benchmarks that, IMO, show that it offers some interesting performance trade-offs.

The library makes heavy use (one might even say, abuse) of Django’s multi-table inheritance (MTI) to do the “magic” where a model is taken apart, without requiring changes in code using it (for the most part).

While working on it, we also implemented bulk_create() for MTI models; Django, currently, does not support this. We needed it to be done in a special way because of our unorthodox use of MTI, but I think it suggests that a “normal” implementation may not be too hard.

Thought y’all might find this interesting, and/or have helpful comments before we deploy this in production. Also, there’s already a couple of suitable tasks to do in case you want to get involved.

Very cool! I’ve heard DBAs call this technique “vertical partitioning”. I’ve sometimes done it manually, with separate models linked by one-to-one fields, but it’s nice to see a tool that makes it easy to refactor that way.

Do you have a Django ticket for the bulk create enhancement? That would be nice to see in core if it’s possible.

Thanks!

There’s a ticket and even a PR for bulk-create on MTI (neither is mine), but it seems work on this has stopped. As might be expected, that work exposes both problems and opportunities which the current code in broken-down-models is not written to handle.
