Structuring large/complex Django projects, and using a services layer in Django projects

Hi all,

This is a long post, so I’ve copied my two questions up top as a TL;DR. I’ve also bolded the questions within my post so you can find the surrounding context if you think it’d help your answers.

  • is there any publicly available (paid is ok) literature or training available that covers how to best structure a large, ongoing, mature Django project?

  • Does anyone have any experience writing complex Django applications with a services layer as discussed in the linked style guide? How has it gone for you? Has it saved more time than it’s spent having to override or reimplement Django’s provided batteries?

I am the sole developer of a Django-based web application. This application was developed by a small team (including myself) in a digital consultancy, so it is quite large. I have since moved on from that role and now work with the client full-time as their sole FT developer. I contract out as need be; however, so far I have done the vast majority of the development work.

If any of you have worked for an hourly-billing consultancy, or have seen code written by one, you will understand the potential shortcomings. The codebase is pretty good and does very well, but a lot of the work was done ad hoc and crucial things like tests have been largely ignored. The structure of the codebase may also be in need of some TLC with the hindsight of how big the project has become.

At least in the literature I’ve read, not much is said about how to structure these larger projects. Discussion of deliberately sterile ‘todo’ applications is largely unhelpful, as my application faces problems that are either deliberately sidestepped by malleably defined requirements, or never come up because toy applications lack the complexity.

So my first question is, is there any publicly available (paid is ok) literature or training available that covers how to best structure a large, ongoing, mature Django project? This project has moved far beyond a collection of generic CRUD views and does some impressive stuff. Unfortunately, I find that I cannot relate to a lot of conversations when discussing project structure. I’m unsure if this is due to the size of my project or because there are obvious things that I am missing.

Of all places, it was a sponsored post in @wsvincent and @jeff’s Django News newsletter that I saw the HackSoftware (never heard of them) Django style guide for enterprise applications. There is a fair bit of stuff in here that I think I disagree with. However, this document touches heavily on the concept of including a services layer in a project, which would encapsulate all business logic / actions outside of a view/form coupling, with the intent being that only the services layer interacts with model / ORM code.

The benefits of this are obvious: it’d be much easier to test business logic in the (fat) services layer without concerning yourself with views, forms, or most of the ORM. This would be especially advantageous to me as it’d encourage better test hygiene in my project. It also allows for easy re-use, yada yada.
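
As a rough sketch of that testability claim (plain Python with no Django, so it runs anywhere; `register_customer`, `InMemoryCustomers` and `DuplicateEmail` are hypothetical names and not taken from the style guide), a service function that only talks to a small storage interface can be exercised without views, forms, or a database:

```python
from dataclasses import dataclass, field


class DuplicateEmail(Exception):
    """Raised when a customer with the same email already exists."""


@dataclass
class InMemoryCustomers:
    """Test double standing in for ORM-backed storage (hypothetical)."""

    rows: dict = field(default_factory=dict)

    def exists(self, email: str) -> bool:
        return email in self.rows

    def add(self, email: str, name: str) -> dict:
        self.rows[email] = {"email": email, "name": name}
        return self.rows[email]


def register_customer(customers, *, email: str, name: str) -> dict:
    """Business rule lives here: emails are normalised and must be unique."""
    email = email.strip().lower()
    if customers.exists(email):
        raise DuplicateEmail(email)
    return customers.add(email, name)
```

In a real project the storage argument would presumably be backed by the ORM; the point is only that the rule itself can be tested in isolation.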

This is not really groundbreaking. It’s a common pattern, and obvious if you have an intuition for separation of concerns. However, in the world of Django, where the use of things like ModelForm and generic class-based views seems to be the norm, it is a bit of a departure. This is where the downsides come in. Things like generic class-based views (especially database-mutating views like CreateView, UpdateView, etc.) muddy the waters and, strictly speaking, could not be used in their default states. Things like ListView seem like they’d be fine. The magic of ModelForms, or at least the default save() method, could not be used. These seem like big time-savers to me at the moment.

I recall seeing an article (or maybe it was a conference talk or email) by Tom Christie about such an architecture, where he similarly advised against using Django REST Framework ModelSerializers and instead suggested using regular Serializers for mutating data via your services layer. This similarly seems like it would introduce a lot of extra work, and the DRF documentation seems to suggest that ModelSerializers and generic CRUD ViewSets are the way forward. A services layer seems like it’d break this. Have I understood the intent? Are these shortcuts not meant for me?

Does anyone have any experience writing complex Django applications with a services layer as discussed in the linked style guide? How has it gone for you? Has it saved more time than it’s spent having to override or reimplement Django’s provided batteries?

Thanks in advance.

Hi Kye,

Welcome to the forum. And what a post to start it off!

  • is there any publicly available (paid is ok) literature or training available that covers how to best structure a large, ongoing, mature Django project?

Here’s a quick Saturday braindump…

Octopus Energy have a few posts on their ideas of structure, plus their own style guide, see their tech blog: https://tech.octopus.energy/

This is probably the Tom Christie post you were thinking of: https://www.dabapps.com/blog/django-models-and-encapsulation/

This recent forum discussion covers many people’s approaches to apps and structure: “Why do we need apps?”

Dan Palmer at Thread on their 350-app Django structure: https://danpalmer.me/2018-03-02-scaling-django-codebases/

This book from Harry Percival and Bob Gregory that is in “beta” and can be read for free seems promising, quite a few people mentioned it to me/on Twitter in the last couple weeks: https://www.cosmicpython.com/ . It covers more general Python application design, from their starting point of wanting to make things testable.

  • Does anyone have any experience writing complex Django applications with a services layer as discussed in the linked style guide? How has it gone for you? Has it saved more time than it’s spent having to override or reimplement Django’s provided batteries?

Where I worked at YPlan we had some “services” layers but they weren’t enforced as the only architectural pattern.

I think you’ve hit the nail on the head that it can be a lot of time reimplementing “batteries”. Django’s forms and models layers are a bit more intertwined than I’d like, but this is a time-saving measure when getting things off the ground. Likewise with many other shortcuts that Django offers, they can come back to bite you when there’s a lot of code going through them.

I’m not sure there’s a general purpose answer out there at the moment. I’m not sure many of the larger Django projects feel like they’re “there” either.

Thanks for the resources Adam. Certainly a lot of helpful stuff here.

For anybody following along, https://github.com/octoenergy/conventions/blob/master/patterns.md is particularly compelling.

Relevant hot take from @ubernostrum:

Well now I don’t know what to think! 😂

I will probably write something longer in the near future, but I have several concerns about people advocating service layers; some are inherent to the pattern, others are Django-specific, and it’ll take more words to explain fully.

I’m also interested in the topic. I’ve never found a definitive answer to this question, and I feel that there isn’t one.

I’ve also read the “Django style guide for enterprise application.” Although there aren’t many things I could happily agree with, I love exploring how people do the business logic in Django and their approach was quite interesting.

I would happily listen to James’ opinion on service layers because I also don’t quite feel them and want to verify my thoughts.

As for business logic, I recommend reading this blog post: https://sunscrapers.com/blog/where-to-put-business-logic-django/. The author outlines the pros and cons of four popular approaches to doing business logic: Fat models, Views/Forms, Services, and Model managers.

Definitely interested in this thread.

Thanks so much for sharing amazing resources, just started to go through them with a hope of building an opinion about how to proceed with large/mature Django code bases.

If interested in Domain Driven Design

  • My org has recently started implementing Django code in DDD (Domain Driven Design) format and as Backend Developer I have started picking some knowledge around it in recent times (past couple of months) given the drive to move to DDD at work.
  • However the general sense I have so far built looking at the ongoing implementations (at my workplace) is that it has largely convoluted the code base and debugging has become way too painful since there are layers and layers of code before we finally reach the actual function that is operating as the brains behind what someone was trying to debug/fix.
  • But I kinda believe that this is due to my lack of familiarity with DDD as a design let alone being familiar with its code structure and this belief keeps pushing me to try and learn better ways to implement DDD.

Not so fun part of this initial learning/familiarizing curve:

  • I am currently in the middle of refactoring (think redesigning) some part of our platform and as a team, we are approaching building in DDD format.
  • As a result, I am currently scratching my head around understanding terms like aggregate, aggregate roots, entities, value objects, etc. and more importantly, how to convert this understanding into working code.
  • This is with a hope that I can clearly define these things in my LLD (Low Level Design) document and get to writing code to start implementing them.

Now with more words:

https://www.b-list.org/weblog/2020/mar/16/no-service/

Fast writing @ubernostrum ! And very clearly laid out arguments as always.

James, thank you for the post. I’ve been waiting for something like that. You’ve just confirmed the approach that has crystallized for me over years of working with Django: don’t try to bend Django’s abstractions. They’re well thought out, so embrace them!

This is super helpful. I really cannot thank @ubernostrum enough 🙌🏼

Some queries as I am trying to cement my understanding of @ubernostrum’s post:

Query 1:

The general idea is to try to put layers of indirection/abstraction in front of various components, so that the actual implementations of those components can be changed without breaking other code.

Here’s what I understand by layers of indirection/abstraction; please correct me if I’ve misread it.
Additional layers of abstraction result in indirection within the code, i.e. indirection between services/interfaces/etc., so that too many calls are made before reaching the final logic that is actually responsible for doing a certain thing (the logic we were trying to reach).

Now combining above phrase with first part of sentence:

The general idea is to try to put layers of indirection/abstraction in front of various components

What exactly is being recommended by “in front of various components” in the above sentence? Is it possible to share a code example of something that does this versus something that does not, or to explain it in slightly simpler words?
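
For illustration only, here is one possible reading of the contrast, as a plain-Python sketch with made-up names (`FakeOrm` is a stand-in for the data layer, not Django’s ORM, and `DocumentService` is hypothetical); it may not match exactly what the article has in mind:

```python
class FakeOrm:
    """Stand-in for the data layer (hypothetical; not Django's ORM)."""

    def __init__(self, rows):
        self.rows = rows

    def filter(self, **criteria):
        return [
            row for row in self.rows
            if all(row.get(key) == value for key, value in criteria.items())
        ]


# Without indirection: the caller talks to the data layer directly.
def list_view_direct(orm, company):
    return orm.filter(company=company)


# With indirection: the caller only knows the service's interface, so the
# storage behind it could in principle be swapped without touching the caller.
class DocumentService:
    def __init__(self, orm):
        self._orm = orm

    def documents_for(self, company):
        return self._orm.filter(company=company)


def list_view_via_service(service, company):
    return service.documents_for(company)
```

Both functions return the same rows here; the only difference is the extra hop through `DocumentService`, which is the "layer in front of a component".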

Query 2:

What’s much easier, and can be done quickly to clear out your cards for the current sprint, is copying someone else’s already-designed API, which means the service layer, over time, almost always ends up with an API that’s tightly coupled to whatever the underlying data layer’s API is, and there goes a lot of the claimed benefit of the service layer out the window.

Here, it seems we are assuming that most developers who build service layers (in Django or otherwise) usually end up copying them from some previous (known/accepted) implementation instead of rethinking what’s best suited to the problem at hand. Is this fair to say?

If so, then we are kinda saying it’s not applicable to those who actually think from the ground up when designing their service layers instead of copying from previous implementations (with, of course, the other points in the post about using service layers with Django still to be considered).

Thank you again.

Unlike most other people in this thread, I basically always write a service layer for even the simplest projects. I’ll try to offer insight on the service layer as someone who worked on a large project before it had a service layer and after it had one.

I look at Django less as a framework, and more as a collection of low-level libraries upon which to build our own abstractions to match our use cases. The ORM doesn’t know about our own access patterns! The router doesn’t know about our URL naming schemes! But they do handle a lot of the inner bits very well.

At one point in the past we were passing around raw query sets. So in our document list view we would do stuff like

    documents = Document.objects.filter(company=request.user.company)

Then we started introducing data access control. Some users could see all the documents, but others could only see documents for the user groups they belonged to. So your filtering becomes something like

    documents = Document.objects.filter(
        company=request.user.company,
        group__in=request.user.groups(),
    )

Now, there are other places where we would get documents. For billing/analytics purposes we would get them. In some places in background tasks we would get other groups. And in each of those call sites there would be a different meaning behind Document.objects.filter(company=request.user.company).

So now, sometimes years after someone wrote the initial ORM query, someone has to go in and decipher the “true meaning” of a query.

So now we do the simple thing and make sure that people who write the initial ORM query give it a name. This might feel like busywork but this is Python, not Java. It only takes a couple lines per query. And later refactors are much easier. This is our “service layer”, so to speak.

    # for "standard" tasks like "get data visible by user"
    # we use short names
    def get_documents(user):
        return Document.objects.filter(
            company=user.company,
            group__in=user.groups(),
        )

    # longer names with rarer params for queries
    # that aren't used as often
    def get_all_documents_in_company(company):
        # for tasks by administrators
        return Document.objects.filter(company=company)

    # and then more specialized queries for other tasks
    def get_current_document_count(company):
        return get_all_documents_in_company(company).count()

And for stuff like creation, Python keyword args make this super straightforward, because you can just pass along all the data at first. A two-line function then gives you a single refactor point when you decide to, say, add a notification system for document creation:

    def create_document(**kwargs):
        return Document.objects.create(**kwargs)

And now you can have peace of mind in knowing that every time some object is created, it’s going through the “right” constructor.
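
As a minimal sketch of that single refactor point (plain Python so it runs without a project; `_insert_document` stands in for `Document.objects.create`, and the `notifications` list is a hypothetical outbox, not a real notification system):

```python
created = []        # stand-in for the documents table
notifications = []  # stand-in for a notification outbox (hypothetical)


def _insert_document(**fields):
    """Stand-in for Document.objects.create(**fields)."""
    doc = {"id": len(created) + 1, **fields}
    created.append(doc)
    return doc


def create_document(**fields):
    """The one entry point every caller uses to create a document."""
    doc = _insert_document(**fields)
    # Added later: because all callers go through this function,
    # the notification only has to be wired in here.
    notifications.append(f"document {doc['id']} created")
    return doc
```

Call sites such as `create_document(title="Q1 report", company="acme")` do not change when the notification line is added.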

Ultimately, ORM methods don’t describe intent nearly enough, so it’s very hard to refactor after the fact, since you are missing so much context. Service layers in Python are cheap, and ultimately you don’t need that many entry points for most systems, even really big ones.

Service layers also allow you to do stuff like easily share validation between serializers and Django forms, have more unified error handling strategies across your codebase, and just generally make stuff easier to refactor.
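
A rough sketch of the shared-validation idea, in plain Python with hypothetical names: `validate_document_title` holds the rule once, and the two wrappers show what a form’s `clean_title()` or a serializer’s `validate_title()` might delegate to. The 100-character limit is an assumption made up for the example.

```python
class InvalidDocument(ValueError):
    """Single error type raised by the shared rule."""


def validate_document_title(title: str) -> str:
    """Shared rule: titles are trimmed, non-empty, at most 100 characters."""
    cleaned = title.strip()
    if not cleaned:
        raise InvalidDocument("Title must not be empty.")
    if len(cleaned) > 100:
        raise InvalidDocument("Title must be at most 100 characters.")
    return cleaned


def clean_title_for_form(raw: str):
    """Form-side wrapper: returns (value, errors)."""
    try:
        return validate_document_title(raw), []
    except InvalidDocument as exc:
        return None, [str(exc)]


def validate_title_for_api(raw: str) -> dict:
    """API-side wrapper: returns a payload with either the value or errors."""
    try:
        return {"title": validate_document_title(raw)}
    except InvalidDocument as exc:
        return {"errors": {"title": [str(exc)]}}
```

Both entry points report the same message for the same bad input, which is the "unified error handling" part.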

The alternative is to have much less confidence in extremely important things like access control, data integrity, and query correctness. Django often does not provide the tooling to capture all business logic, but writing quick wrapper functions can solve that for you. And having broken data is a much, much worse problem than having to import an extra couple of functions.

To be clear, Django does have well thought out tools for stuff, and this isn’t saying “throw it all out”. We still use form.errors. But you can still use a service layer to share the right things between components, and let Django handle just the lower-level parts of things.

Your examples don’t seem to require a “service layer” in any way, though. For example, your Document model’s manager could implement a get_for_company() method that expresses intent just as clearly as your get_all_documents_in_company() function. Or the Company model could implement a get_all_documents() method (or if, as seems likely, there is a database-level relation between them, a given Company instance probably already has a documents descriptor you could use).

So I’m still very, very skeptical of the alleged need for creating another intermediary layer, and I become more skeptical the deeper I get into your post. When you say, for example, “you can have peace of mind in knowing that every time some object is created, it’s going through the ‘right’ constructor”, I’m confused, because someone who would mistype a constructor name implemented at the model/manager layer could just as easily mistype a constructor name implemented in the service layer. So if you are afraid developers will call the “wrong” constructor, this does not solve your problem.

And there are plenty of patterns for adding extra pre/post processing logic in methods – for your case of potentially wanting to add notifications on creating documents, you can implement them in save() on Document, for example. Exactly where and how to do this will depend on exactly what you want to do, but again there is nothing here that requires implementing a service layer to accomplish. Which gets back to what I wrote about: service layers create extra maintenance burden for, as far as I can see, no real benefit, since the things people say they do in service layers can always be done easily (and, yes, even preserving “intent” in method naming) without the extra overhead of the service layer.

I don’t want to get into a deep argument about this (I wrote about this with a different example a couple of years ago), but there are of course a bunch of hooks in a lot of places. And Python’s Turing-complete!

I think that a lot of the hooks are error-prone in real-world examples. For example, for “send notification when an object gets created”, the simple implementation is “stick something into save()”. Except most people write it and forget to do the if self.pk is None check, so then edits are also doing it. Or in the model form you forget to check if commit is True, and now you’re running a side effect in some validation code that calls form.save(commit=False).
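
To make the pitfall concrete, here is a small sketch using a plain-Python stand-in for a model (no Django; `_store`, `sent` and the function names are hypothetical). The hook style needs the pk check inside `save()`; the explicit style keeps the side effect in the create path only:

```python
import itertools

_ids = itertools.count(1)
sent = []  # stand-in for a notification outbox


class Document:
    """Plain-Python stand-in for a model; pk is None until first stored."""

    def __init__(self, title):
        self.pk = None
        self.title = title

    def _store(self):
        """Stand-in for the plain database write."""
        if self.pk is None:
            self.pk = next(_ids)

    def save(self):
        """Hook style: side effect inside save(), guarded by the pk check."""
        is_new = self.pk is None  # forget this and edits notify too
        self._store()
        if is_new:
            sent.append(("hook", self.pk))


def create_document(title):
    """Explicit style: the side effect belongs to creation only."""
    doc = Document(title)
    doc._store()
    sent.append(("service", doc.pk))
    return doc


def update_document(doc, title):
    doc.title = title
    doc._store()  # no notification on edits, by construction
    return doc
```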

This is all subjective, and I still put some stuff into save methods. But I would choose having create_document and update_document over save with self.pk checks almost any day of the week, purely from a “code aesthetics” perspective. And in practice this separation has prevented classes of issues we used to have.

For me, putting it on the model manager is the same as the “service layer” described elsewhere. The core thing for us is to avoid too many raw calls to the ORM except in the lowest layer.

EDIT: to be clear, despite all of this, we’re all still able to use DRF serializers, automatic model form field derivation, and loads of other goodies to great effect. If you were to look at our code, the main thing you would see is just a quick method override of save and maybe some validation methods, but otherwise our code looks the same as “typical” Django code.

My team’s reading club just discussed @ubernostrum’s articles, and I dug into this thread and a couple of similar ones on this forum. Great stuff, and I appreciate all the perspectives. We lean towards “letting Django be Django”, but are still not sure on where to put complex logic that involves multiple models. I believe @ubernostrum attempted to address this somewhat in his follow-up article with the mention of using architectural patterns like pub/sub, but I guess I’m just not sufficiently versed in those.

One thing that I’d find helpful is an example of a non-trivial project that avoids a dedicated service layer, i.e. by following the suggestions of @ubernostrum, @andrewgodwin, and Tom Christie. It seems there are more examples of projects that follow a service layer style guide.

Hello everyone,
I need suggestions on developing such a complex project. I am following this blog.
I have two separate projects.

  1. User authentication: can be used as a rest API service. It contains a user database, authentication method.

  2. MainProject -> Frontend application: This will utilize the user authentication API.

In the frontend application, I created HTML templates (user sign-in, sign-up, etc.).
When a user registers and logs in, project 1 provides me with a JWT ticket.
My question is: how do I keep track of the user in my frontend application based on this ticket?

Sorry for my bad language!

Hi! Welcome to the forum.

First I’d like to say you’re doing quite well here, and you don’t need to apologize for your language!

Second, I’d like to suggest you open this up as a new topic rather than extending this one. It’s not really related to the previous discussion and may deserve its own thread.

Ken

Thanks, dear, I will create a new topic on this.

Complex logic that involves multiple models is where I start leaning towards an extra layer between views and models.

The main product in our company is a financial data analysis dashboard built with Django. Most of the data is written during the night when we update some asset prices and other relevant data. Our views are basically just reads on the database and calculations on the data. Not a lot of CRUD functionality.

When we need to calculate something that involves querying many tables, I start getting a bit lost in the traditional Django world. How does the “no extra layers” crowd go about organizing this?

We recently decided to write modules called use_cases.py, essentially a very simple layer between Views and Model/Managers/Querysets. Inside a UseCase, some parameter validation occurs, manager methods get called for queries on all relevant models, then some numbers are crunched, and results are returned back to the views (or whatever is calling the UseCase).
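
A simplified sketch of the shape described above (not our actual code; `AverageDailyReturn` and `fetch_prices` are made-up names, and in a real project `fetch_prices` would presumably be a manager or queryset method rather than an injected callable):

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class AverageReturnResult:
    asset: str
    average_daily_return: float


class AverageDailyReturn:
    """Use case: validate input, fetch via injected queries, crunch numbers."""

    def __init__(self, fetch_prices: Callable[[str, date, date], Sequence[float]]):
        # Assumption: the query is injected so the use case has no ORM import.
        self._fetch_prices = fetch_prices

    def execute(self, asset: str, start: date, end: date) -> AverageReturnResult:
        # 1. Parameter validation
        if start >= end:
            raise ValueError("start must be before end")
        # 2. Query through the injected manager-style callable
        prices = self._fetch_prices(asset, start, end)
        if len(prices) < 2:
            raise ValueError("need at least two prices")
        # 3. Number crunching: simple day-over-day returns
        returns = [later / earlier - 1 for earlier, later in zip(prices, prices[1:])]
        # 4. Result handed back to the view (or whatever called the use case)
        return AverageReturnResult(asset, mean(returns))
```

A view would then only build the use case, call `execute()`, and render the result.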

I feel I’m probably missing a lot of Django wisdom/experience here (i.e. how to solve this the “framework” way), but this layer has proved very useful so far. If any more experienced Django folk want to chip in with their two cents here, I’d appreciate it immensely.

PS: @ubernostrum @rtpg I read your articles, and wanted to thank you for taking the time to write them. Really valuable for developers in small companies like myself without huge teams to discuss these issues with. Thanks!
