Phoenix explicitly encourages an internal service-like layer with Contexts. I really like this approach in theory, though I haven’t written anything serious enough with Phoenix to pressure test it. As a side note, I also like how the web/REST interface itself is treated as a separate Context in Phoenix.
At edX, we’ve done some work around having service interfaces declared in an api.py file to give a service layer, but I keep wanting to reach for a better solution. I just read the docs for django-api-domains linked above, and I broadly agree with the high-level concepts of having an explicit service layer, but every time I’ve tried to walk down that path in my own attempts, I end up with something clunky and redundant. Please understand that I think Django is great and I believe that Open edX owes much of its success to the framework. I think it’s very reasonably put together. It’s just that I can’t line up all the pieces in a natural way in my head for this use case.
If I could magically have everything I wanted, our codebase might look like:
- ~10 domains, each having potentially ~10 apps inside (total codebase ~1M lines)
- Each domain has multiple public in-process service interfaces (one per named release), to make upgrade cycles for third parties easier.
- Each domain has some set of data structures, and signals that can be hooked into.
- There exists some straightforward pattern for third parties to extend by associating their own data with something in the domain. For instance, say my custom app wants to be able to add something to a course enrollment, so I can pull that up without going through N+1 queries.
- REST or GraphQL APIs could span multiple domains, and talk to them via their service interface.
- We should leverage existing work as much as possible. I don’t want to introduce something that looks alien to other Django devs or that requires a lot of work to maintain.
- Django Admin might be nice, at least for simple operations, but I could live without it.
My goal would be to consolidate business logic and maintain a smaller, more stable public API for other domains and third party apps. Hiding Django (“the framework is an implementation detail” philosophy) would not be a goal in and of itself. I want to add as little as possible to achieve those goals.
But here I start running into some issues:
Where does validation happen?
We don’t really use Django forms, preferring to use React frontends and DRF for the REST APIs they talk to. Put it in the model’s save()? That’s less than ideal, and it makes it hard to bubble up the exact errors. DRF has some nice validation/serialization facilities, but that’s supposed to be at a layer outside the domain’s service. Make something in the domain service’s public API to do validation that happens to trivially convert to something DRF can use?
Maybe the REST API layer is really a peer to the service layer and not a layer on top of it, and so has access to all the internals? But how would cross-domain APIs work? And would we start to see drift in validation/logic rules between the REST API and service API?
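To make the "validation in the service's public API that converts trivially to DRF" idea concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical (DomainValidationError, EnrollmentRequest, validate_enrollment, the allowed modes, and the deliberately simplified course key check); it uses plain Python so it runs without Django or DRF installed.

```python
from dataclasses import dataclass


class DomainValidationError(Exception):
    """Hypothetical service-layer error carrying per-field messages."""

    def __init__(self, errors):
        super().__init__(errors)
        # Shape: {"field_name": ["message", ...]}
        self.errors = errors


@dataclass(frozen=True)
class EnrollmentRequest:
    # Hypothetical input structure, not an existing Open edX type.
    user_id: int
    course_key: str
    mode: str = "audit"


ALLOWED_MODES = {"audit", "verified"}  # assumed values, for illustration only


def validate_enrollment(request):
    """Collect every problem, then raise once so callers see all errors."""
    errors = {}
    if request.user_id <= 0:
        errors.setdefault("user_id", []).append("Must be a positive integer.")
    # Simplified check; real course key parsing is more involved.
    if not request.course_key.startswith("course-v1:"):
        errors.setdefault("course_key", []).append("Not a valid course key.")
    if request.mode not in ALLOWED_MODES:
        errors.setdefault("mode", []).append(f"Unknown mode: {request.mode!r}.")
    if errors:
        raise DomainValidationError(errors)
```

Because the error payload is a dict of field names to lists of messages, a DRF view could catch DomainValidationError and re-raise rest_framework.exceptions.ValidationError(exc.errors) without reshaping anything. That keeps the rules in one place, though it does not answer the cross-domain question by itself.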
Hiding the models
We don’t want to pass models around through the public service interface because we want more freedom to change how we store the data behind the service, and we also don’t want to pass around a giant API window that allows people to arbitrarily query the database in completely unpredictable ways.
So maybe data classes, with translation happening at the service layer? But again, DRF has all this nice stuff that expects to work with models. I mean, basically everything in the Django ecosystem has nice things that work with models directly because they expect to be the interface between models and the browser making requests. So I can see how it would work, but it feels like you have to leave a lot of conveniences behind to get there.
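A minimal sketch of the data class approach, assuming a made-up enrollment record. EnrollmentData and to_enrollment_data are hypothetical names, and a SimpleNamespace stands in for the Django model instance so the example runs without a database.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class EnrollmentData:
    """Hypothetical public structure returned by the service layer."""
    user_id: int
    course_key: str
    mode: str
    is_active: bool


def to_enrollment_data(row):
    """Translate a model-like object into the public, immutable structure.

    Only the fields the service chooses to expose are copied, so storage
    details (internal ids, audit columns, and so on) stay behind the service.
    """
    return EnrollmentData(
        user_id=row.user_id,
        course_key=str(row.course_key),
        mode=row.mode,
        is_active=bool(row.is_active),
    )


# Stand-in for a model instance; the extra attributes are not exposed.
row = SimpleNamespace(
    id=42, user_id=7, course_key="course-v1:edX+Demo+2024",
    mode="audit", is_active=1, internal_notes="not part of the public API",
)
enrollment = to_enrollment_data(row)
```

The cost is visible even at this size: every exposed field is written out twice, and nothing here plugs into DRF's model-aware serializers for free.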
Efficient Querying
N+1 queries are one of the banes of my existence. But avoiding them means some level of model relationships. Possibly create a limited API here that allows you to key off of the primary key of another Domain’s model? This sets off all kinds of coupling alarm bells, but there are both performance and data integrity arguments for having something at this layer. This could probably be limited to only one or two models per domain, for the most common things that are hung off them (e.g. courses, enrollments, students).
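One shape the limited API could take is a bulk lookup keyed by the other domain's primary keys, so a caller fetches extension data for a whole page of enrollments in one call rather than one call per row. This is only a sketch: EnrollmentNote, _NOTES, and notes_for_enrollments are hypothetical, and an in-memory list stands in for the third-party app's table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrollmentNote:
    """Hypothetical third-party data hung off an enrollment's primary key."""
    enrollment_id: int
    text: str


# In-memory stand-in for the extension app's table.
_NOTES = [
    EnrollmentNote(1, "needs extended deadlines"),
    EnrollmentNote(1, "contacted support"),
    EnrollmentNote(3, "scholarship recipient"),
]


def notes_for_enrollments(enrollment_ids):
    """Return {enrollment_id: [notes]} for the whole batch in one pass.

    With Django, the loop below would presumably become a single query
    using an enrollment_id__in filter instead of one query per enrollment.
    """
    wanted = set(enrollment_ids)
    # Pre-fill so ids with no notes still appear, with an empty list.
    grouped = {enrollment_id: [] for enrollment_id in wanted}
    for note in _NOTES:
        if note.enrollment_id in wanted:
            grouped[note.enrollment_id].append(note)
    return grouped


notes = notes_for_enrollments([1, 2])
```

The caller only ever handles primary keys and plain data, so the coupling is limited to the identity of the enrollment rather than its model class.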
I’m working on a little side project to experiment with these concepts. I’ll definitely be looking at django-api-domains and dry-python while I’m exploring ideas. Thanks folks!