DEP0009/ORM implementation plan

I think every language with language-level support for coroutines like this is struggling with this problem (or, in JS’s case, just punting on it by saying “everything’s async now”), so we’re probably at least on par with the status quo here.

The only ecosystem I can think of that doesn’t have this problem is PureScript, because you can abstract over the effect type (Aff/Eff…), which is a bit of a hurdle for Django, to say the least!

Having said that, my feeling is that the problem gets better, not worse, over time. We have all this existing code that we have to deal with, but new code can be written with effectful separation in mind.

At the end of the day if we are saying that .get and .aget are different things, and that .aget behaves differently depending on whether it’s in a new_connection context manager or not… saying “this needs to be 3 tests” feels like the default stance.

For some of this I’ve found it possible to just write one test and then parametrize it over “wrap the test in new_connection”.
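To make that concrete, here is a rough sketch of the “one test, parametrized over the wrapper” idea. Nothing here is the branch’s real code: `new_connection` is a do-nothing stand-in for the context manager discussed above, and `aget_stub`, `no_wrapper`, `check_aget` and `run_parametrized` are names I made up for illustration. In a real test suite the loop would more likely be `subTest` or a pytest parameter.

```python
import asyncio
import contextlib


@contextlib.asynccontextmanager
async def new_connection():
    # Hypothetical stand-in: the real context manager would open a
    # separate database connection for the enclosed block.
    yield


@contextlib.asynccontextmanager
async def no_wrapper():
    # The "not inside new_connection" case: do nothing.
    yield


async def aget_stub(pk):
    # Placeholder for an async ORM call such as Model.objects.aget(pk=pk).
    await asyncio.sleep(0)
    return {"pk": pk}


async def check_aget(wrapper):
    # The test body is written once; only the surrounding wrapper varies.
    async with wrapper():
        obj = await aget_stub(1)
    return obj["pk"] == 1


def run_parametrized():
    wrappers = {"default_connection": no_wrapper, "new_connection": new_connection}
    return {name: asyncio.run(check_aget(w)) for name, w in wrappers.items()}
```

The sync `.get` variant would still need its own test, but this collapses the two async variants into one body.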

But if we can extract parameter prep steps in particular, then a lot of the early-return stuff probably doesn’t need to be covered with 3 different tests! We can write a unit test on the validation, and have our one effectful test, and be done with it.
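As a sketch of what extracting the prep step could look like (assumptions throughout: `prepare_lookup`, `FAKE_TABLE`, and the toy `get`/`aget` below are invented for illustration and are not Django’s internals or its actual validation rules):

```python
import asyncio

# Toy in-memory "table" standing in for the database.
FAKE_TABLE = {1: "alice", 2: "bob"}


def prepare_lookup(pk):
    # Pure parameter prep: no I/O, so a single plain unit test covers
    # every early-return / error path for both the sync and async callers.
    if not isinstance(pk, int):
        raise TypeError("pk must be an int")
    if pk < 1:
        raise ValueError("pk must be positive")
    return pk


def get(pk):
    # Sync variant: shared prep, then the effectful part.
    return FAKE_TABLE[prepare_lookup(pk)]


async def aget(pk):
    # Async variant: same shared prep, then the awaited effectful part.
    key = prepare_lookup(pk)
    await asyncio.sleep(0)  # stands in for awaiting the database
    return FAKE_TABLE[key]
```

With this shape, the validation gets one unit test against `prepare_lookup`, and each of `get` / `aget` only needs a single happy-path effectful test.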

So here’s my plan: I have this changeset that aims to get very close to 100% coverage. Once I have that, I will try to figure out how to chop it down. There are a couple of refactorings of internals that could make it more straightforward for us to have coverage. Bit of elbow grease, but that’s it.

And once we have some foundation, we can move forward. We’ll have a one-time hit in messiness, but I don’t believe it will accumulate over time.


PS If you’re interested in the current state of things, this branch includes my work so far. There’s stuff in there that is for my debugging purposes (and also some not-yet-merged things like this PR and of course the async cursor PR), but it has successfully let the following take 1 second instead of 5 seconds (with 5 clients in this artificial test):

import asyncio

from django.db.models.expressions import RawSQL

# Client and Invoice are the app's models; new_connection comes from the branch.


async def overall_summary():
    # look up all clients
    # for each client fetch all the invoices (with an extra bit to make it nice and slow)
    clients = [client async for client in Client.objects.all()]
    results = {}
    async with asyncio.TaskGroup() as tg:
        for client in clients:
            results[client] = tg.create_task(get_invoice_summary(client))

    # the TaskGroup has exited, so every task has finished
    results = {str(k): v.result() for k, v in results.items()}
    return results


async def get_invoice_summary(client: Client):
    # each task gets its own connection, so the queries can run concurrently
    async with new_connection():
        my_invs = [
            inv
            async for inv in Invoice.objects.filter(client=client).annotate(
                other=RawSQL("select pg_sleep(1)", [])
            )
        ]
        return {"count": len(my_invs)}
