Hi folks,
I shared a post a while back about using Django and Jupyter together, because in many cases the Python methods defined on a model can be really helpful for doing analysis of the data stored in your database. It’s linked below:
However, I found Jupyter could be a bit of a pain to work with sometimes: while it’s very powerful, there’s a fair amount of UI to learn and navigate, and sessions are saved as an ipynb notebook, a format that by default isn’t very git friendly and that also stores the results of your queries. You don’t always want this.
Enter Marimo
I recently came across the open source project Marimo - if you have ever used Observable with their nifty “reactive notebook” idea, I think you’ll be fairly comfortable with Marimo too, as you can think of it as something like the Pythonic answer to Observable.
With reactive notebooks, it’s much harder to end up with mysterious hidden state, which is the kind of thing that makes reproducing a notebook a real pain. There are a number of common issues people have with traditional notebooks, and they’re outlined on their docs page.
Anyway, here’s how the makers of Marimo describe it:
Highlights.
reactive: run a cell, and marimo automatically updates all affected cells and outputs
interactive: bind sliders, tables, plots, and more to Python — no callbacks required
reproducible: no hidden state, deterministic execution order
deployable: executable as a script, deployable as an app
developer-friendly: git-friendly .py file format, GitHub Copilot, fast autocomplete, code formatting, and more
marimo was built from the ground up to solve many well-known problems associated with traditional notebooks.
Source: marimo docs.
This looks like it might be a nice tool for doing data exploration work with Django, because you get lots of nice widgets for inspecting data, like you might with Jupyter, and the result of a saved exploratory session is just some Python that can be run unattended, like any other script.
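For a sense of what that means, a saved marimo notebook has roughly the following shape - the exact scaffolding varies between marimo versions, but it’s plain Python, so it diffs nicely in git:

import marimo

app = marimo.App()

@app.cell
def _():
    # Each cell becomes a small function; marimo tracks the names a cell
    # defines and uses to build the dependency graph between cells.
    import marimo as mo
    return (mo,)

if __name__ == "__main__":
    app.run()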
It feels like it might be an interesting middle ground between messing around in a django shell (or shell_plus) session, and writing a full blown django management command.
For example, assuming you’ve added marimo to your dependencies, you start a notebook session by running:
marimo edit
That’ll give you the notebook UI, and you can get access to your project by running something like the following (swapping in your own settings module for myproject.settings):
import marimo as mo
import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")  # use your project's settings module here
django.setup()
From there you can interact with the ORM from a notebook session, as if you were making queries using the django shell.
The advantage is that you get all the nice widgets and rendering of data provided by Marimo (their page shows loads of them). So you can query like so, and, like a notebook, the last value in a cell will be displayed:
from apps.my_app.models import SomeModel
SomeModel.objects.all()
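And if you want something richer than the default queryset output, you can hand the rows to one of Marimo’s widgets - here’s a minimal sketch using the same hypothetical SomeModel as above:

# .values() gives a list of dicts, which marimo's interactive table widget accepts
mo.ui.table(list(SomeModel.objects.values()))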
Working with subfolders
One thing I’m appreciating is that, using Marimo, I can keep all kinds of saved notebooks and analysis files in a separate folder called analysis, which I can put under version control and push to a separate remote repository from the main project repo.
From the project root, I just run the following, and I can interact with my data using all the methods and models defined in my project codebase.
marimo edit ./analysis/my-analysis-notebook.py
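If the notebook ever struggles to import your project code - say, because it was opened from inside the analysis folder rather than the project root - a defensive first cell along these lines can help. The PROJECT_ROOT path and myproject.settings are placeholders for your own layout:

import os
import sys
import django

PROJECT_ROOT = "/path/to/your/django/project"  # placeholder: the directory containing manage.py
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")  # placeholder settings module

# Make sure the project root is importable before calling django.setup()
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)
django.setup()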
My only extra dependency is the marimo library, which you’d only need to include in a development dependencies file. Considering the functionality it offers, it doesn’t rely on that many external dependencies in its own right - useful to know.
Anyway, with this approach I’m able to have an open source django project and a private analysis repo, and I feel much less worried about the two mixing, or about accidentally leaking data into the open repo, for example.
Has anyone else used it and had either good or bad experiences with this yet?
I’m really impressed, and I’d find it useful to compare notes with others.