Have any of you come across helpful open source Django applications that come with openly licensed sample data for teaching?
I’m looking for a sample project to demonstrate the Django ORM with something approaching realistic data. In other ecosystems I might reach for something like Northwind for Microsoft Access, but I’d love to have some other options available too if possible.
What’s Northwind?
Django Northwind is a version of the Microsoft Northwind sample database.
The Northwind database is an excellent tutorial schema for a small-business ERP, with categories, customers, region, territories, employees, shippers, suppliers, products and orders.
The quote is from a Django project where someone seems to have adapted the Northwind dataset to Django, and it would be a nice example dataset to use for teaching.
Something more fun?
It is a bit dry though. If you have come across any similar projects that already lend themselves well to messing around in the ORM to show how it works, do please chime in.
I’ve found this example, which is more about music, and presumably would leave you with a sample Postgres database that lends itself well to demonstrating handy queries in the Django ORM:
This project creates an ETL pipeline that makes song data available for the analytics team at the startup Sparkify to understand what songs users are listening to. Currently, they don’t have an easy way to query their data, which resides in a directory of JSON logs on user activity on the app, as well as a directory with JSON metadata on the songs in their app. This project creates a Postgres database with tables designed to optimize queries on song play analysis.
BTW - I did consider using Faker, or even taking a real database snapshot and using a synthetic data generator to create a sort of anonymised version for training and development.
I assumed a degree of prior work had already gone into the data modelling, which I was hoping to build upon. I’m also aware there are tools that can take an existing dataset and create a pseudonymised synthetic version, but I have zero experience with them so far.
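For anyone wondering what the pseudonymising idea looks like in practice, here is a rough sketch using only the Python standard library. It is not Faker and not one of the dedicated tools mentioned above, just an illustration of the general approach: identifying fields are swapped for stable fake values, and everything else is left alone. The `customers` rows, the field names, the name lists and `SECRET_KEY` are all made up for the example.

```python
import hashlib
import hmac
import random

# Hypothetical secret; a real one should live outside version control.
SECRET_KEY = b"replace-me"

# Small made-up pools of replacement names (assumption, for illustration only).
FIRST_NAMES = ["Alex", "Sam", "Jo", "Robin", "Charlie", "Morgan"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson", "Evans"]


def pseudonymise_customer(row, key=SECRET_KEY):
    """Return a copy of `row` with name and email replaced by stable fakes."""
    # Keyed hash of the original email: the same input always maps to the
    # same pseudonym, but the original is hard to recover without the key.
    digest = hmac.new(key, row["email"].encode("utf-8"), hashlib.sha256).hexdigest()
    # Seed a private random generator with the digest so the fake name is
    # deterministic for a given original record.
    rng = random.Random(digest)
    fake = dict(row)
    fake["name"] = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
    fake["email"] = f"user-{digest[:12]}@example.com"
    return fake


# Invented sample rows standing in for a real database snapshot.
customers = [
    {"id": 1, "name": "Real Person", "email": "real.person@company.test", "city": "Leeds"},
    {"id": 2, "name": "Other Person", "email": "other.person@company.test", "city": "Bristol"},
]

pseudonymised = [pseudonymise_customer(c) for c in customers]
```

Because the output is deterministic, foreign keys and repeated references to the same customer stay consistent across tables. Non-identifying columns such as `city` pass through unchanged, so on a real snapshot you would still need to check whether they could identify someone in combination.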
@tom has set up django-admin-demo for our accessibility work, with a readily available DB directly in git. The data is from the Spotify API, so the licensing isn’t super clear to me, but I assume it’s acceptable enough for reuse. There are multiple models with varying relationships, which I think would be a reasonable place to demo the ORM.
Wagtail’s bakerydemo demo site has 100 or so model instances in there. They’re not super complex but still a decent demo I think. Content is either from Wikimedia Commons, or our contributors, or a few other public domain sources.
In both cases I have static copies of the content if you want to take a look: