I am about to embark on my first Django app, and I wish to use a CSV import to get data into the database.
Having watched a couple of videos on models, I’m still not sure how to deal with having two header rows.
Do I create a process model with name, description and freq, and then another table for Die and associated sub items and then another table for stuff? Bearing in mind that each row is unique by ID.
Yes, I realise this is more DB than Django specific, but I am sure someone else has come up against this.
All help gratefully received.
Thanks
Could you tell us a bit more about your business rules? You could just have a single denormalized model that would reflect exactly what you have in the CSV, but I doubt that’s what you want.
For example, what are the rules around the die columns? Does each row need to have a unique set? Would you ever add another color die such as yellow?
The same questions for “Process” and “Stuff” sub-columns. What are the expectations/rules there and will they change? Or could the requirements change easily enough to justify making your data model robust enough to support that?
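On the two header rows themselves, here is a minimal sketch (standard library only) of one way to flatten a group row and a field row into single column names before deciding on models. The sample columns ("Red", "Blue", "Notes") and the dotted naming scheme are assumptions for illustration, not taken from the real sheet.

```python
import csv
import io


def flatten_headers(group_row, field_row):
    """Combine a group header row and a field header row into one list of names."""
    names = []
    current_group = ""
    for group, field in zip(group_row, field_row):
        group, field = group.strip(), field.strip()
        # A blank group cell means the previous group still applies (merged cell).
        if group:
            current_group = group
        if current_group and field:
            names.append(f"{current_group}.{field}")
        else:
            names.append(field or current_group)
    return names


def read_two_header_csv(text):
    """Read CSV text whose first two rows are headers; return a list of dicts."""
    reader = csv.reader(io.StringIO(text))
    columns = flatten_headers(next(reader), next(reader))
    return [dict(zip(columns, row)) for row in reader]


# Hypothetical sample data shaped like the sheet described in the question.
SAMPLE = """ID,Process,,,Die,,Stuff
,Name,Description,Freq,Red,Blue,Notes
1,Cutting,Cut to size,Daily,2,5,ok
"""

rows = read_two_header_csv(SAMPLE)
```

Each row then becomes a flat dict keyed by names such as "Process.Name" or "Die.Blue", which is essentially the single denormalized shape mentioned above.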
Hi Tim,
Thanks for replying. As always, it’s horrid when you don’t show the complete data, so here is the input CSV sheet, all marked up with dummy data.
Basically it is a risk and controls assessment sheet. The users fill in the sheet, and I want to analyse the data and display it in Django, possibly with some drop-downs to do modelling or canned queries. I have not quite decided yet.
So first 2 columns under process are for description and title.
The Risk ID would be my primary key. The risk section deals with the initial risk quantification and results in the user entering an Impact (I) and Likelihood (L) rating.
Controls is how we set up the risk mitigation, and again we end up with a net risk that should hopefully be lower than the gross risk. Again, this is a net impact and likelihood.
The issues and actions section is for us to work on the risk or put plans in place.
My initial thoughts were that each line item can be referenced by its Risk ID or by its Control ID, which should be associated with the Risk ID. One risk could have many controls. For each failing control there should be an issue action. I will insert an action_id to track this.
So that’s my real data structure. I wanted to know how I would model that in Django.
Thanks for your help.
Tom
Yarp, that’s quite a file. So generally here’s what I’d do. Avoid making the primary keys of your models match the IDs in the file, but definitely do make the model fields that match the file’s ID columns unique fields.
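To illustrate the point about keys, here is a rough sketch using the standard library's sqlite3 module rather than Django, so it runs on its own. The table has its own auto-generated id, while the file's Risk ID is stored in a unique code column and used only to look rows up on import. The table layout, the import_risk helper and the "R-001" format are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE risk ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # surrogate key owned by the database
    " code TEXT NOT NULL UNIQUE,"             # the Risk ID from the file
    " title TEXT NOT NULL)"
)


def import_risk(code, title):
    """Insert a risk, or update it if the code was imported before. Returns the row id."""
    row = conn.execute("SELECT id FROM risk WHERE code = ?", (code,)).fetchone()
    if row is None:
        cursor = conn.execute(
            "INSERT INTO risk (code, title) VALUES (?, ?)", (code, title)
        )
        return cursor.lastrowid
    conn.execute("UPDATE risk SET title = ? WHERE id = ?", (title, row[0]))
    return row[0]
```

Re-importing the same sheet then updates existing rows instead of duplicating them. In Django the same idea is usually written with the manager's update_or_create, passing the unique code as the lookup and the remaining columns in defaults.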
I’ll try to describe my process for breaking down a data model from a CSV file. Others may have different or better processes and ideas. It all centers on identifying the denormalized relationships. You have different levels of these, so let’s take the Risk section. You have the following columns:
- Risk ID
- Risk Owner
- Risk (Title)
- Risk Description
- Risk Category 1
- Risk Category 2
- Risk Category
This could be one model, but should likely be three. I’m going to ignore the one-model solution. If you think that’s what you need and are confused, please let me know.
from django.db import models


class Department(models.Model):
    # Maps to Control Owner, Risk Owner, etc. (max_length is an assumption)
    name = models.CharField(max_length=255, unique=True)


class RiskCategory(models.Model):
    name = models.CharField(max_length=255, unique=True)


class Risk(models.Model):
    code = models.CharField(unique=True, ...)  # Maps to Risk ID
    # on_delete is required; PROTECT is an assumption, pick what fits your rules
    owner = models.ForeignKey(Department, on_delete=models.PROTECT)
    title = models.CharField(...)
    description = models.TextField(...)
    categories = models.ManyToManyField(RiskCategory, ...)
If your categories need some hierarchy structure, things get a bit trickier. You’d need to answer how many levels of hierarchy you need to support. Infinite is a viable answer as well. Regardless though you would need to know how that’s represented within this file and decide how to handle situations where a category has conflicting parents (or if that’s also a valid case).
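As a rough, standard-library sketch of the conflicting-parents check, assuming each row carries its category levels from broadest to most specific, something like this builds a child-to-parent map and fails loudly when a category turns up under two different parents. The function name and the sample category names are made up for illustration.

```python
def build_category_parents(rows):
    """Map each category name to its parent name (None for a top-level category).

    Each row is a sequence of category names, broadest first. Raises ValueError
    if the same category appears under two different parents.
    """
    parents = {}
    for row in rows:
        parent = None
        for name in row:
            if not name:
                break  # this row has no deeper levels
            if name in parents and parents[name] != parent:
                raise ValueError(
                    f"{name!r} appears under both {parents[name]!r} and {parent!r}"
                )
            parents[name] = parent
            parent = name
    return parents


# Hypothetical category values, three levels deep.
parents = build_category_parents([
    ("Financial", "Credit", "Default"),
    ("Financial", "Market", "FX"),
])
```

If that check passes for your file, the same map translates fairly directly into a RiskCategory with a nullable ForeignKey to "self" as its parent. If conflicts are actually valid in your data, you would need something other than a single parent per category.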
I’m going to stop here and let you try to extrapolate out from here. If you have more questions or want me to clarify something, please let me know.
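As a starting point for that extrapolation, here is a plain-Python sketch (dataclasses, not Django models) of the relationships described earlier in the thread: one risk has many controls, and every failing control should carry an action. The column names risk_id, control_id, failing and action_id, and the "yes" marker for a failing control, are assumptions about how the sheet might be laid out.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    code: str              # maps to Control ID
    failing: bool
    action_code: str = ""  # maps to the planned action_id; empty if none


@dataclass
class Risk:
    code: str              # maps to Risk ID
    controls: dict = field(default_factory=dict)  # Control ID -> Control


def normalize(rows):
    """Group flat rows (one per control) into risks that each own their controls."""
    risks = {}
    for row in rows:
        risk = risks.setdefault(row["risk_id"], Risk(code=row["risk_id"]))
        control = Control(
            code=row["control_id"],
            failing=row["failing"] == "yes",
            action_code=row.get("action_id", ""),
        )
        # Rule from the thread: each failing control should have an issue action.
        if control.failing and not control.action_code:
            raise ValueError(f"failing control {control.code!r} has no action")
        risk.controls[control.code] = control
    return risks
```

In Django terms each dataclass would become a model, with Control holding a ForeignKey to Risk and the action holding one to Control.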
Thanks Tim,
That is very illuminating.
The only hierarchy is the Risk Categories.
The top level is a group of categories, the second level is another list of subcategories, and finally list 3 is our categorization… it all cascades down.