I can’t find any conclusive information on this, so hopefully someone more knowledgeable can help.
I need 6 tables, each containing anywhere from 5 to 100 records, with around 10-20 fields in each.
This data would be used frequently as part of a random generator, and I want to try to make it run as optimally as possible from the get-go.
That said, is it worth putting this data into a database at all, or should I just write that off in favor of hardcoding some dictionaries to handle it?
My first question would be to ask why you’re concerned about it running “optimally”. Or, to be a bit more specific, what do you mean by that?
The reason that I’m asking is that whether the data lives in a dict or in database tables, it’s still going to be turned into Python objects before anything is done with it - and that only needs to happen once, when you’re initializing your process. Even if you’re generating millions of rows of data with your generator, you’re not loading the lookup data millions of times. As a result, it’s not going to make a practical difference to your process which option you choose.
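To make that concrete, here’s a minimal sketch of both options, with a hypothetical `Color` model standing in for one of your tables - note that either way you end up with ordinary Python objects built once at startup:

```python
import random

# Option 1: hardcoded lookup data - already Python objects at import time.
COLORS = [
    {"name": "red", "weight": 3},
    {"name": "blue", "weight": 1},
]

# Option 2: database-backed lookup data, loaded once when the process starts.
def load_colors():
    from myapp.models import Color  # hypothetical app and model
    return list(Color.objects.all())  # one query; rows become Python objects

colors = load_colors()
pick = random.choice(colors)  # everything after this point is pure Python
```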
So this generator will take in 5 objects, do some work with them to make one object, and end up saving that to a database.
So perhaps what I mean isn’t “optimally”; perhaps it’s more this: is running 5 select queries and an insert against the database each time this code runs any better or worse than using 5 dicts stored in Python from the get-go?
I get that this isn’t a lot of data to manipulate, but I wasn’t sure whether hitting the database that often was good design.
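To illustrate the database version (the model names here are made up), each run would look something like this:

```python
from myapp.models import Adjective, Noun, Result  # hypothetical models

def generate_one():
    # One select per lookup table; order_by("?") asks the database
    # for a random row. Imagine three more lookups like these two.
    adjective = Adjective.objects.order_by("?").first()
    noun = Noun.objects.order_by("?").first()

    # Combine the pieces and save the new object - one insert.
    return Result.objects.create(name=f"{adjective.word} {noun.word}")
```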
It’s not necessarily a good or bad design. There are pros and cons either way, and the “best” decision is generally going to depend upon other environmental or contextual conditions.
If you’re doing this once / hour, it’s not going to matter. If you’re doing this once / minute, it’s most likely not going to matter. If you’re doing this once / second, it may matter. If you’re doing it 10 times / second, it will matter.
Keep in mind that your database will cache recently used data, and if the tables are small enough, they’ll probably stay memory-resident the entire time your Django process is running. You might be hitting the database, but you won’t be generating any disk I/O. However, if you’re doing this frequently enough, you’ll still want to avoid the round trips between Django and the database engine - in which case you’d likely want to cache the data within Django.
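If you do land in that last bucket, caching within Django can be as simple as memoizing the load - a rough sketch, again with a hypothetical model:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_colors():
    from myapp.models import Color  # hypothetical model
    # The query runs once per process; every later call returns
    # the cached tuple without touching the database.
    return tuple(Color.objects.all())
```

After the first call, each generator run would only hit the database for the final insert.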
So again, the decision shouldn’t be one of “optimization”, but of usage, management, and maintenance. The big advantage of putting the data into your database is that it can be reviewed and edited without having access to the source code or redeploying an app. That’s a far more critical item than any milliseconds you might save when your app starts running.