Hi,
I’ve been working on an app for a while now and have many tables and FKs. Over time the app has changed, and I’m now looking to rewrite it, but using the data from the existing application on a new database/server.
I’m using PostgreSQL, and using pgAdmin I’ve backed up and restored the database for my new app. But because the new app has a different name, all the tables are prefixed with the old app name, as are the constraints/indexes that reference the old table prefix.
What is the best method to handle this?
I also tried some third-party tools like DBeaver, which almost works, but I end up with issues with IDs in columns where there have been deletions and gaps in the IDs. For example, I had deleted records 1 - 700 in the original database; now that the data is restored, new rows get added using IDs 1 - 700 again, but at 701 I get an error because that ID is already in use by the original data.
Any help would be great.
Thanks
Tommy
When we’re moving or converting data in situations like this, we use the Django `dumpdata` and `loaddata` commands, along with specifying natural keys for all models that have relationships with, or are referenced by, other models.
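A minimal sketch of what the natural-key side of that looks like, using made-up `Sensor` / `Reading` models (the names and fields are placeholders, not anything from your schema):

```python
# models.py -- hypothetical models just to illustrate the natural-key pattern.
from django.db import models


class SensorManager(models.Manager):
    # loaddata uses this to locate an existing row from its natural key.
    def get_by_natural_key(self, name):
        return self.get(name=name)


class Sensor(models.Model):
    name = models.CharField(max_length=100, unique=True)

    objects = SensorManager()

    # Used by dumpdata --natural-foreign / --natural-primary so that
    # serialized references don't depend on the integer pk.
    def natural_key(self):
        return (self.name,)


class Reading(models.Model):
    # With --natural-foreign, this FK is dumped as the sensor's name rather
    # than its id, so pk renumbering on the new database doesn't matter.
    sensor = models.ForeignKey(Sensor, on_delete=models.CASCADE)
    taken_at = models.DateTimeField()
    value = models.FloatField()
```

The dump and load then become something along the lines of `manage.py dumpdata oldapp --natural-foreign --natural-primary --format yaml -o data.yaml` on the old project and `manage.py loaddata data.yaml` on the new one (`oldapp` standing in for your app label).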
Then, if the model names have changed, you can do a global search-and-replace to replace all instances of the original name with the new name.
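If the dump is too large to comfortably open in an editor, the replace can be scripted. A rough sketch, where the `oldapp`/`newapp` labels and file names are placeholders:

```python
# rename_fixture.py -- naive global replace of the app label in a dump file.
# Usage: python rename_fixture.py data.yaml renamed.yaml
# "oldapp" and "newapp" are placeholders for your real app labels.
import sys

OLD_LABEL = "oldapp."
NEW_LABEL = "newapp."

src, dst = sys.argv[1], sys.argv[2]

with open(src, encoding="utf-8") as f:
    text = f.read()

# Model references in the fixture look like "oldapp.sensor", so replacing the
# app-label prefix is usually all that's needed; check the output if any field
# data could legitimately contain the same string.
with open(dst, "w", encoding="utf-8") as f:
    f.write(text.replace(OLD_LABEL, NEW_LABEL))
```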
Thank you, Ken.
One of my tables has over 13 million rows, which is probably small in comparison to many Django apps, but will dumpdata work for this? It looks like it outputs to JSON?
JSON is one of the output formats. It’ll also output YAML. (We prefer YAML because it comes out more readable initially; we need to run a JSON formatter on JSON dumps.) I think it can also emit XML, but really, XML? (Gag)
I see no reason why it wouldn’t be able to dump a table that size. It doesn’t take much to try it and see.
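For a first test you could dump just that one model and time it. A sketch using `call_command` from the Django shell, where `oldapp.Reading` is a placeholder for the app.Model behind your big table:

```python
# Run inside the old project's Django shell (python manage.py shell).
# "oldapp.Reading" is a placeholder for the app.Model behind the big table.
from django.core.management import call_command

call_command(
    "dumpdata",
    "oldapp.Reading",               # just the one large model
    format="yaml",                  # needs PyYAML installed; "json" works too
    use_natural_foreign_keys=True,  # same as --natural-foreign
    use_natural_primary_keys=True,  # same as --natural-primary
    output="reading.yaml",
)
```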
OK, this takes a fair bit of time to run. Do you know if there is a way to capture the deltas?
I have data being captured every 15 minutes. There’s probably a better way to handle this, but for now, if I run the initial dump, is there a way to capture the data that gets missed in the meantime?
Intrinsically within or by the database? No, I don’t know of any.
If the data being captured has some time-based attribute from which you can tell what the last record captured was, then you might be able to do something along those lines.
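A sketch of that idea, assuming a `taken_at` timestamp on a made-up `Reading` model and a cutoff equal to whenever your initial dump finished:

```python
# dump_delta.py -- serialize only the rows captured after the initial dump.
# Run from the old project's Django shell; Reading and its taken_at field
# are assumptions, so substitute your own model and field names.
from datetime import datetime, timezone

from django.core import serializers
from oldapp.models import Reading  # placeholder import path

# Whenever your initial dumpdata run finished (placeholder value).
CUTOFF = datetime(2024, 1, 1, tzinfo=timezone.utc)

newer = Reading.objects.filter(taken_at__gt=CUTOFF)

with open("reading_delta.yaml", "w") as out:
    # Produces the same fixture format as dumpdata, so loaddata on the
    # new project can apply just this delta.
    serializers.serialize(
        "yaml",
        newer,
        stream=out,
        use_natural_foreign_keys=True,
        use_natural_primary_keys=True,
    )
```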
Another possibility is to clone the database during a window where no data is being captured, and do your extract from that. Then repeat the process later and run diffs between the clones.
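If that diff ends up being run between two serialized dumps of the clones rather than inside the database itself, one rough way to pull out just the new rows (this assumes JSON dumps with pks kept, and the file names are placeholders):

```python
# diff_dumps.py -- rows present in the newer dump but not in the older one.
# Usage: python diff_dumps.py old.json new.json delta.json
# Only catches inserts; updated or deleted rows need different handling.
import json
import sys

old_path, new_path, out_path = sys.argv[1], sys.argv[2], sys.argv[3]

with open(old_path) as f:
    old_keys = {(obj["model"], obj["pk"]) for obj in json.load(f)}

with open(new_path) as f:
    new_objs = json.load(f)

delta = [obj for obj in new_objs if (obj["model"], obj["pk"]) not in old_keys]

# The result is itself a valid fixture, so loaddata can apply just the delta.
with open(out_path, "w") as f:
    json.dump(delta, f, indent=2)
```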
But I’m not aware of any easy way. Likely, every process you come up with at this point is going to require some “down window” when you do the cutover, and you’ll end up with one or more 15-minute capture windows unavailable until the cutover is complete.