Hi,
I have to change my db backend from PostgreSQL to MariaDB. My procedure was:
1. Dump the data from the existing PostgreSQL db:
a) manage.py dumpdata > outfile.json (filesize of outfile.json > 5 GB)
2. Change settings.py to the MariaDB db backend:
a) manage.py migrate --run-syncdb
3. Exclude the contenttypes data with:
a) manage.py shell
b) from django.contrib.contenttypes.models import ContentType
c) ContentType.objects.all().delete()
d) quit()
4. manage.py loaddata outfile.json
I got a MemoryError, so my idea was to increase the RAM of the
virtual machine from 4 GB to 16 GB and run the command
again. The process for loading data into the new MariaDB backend
started correctly, and everything seemed ok.
After 4 - 5 minutes the loaddata process died!?!?
Maybe the json file is too large??
Is there another way to change the db backend from PostgreSQL to MariaDB?
You can include or exclude specific apps and tables from your dumpdata command. You could do this dump in multiple steps, selecting specific tables or apps to dump in each step.
This can let you reduce the size of the individual output files, making them easier to handle. (Yes, you would need to ensure you handle the sequence of dependencies correctly.)
Of course, how much hassle this becomes depends upon whether it’s 1 GB of data in 20 tables or 1 GB of data in 200 tables.
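For example, dumpdata accepts app labels, individual models, and a repeatable --exclude option. (The "shop" app and "Product" model here are just placeholder names, not anything from your project.)

manage.py dumpdata shop > shop.json                  # one app at a time
manage.py dumpdata shop.Product > shop_product.json  # a single model
manage.py dumpdata --exclude contenttypes --exclude auth.permission > rest.json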
The ContentType table isn’t the only Django-managed table that you may need to be concerned about. It might also be necessary to empty out the User, Group, and Permission tables.
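If it comes to that, the same pattern as your ContentType deletion works - a sketch, assuming the default django.contrib.auth User model:

manage.py shell
>>> from django.contrib.auth.models import User, Group, Permission
>>> Permission.objects.all().delete()
>>> Group.objects.all().delete()
>>> User.objects.all().delete()
>>> quit()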
BUT, were I in a position to need to do this, this is not the method I would take. Depending upon just how complex the migration would be, I would try one of a couple different methods:
Outside of Django - Use the pg_dump command to dump the database as a set of SQL INSERT statements, and run that as an import script on the MariaDB system.
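A rough sketch of that route (the user and database names are placeholders, and the generated SQL will very likely need some hand-editing for MariaDB syntax differences before it loads cleanly):

pg_dump --data-only --column-inserts -U youruser yourdb > data.sql
# create the schema on the MariaDB side first (manage.py migrate), then:
mysql -u youruser -p yourdb < data.sql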
Thanks for the really good answer and suggestions!
First I tried the described method (dumpdata - change settings.py - loaddata)
again with 32 GB memory on the virtual machine. Nothing went wrong, but
the process took 6 hours … 12, 24 … 36 hours … and then, nearing completion,
an IntegrityError appeared and killed the manage.py process.
Now I’ll give the next method a chance. I will dump the data for each table (60 tables) to
its own dump file, look at the sequence of dependencies, and
load the data step by step into MariaDB.
But first I made a sketch of the dependencies between the tables …
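The plan, roughly (myapp.Author and myapp.Book are just placeholder names, where Book has a ForeignKey to Author):

# on the PostgreSQL side: one fixture per model
manage.py dumpdata myapp.Author > author.json
manage.py dumpdata myapp.Book > book.json

# on the MariaDB side: load parents before children
manage.py loaddata author.json
manage.py loaddata book.json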
Your suggestion to dump and load every table separately worked fine.
I also made a dump of the contenttypes, user, group and
permissions tables.
Now all the data from the PostgreSQL db is in MariaDB … including the
Django-managed tables. I imported the data tables from the
apps in the same order as they appear in the models.py file.
After that I imported the contenttype, group, user and permission tables.
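An order that respects the foreign keys between these tables (Permission references ContentType; Group and User reference Permission) would be, with placeholder file names:

manage.py loaddata contenttypes.json
manage.py loaddata permissions.json
manage.py loaddata groups.json
manage.py loaddata users.json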
But one thing is not clear … what about the relations between
two tables? For example, if I have a car table and a color table, then
usually I have a third table: car_color.
So, if I have to change the db system, I dump the data from the car, color and car_color tables.
In the target db system I have to import the color table, the car table and
the relation table car_color. Otherwise nobody knows which car has which color.
But if I just import the color and car tables, everything is done … without importing
the car_color table!?!?
It depends upon how you’ve defined your models for the Many-to-many relationship. I’m not sure of all the conditions, but I’m pretty sure that if you’ve defined the many-to-many using the “through” parameter, you’ll find the through-table data in the same dump file as the app where that definition is made.
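As a sketch of the two cases (Car / Color / Truck are hypothetical models, not taken from your project): with a plain ManyToManyField, dumpdata serializes the relation as a list of primary keys inside each Car entry, so loading the car and color fixtures is enough to repopulate the join table; with an explicit through model, the join rows appear as their own objects in that app’s dump and get loaded like any other table.

from django.db import models

class Color(models.Model):
    name = models.CharField(max_length=50)

class Car(models.Model):
    name = models.CharField(max_length=50)
    # implicit join table: dumpdata writes the related Color pks
    # into each Car fixture entry, e.g. "colors": [1, 3], and
    # loaddata rebuilds the underlying join table from that list
    colors = models.ManyToManyField(Color)

class Truck(models.Model):
    name = models.CharField(max_length=50)
    # explicit "through" model: the join rows are serialized as
    # separate TruckColor objects in this app's dump
    colors = models.ManyToManyField(Color, through="TruckColor")

class TruckColor(models.Model):
    truck = models.ForeignKey(Truck, on_delete=models.CASCADE)
    color = models.ForeignKey(Color, on_delete=models.CASCADE)

So if your car/color field is a plain ManyToManyField, that would explain why the relation came across without a separate car_color import - the data was already inside the car fixture.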