change db backend from postgesql to mariadb

I have to change my db backend from postgresql to mariadb.

  1. dump data from existing postgresql db:
    a) dumpdata > outfile.json ( filesize of outfile.json > 5GB )
  2. change to mariadb db backend
    a) migrate --run-syncdb
  3. exclude contentype data with:
    a) shell
    b) from django.contrib.contenttypes.models import ContentType
    c) ContentType.objects.all().delete()
    d) quit()
  4. loaddata outfile.json
    I got an MemoryError, so my idea was to increase RAM memory
    on the virtual machine from 4GB to 16GB, and run the command
    again. The process for loading data into the new mariadb backend
    was starting correctly, and everything seems ok.
    After 4 - 5 minutes the loaddata process died!?!?
    Maybe the json-file is too large??
    Is there another way to change db backend from postgresql to mariadb?

You can include or exclude specific apps and tables from your dumpdata command. You could do this dump in multiple steps, selecting specific tables or apps to dump in each step.

This can let you reduce the size of the individual output files, making them easier to handle. (Yes, you would need to ensure you handle the sequence of dependencies correctly.)

Of course, how much hassle this becomes depends upon whether it’s 1 GB of data in 20 tables or 1 GB of data in 200 tables.

This isn’t the only Django-managed table that you may need to be concerned about. It might also be necessary to empty out the User, Group, and Permission tables.

BUT, were I in a position to need to do this, this is not the method I would take. Depending upon just how complex the migration would be, I would try one of a couple different methods:

  1. Outside of Django - Use the pg_dump command to dump the database as a set of SQL INSERT statements, and run that as an import script on the MariaDB system.

  2. Using Django - Add a second database definition to the DATABASES setting, giving you concurrent access to both databases, and write a custom django-admin command to read models from one and write them to the other.

Unless you’re willing to put in a lot of work with managing your dumpdata files, I think you’re going to find these methods to be more robust.

thanks for the realy good answer and suggestions!
first I tried the the described method ( dumpdata - change - loaddate )
again with 32 GB memory on the virtual machine. Nothing went wrong, but
the process took 6 hours … 12, 24 … 36 hours … but nearing completion
a Integrity Error appers and kills the process.
Now I give the next method a chance. I will dump data for each table (60 tables) to
its own dump-file. Take a look to the sequence of dependencies, and
load this data step by step into the mariadb.
But first I made a sketch about the dependencies between the tables …

Your suggestion to dump and load every table separate works fine.
I also make a dump of the contenttypes, user, group and
permissions tables.
Now all data from postgresql-db are in mariadb … also the
django-managed tables. I import the data-tables from the
apps in the same order as they appears in the file.
After that I import the contenttype, group, user and permission tables.
But one thing is not clear … what about the relations between
two tables? For example, if I have a cat-table and a color-table, then
usually I have a third table: car_color.
So, if I have to change the db-system, I dump data from car, color and car_color table.
In the target db-system I have to import color-table, car-table and
the relation-table car_color. Otherwise nobody knows what car have what color.
But, if I import color and car table everything is done … without import
the car_color table!?!?

It depends upon how you’ve defined your models for the Many-to-many relationship. I’m not sure of all the conditions, but I’m pretty sure that if you’ve defined the many-to-many using the “through” parameter, you’ll find the through-table data in the same data as the file where the definition is made.