Django - Client Encoding - psycopg2 - PostgreSQL - Latin1

Hello,

We want to develop a Django application for an old, existing application. The data model has to stay the same, and we want to use the same database.

The Encoding is latin1 and the PostgreSQL Version is 9.2.24.

As mentioned on the following page, Django supports PostgreSQL 9.5 and above, and only UTF-8:
https://docs.djangoproject.com/en/3.1/ref/databases/#postgresql-notes

Is there any way to solve our problem? The old application has to stay in use until the new application is ready. I think we need 2 - 3 years for development. During this time it is essential to run both applications.

Thank you for your reply.
Alex

Regarding the character encodings, I read that section a little differently - it says it needs the client encoding to be UTF-8, not the database to be UTF-8. (I’ve used Django to read other databases with different encodings and have not had any problems.)

It really should be a quick test to see if it’s going to work. Start an absolute minimum of a project and see if the admin works with one of the tables.

However, I can’t recommend building a new application using a database that old. It’s not just Django, it’s also compatibility and stability with psycopg2 and any other libraries that might be involved. You really should look at upgrading your database to a currently-supported level (13) and then building your application on it - especially since you’re looking at another 2 years for development.

Your organization also should consider how it’s going to handle routine upgrades in the future. This should be (at a minimum) something seriously looked at every six months. PostgreSQL 9.2 was obsolete 3 years ago.

Thanks for the fast reply.

Where can I set the database encoding within the psycopg2 connection?

Yes, we are trying to update our old application so that we can upgrade the database. Alternatively, we could sync the database to another database running a newer version every few minutes. The old app would work with the old DB and the new app with the new DB. Those are the ideas.

The problem is that without an update plan, the updates get more and more complex. Then we end up in this situation.

Alex

Directly under the section in the docs you referenced, there is a section on Isolation Level which also shows an example of setting OPTIONS within your database connection.

My guess is that you could add something like:

  'OPTIONS': {
      'client_encoding': 'UTF-8',
  },

to your database connection information. (I personally don’t know; I have never tried this. That’s just how I read the docs for this.)

Hello, I tried this option just before. The problem still occurs. I will send the error message when I am back in the office. Maybe there is another problem.

Just checked the PostgreSQL docs for this - it appears the parameter might be ‘UTF8’, not ‘UTF-8’. I also get the impression from the docs page referred to earlier that Django might set this itself anyway, so that specifying it here would fall into the category of a performance improvement. (See the last paragraph of that section.)
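For reference, a minimal sketch of how that OPTIONS entry would sit in settings.py. The database name, user, password and host are placeholders, and I have not verified this against a 9.2 server. As far as I understand it, Django hands the OPTIONS entries to psycopg2.connect() as keyword arguments, and client_encoding is a standard libpq connection parameter.

```python
# settings.py (sketch; NAME, USER, PASSWORD and HOST are placeholders)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "test_db",
        "USER": "test_user",
        "PASSWORD": "password",
        "HOST": "127.0.0.1",
        "OPTIONS": {
            # The PostgreSQL docs spell the encoding name 'UTF8'.
            "client_encoding": "UTF8",
        },
    },
}
```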

I tested many things today. I think the problem occurs before Django is even involved. I created a virtual env and started python, imported psycopg2, and ran a simple SQL statement (select * from table;). Below are my script and the returned errors.

>>> import psycopg2
>>> conn = psycopg2.connect("dbname=test_db user=test_user password=password host=127.0.0.1")
>>> cur = conn.cursor()
>>> cur.execute("show server_encoding;")
>>> cur.fetchall()
[('SQL_ASCII',)]
>>> cur.execute("show client_encoding;")
>>> cur.fetchall()
[('SQL_ASCII',)]
>>> cur.execute("select * from tbl_kunde where id = 6837;")
>>> cur.fetchall()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 24: ordinal not in range(128)
>>> cur.execute("set client_encoding='LATIN1';")
>>> cur.execute("select * from tbl_kunde where id = 6837;")
>>> cur.fetchall()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 24: ordinal not in range(128)

The database is in SQL_ASCII and the data inside is stored as LATIN1. I know that this is a big problem, but we use an old application that has been developed since 2000, and it is essential for running our business. My part is to redesign the application (UI, process, …). But I can’t change the old data model at first, because the old application keeps running alongside. I hope you have an idea about the errors. I don’t think it is that difficult, but I have run out of ideas.
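The error message itself can be reproduced without a database. This is a small sketch, assuming (as the traceback suggests) that psycopg2 falls back to Python's 'ascii' codec when the connection reports SQL_ASCII. The byte string is made up for illustration and is not taken from tbl_kunde.

```python
# Hypothetical raw bytes as a SQL_ASCII server would hand them over:
# the text "JÄGER" stored in Latin-1, where the byte 0xC4 is "Ä".
raw = b"J\xc4GER"

# The 'ascii' codec rejects every byte above 0x7F.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# The same bytes decode cleanly once they are treated as Latin-1.
latin1_text = raw.decode("latin-1")

print(ascii_error)
print(latin1_text)
```

A SQL_ASCII server does no encoding conversion at all, so the bytes arrive exactly as stored and it is the client side that has to know they are Latin-1.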

Did you try setting client_encoding to ‘UTF8’?

Yes I did. I got the same error.

Hello, I solved it with a little trick. On the psycopg2 connection the client encoding can be changed with set client_encoding = 'LATIN1'; but nothing happens. There is another way to do this.

.../django_env/lib64/python3.6/site-packages/django/db/backends/postgresql/base.py:
In the function init_connection_state(self), I changed the PostgreSQL encoding used for queries.
The call there is now:
    self.connection.set_client_encoding('LATIN1')

The function init_connection_state had a hard-coded UTF8 encoding. If I change it to LATIN1, it works. Now I have to carry this change in our own project so that it isn’t lost on an update.
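One way to keep this change out of site-packages is a small custom database backend that subclasses Django's PostgreSQL wrapper. This is only a sketch: the package path myproject/db_backends/postgresql_latin1/ is a made-up name, it has not been run against PostgreSQL 9.2, and Django documents UTF-8 as the required client encoding, so this is outside what Django supports.

```python
# myproject/db_backends/postgresql_latin1/base.py  (hypothetical path)
# The package also needs an empty __init__.py next to this file.
from django.db.backends.postgresql import base


class DatabaseWrapper(base.DatabaseWrapper):
    """PostgreSQL backend that switches the client encoding to LATIN1."""

    def init_connection_state(self):
        # Run Django's normal setup first (it sets UTF8 and the time zone).
        super().init_connection_state()
        # Then override the encoding, as in the manual patch above.
        self.connection.set_client_encoding("LATIN1")
```

The project would then select it with 'ENGINE': 'myproject.db_backends.postgresql_latin1' in DATABASES, since Django imports the base module from whatever package ENGINE names. That way a Django upgrade replaces the stock backend without touching the override.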

Thanks for your great help.