Using Django 4.2.5, I came across an issue today when I tried to migrate a CharField to ArrayField(CharField(…)): the migration failed because PostgreSQL won’t simply cast a char column to an array unless the existing value uses correct array literal syntax.
I found this old ticket reporting the same failure: #25678 (migrate scalar to array field in PostrgreSQL fails). It was originally accepted but then closed by the reporter, who said it was apparently fixed way back in 1.9.
The issue appears to stem from the addition of USING %(column)s::%(type)s, which is still in use today.
Would there be any interest in changing this to USING ARRAY[%(column)s], which should create a one-element array containing the current column’s value, when the new type is an array and the old type is not?
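For context, a minimal sketch of the failing scenario (app, model and field names here are hypothetical) looks roughly like this; the auto-generated AlterField is what emits the failing cast:

# Hypothetical app "catalog"; the model changed from
#   tags = models.CharField(max_length=100)
# to
#   tags = ArrayField(models.CharField(max_length=100))
from django.contrib.postgres.fields import ArrayField
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("catalog", "0001_initial")]

    operations = [
        # On PostgreSQL this is currently rendered roughly as:
        #   ALTER TABLE "catalog_item" ALTER COLUMN "tags"
        #   TYPE varchar(100)[] USING "tags"::varchar(100)[];
        # which raises "malformed array literal" for any existing non-array value.
        migrations.AlterField(
            model_name="item",
            name="tags",
            field=ArrayField(models.CharField(max_length=100)),
        ),
    ]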
EDIT: I forgot to mention the other option, which I think would be a nice addition albeit with a lot more work: let users specify their own USING expression via an additional argument on the AlterField migration operation. Though I’m unaware of whether other databases support ALTER … USING.
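To illustrate the idea (this argument does not exist today; the name and semantics below are purely hypothetical), such a migration could look something like:

from django.contrib.postgres.fields import ArrayField
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("catalog", "0002_previous")]

    operations = [
        migrations.AlterField(
            model_name="item",
            name="tags",
            field=ArrayField(models.CharField(max_length=100)),
            # Hypothetical argument: a raw expression passed through to
            # ALTER TABLE ... ALTER COLUMN ... TYPE ... USING <expression>
            using="ARRAY[%(column)s]",
        ),
    ]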
As long as changing back to ARRAY[%(column)s] doesn’t break the use case that %(column)s::%(type)s was introduced for (preventing data loss), I don’t see an issue.
We should confirm that by making sure that a migration from a CharField(max_length=10) to ArrayField(CharField(max_length=5)) crashes if the column contains data longer than 5 chars and doesn’t silently truncate it.
We should invest time trying to figure out what is possible before looking into adding support for AlterField(using), given its lack of support across the supported backends; it’s close to RunSQL territory IMO, since users can’t expect Django to be able to automatically convert between all the field types it supports.
changing back to ARRAY[%(column)s]
Oh, I didn’t realise this was previously used?
We should confirm that by making sure that a migration from a CharField(max_length=10) to ArrayField(CharField(max_length=5)) crashes if the column contains data longer than 5 chars and doesn’t silently truncate it.
It looks like this is indeed the case as long as we don’t cast the result:
test=# create table foo (foo varchar(10));
CREATE TABLE
test=# insert into foo values ('abcdefghij');
INSERT 0 1
test=# alter table foo alter foo type varchar(5)[] using foo::varchar(5)[]; -- this is current behaviour
ERROR: malformed array literal: "abcdefghij"
DETAIL: Array value must start with "{" or dimension information.
test=# alter table foo alter foo type varchar(5)[] using Array[foo]; -- simply using the Array constructor results in error
ERROR: value too long for type character varying(5)
test=# alter table foo alter foo type varchar(5)[] using Array[foo]::varchar(5)[]; -- casting the output from the constructor proceeds without error
ALTER TABLE
Also, I’d just like to point out that with the current behaviour, if the string happens to resemble array literal syntax, there is “data loss” when the array tokens are consumed. I don’t know how likely this scenario is, but it is possible:
test=# create table foo (foo varchar);
CREATE TABLE
test=# insert into foo values ('{abc}');
INSERT 0 1
test=# alter table foo alter foo type varchar[] using foo::varchar[];
ALTER TABLE
test=# select foo[1] from foo; -- this shows that the "{" & "}" chars were consumed when converted to an array
foo
-----
abc
(1 row)
We should invest time trying to figure out what is possible before looking into adding support for AlterField(using), given its lack of support across the supported backends; it’s close to RunSQL territory IMO, since users can’t expect Django to be able to automatically convert between all the field types it supports.
My thinking was that AlterField(using=...expression...) could relieve the burden on Django and let users handle it themselves.
(In my particular case I can’t easily use RunSQL, because this project uses django-pgtrigger, which monkey-patches migrations and wraps schema updates in a temporary trigger drop & recreate.)
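For reference, the RunSQL-style approach being referred to would look roughly like this (table, column and app names are hypothetical), pairing the state-only field change with hand-written DDL:

from django.contrib.postgres.fields import ArrayField
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("catalog", "0002_previous")]

    operations = [
        migrations.SeparateDatabaseAndState(
            state_operations=[
                # Record the CharField -> ArrayField change in migration state only.
                migrations.AlterField(
                    model_name="item",
                    name="tags",
                    field=ArrayField(models.CharField(max_length=100)),
                ),
            ],
            database_operations=[
                # Hand-written DDL wrapping existing values in a one-element array.
                migrations.RunSQL(
                    sql='ALTER TABLE "catalog_item" ALTER COLUMN "tags" '
                        'TYPE varchar(100)[] USING ARRAY["tags"]',
                    reverse_sql='ALTER TABLE "catalog_item" ALTER COLUMN "tags" '
                                'TYPE varchar(100) USING "tags"[1]',
                ),
            ],
        ),
    ]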