Initial question for GSoC 2024: Ticket #32519

Hi everyone! I am new to contributing to Django and had already worked on few tickets during Djangonauts session. While searching for more tickets to work on, I came across one that looks very interesting: #32519 (Add support for using key and path transforms in update() for JSONFields.) – Django

It seems that this new feature is very anticipated by many users (based on CC). I do think it will take huge changes even though the scope is already limited to JSON_SET and JSON_REMOVE. I think considering the big changes & efforts, it may be great if we could work on during GSoC. There is one question that I still wonder: can we pick this ticket for proposal? I am not sure whether it could be scoped as 175 hours. Now, GSoC allows to include ~90 hours project, but based on Wiki, it needs to be 175 hours minimum.

Thank you!

I think the lack of mention of the 90 hours project is only because it is new this year. If 90 hours feels like enough then we should be able to accommodate.

Hi @adzshaf, as far as I’m aware as Adam said the 90 hour option isn’t on the wiki just because it’s new. Should be no problem if you make the case that’s the right project size.

Since it’s a project idea of your own, personally I’d recommend to spend some time detailing your proposal a bit here so you can get feedback from potential mentors? You don’t need to draft it all, just perhaps flagging key aspects of the work that will need analysis.

Or another option you could explore is combining this with other tickets or possible improvements that don’t have tickets. Have this ticket as your project’s “main goal”, then add a series of secondary or stretch goals if you’d rather go for the 175 hours project size.

Hope this helps!

Hi @adzshaf, thanks for bringing up this ticket! I gave it a go briefly a while back, and I think I would expect at least 100h to complete the ticket. For beginners, I wouldn’t be surprised if it took more than 120h.

Writing the database functions as Django Func subclasses with the corresponding as_vendor() methods would be the “easy” part. The hard part would be to add support for using key transforms in QuerySet.update() like the examples Mariusz gave on the ticket. There would likely be some gotchas around how the JSON data types are handled on the database, so things like JSON null vs SQL NULL.

That said, with a big project like Django, estimates are hard. I think Thibaud’s suggestion is a good idea. Try to find additional tickets you’d like to work on as stretch goals, and bundle them as a 175h project in your proposal. That way, if you manage to finish the main ticket in under 175h, you still have the stretch goals ready to pick up. If you didn’t have enough time to work on the other tickets, that’d be fine – they’re "nice-to-have"s after all!

I highly recommend drafting up a proposal and sharing it here, so you have a better understanding of the project scope and which other tickets may fit in the project. If the project proposal makes sense I’ll be very interested in mentoring :smile:

Cheers!

2 Likes

Hi everyone!

Thank you for the responses. Here is my proposal that I have been working on: Add support for using key and path transforms in update() for JSONFields

Let me know if there are improvements that I can make. Cheers!

@thibaudcolas @sage

2 Likes

Hey @adzshaf, brilliant! That’s a good start :smile:

I gave it a read, and here’s my feedback (buckle up!):

it would be better if Django supports generics to modify JSONField

I’m not sure “generics” is clear enough in this context. The ticket was originally written about providing generic database functions (Func subclasses) for manipulating JSON data on the database, but it has since been updated to have the ability to do update() with key transforms as the main goal (as evident in the ticket’s title). As this is the “Background” section, perhaps talk about how in order to make partial updates to a JSONField, developers currently need to either:

  • (a) load each model instance into memory, modify the JSONField data as Python objects, then save() the model instance, or
  • (b) write their own mapping for the database system’s JSON functions as Func subclasses, and use them in update(), e.g. update(details=JSONSet("details", some_key="Some value"))

and then, state something along the lines of “It would be better if Django has built-in support for doing partial JSONField updates more efficiently and more idiomatic to how developers can currently query JSONFields (i.e. with key transforms).”

Explaining the current state of things and how it can be improved is a much more convincing way to start your proposal compared to saying “it would be better if…” right off the bat :wink:

The goal of this proposal is to create an implementation for all databases supported that developers can use in order to update values of JSONField.

I believe complete updates are already possible at the moment, e.g. update(details={"some_key": "some_value"}), so I think it helps to be more explicit that we’re trying to support partial updates to JSONField.

The implementation will be limited to update and delete value of JSONField.

What are we limiting this from, i.e. what are the things beyond this limitation? Why are we making this limitation? Perhaps it helps to give more context, e.g.

"The database systems provide several different functions to manipulate JSON data, i.e. JSON_INSERT, JSON_REPLACE, JSON_SET, and JSON_REMOVE. (Explain the functions briefly…) For simplicity and consistent behaviour when doing update() with key transforms, we’ll only use JSON_SET to update a JSONField value at a given path, and JSON_REMOVE to remove the given path from a JSONField.

Or maybe leave this out of the background, and instead explain it in more detail in the “Limitations” section.

With all the above in mind, the “Objective” section needs some adjustments too, e.g. talk about it being partial updates, and rephrasing/dropping the “generic” term in favour of “key transforms support for update() on JSONField”.

Note that update() has yet to implement key and transform.

Not sure if this is a typo, but I noticed several occurrences of “key and transform”. It’s “key transform” or “path transform” (or “key and path transforms” etc.), but “key and transform” does not really make sense :sweat_smile:

Not only making the development easier, the separate JSONSet() also will be useful for complex cases where simple key-transform is not sufficient.
Same problem as mentioned before, due to update has yet to use key and transform, in order to speed up development, we will also create a separate JSONRemove() function:

IMO the main value of implementing JSONSet() and JSONRemove() is the ability to refer to an updated value of a JSONField on the database level, which can also be useful for querying. Yes, it helps with computed paths and also gives a nice abstraction for when we want to implement update() with key transforms, but it’s even better when we can say that we’re building something reusable vs. something “to make our jobs easier” :grin:

Also, in the “Objective” section it doesn’t seem very clear to me that you will also implement key transforms for update(). Maybe it’s just because of how you ordered it, but at the moment it sounds like “because key transforms are not yet supported in update(), we’ll implement the partial update with JSONSet() and JSONRemove() at the top-level of the field (and that’s it)”.

If you intend to implement the partial update via the functions as a first step, with the key transforms in update() as a second step (which seems to be the case based on the timeline), perhaps you can make it clear that this will be the case. That is, explain that you’ll divide the implementation process into two steps: First, implement the functions to allow partial updates to JSONField. Second, add support for key transforms in update() for JSONField to make it easier to use for developers.

The function will be transform to nested query:

If the developer writes

>>> UserPreference.objects.filter(id=2).update(settings__font__size=20, settings__font__name=’Comic Sans’)`

instead of

>>> UserPreference.objects.filter(id=2).update(settings=JSONSet(JSONSet('settings', font__size=20), font__name=’Comic Sans’)

I think we should try to make use of “arbitrary number of key-value pair arguments” on databases that do support it, i.e. SQLite, MySQL, and MariaDB, before falling back to the nested function calls on other DBs.

Note that it will be a different case if a user wants to set the value of a JSON property to be null.

How are you going to handle this case?


As for the timeline, I don’t think we’ll need 1.5 weeks to implement JSONSet() for each DB, especially since there’s already prior work at ticket_32519 by allen-munsch · Pull Request #15422 · django/django · GitHub. Once you’ve written the tests, you can usually reuse them for all DBs, so you’ll “only” need to focus on the as_vendor() implementations.

Also, I’m surprised that you put JSONRemove() after key transforms for update() is implemented for JSONSet. IMO it’ll be easier for you to implement JSONSet() and JSONRemove() first, as they’ll mostly be very similar. Once you get both of them working, you can look into how key transforms can be fitted into update(). I’m quite sure this is the tricky part, as I don’t think Django supports any transforms in update() at the moment, so you’ll have to look into the best place to “wire” this in. It would be nice if this also paves the way for other transforms to be supported for update() in the future, e.g. imagine update(release_date__year=2024).

And finally, what would you do if you manage to complete the project before GSoC finishes? What are your stretch goals?

I think that’s it for now. Again, it’s looking pretty good :+1:
The submission deadline is coming up soon (April 2 - 18:00 UTC), so I’d suggest making the changes sooner rather than later if you’d like another review. Good luck!

Thank you for the feedback! Appreciate it you review it thoroughly!

I already updated here for the latest version (which I will submit for now): Add support for using key and path transforms in update() for JSONFields.

Just wanted to confirm we’ve received your final proposal – thank you for taking it through :slight_smile:

There’s nothing further for you to do. We’ll be reviewing all proposals internally and confirming availability of mentors. Google will announce results on May 1st at 18:00 UTC.