Docs seem to go missing (for a given version) when there are docs changes

Today a ticket was created about search not working for 5.1 on docs.djangoproject.com. I checked and the report was correct: results for 5.0 were showing just fine, while results for 5.1 were all gone (every search for 5.1 returned “0 results”).

Prior to the report time, there was a merge in the Django main and stable/5.1.x branches with some docs changes. My guess is that the docs update triggered some HTML regeneration/docs search index update that wiped the existing index, leaving it “blank” until the new index was generated.

Ideally, a docs update should not “break” the search capabilities, and an index regeneration/update would happen “next to the existing one” (I haven’t had the time to investigate the code).

@pauloxnet I was advised to ask you about this. Would you know more or have more information?

Hi @nessita

Regenerating document indexes for search is done in a transaction, so new vectors are written to the fields only at the end of the update query. The code is here.
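To illustrate the point about transactions, here is a minimal sketch of the behaviour described above. It is not the website's code: it uses SQLite from the Python standard library as a stand-in for PostgreSQL, and the `document` table, its columns, and the `search_values` helper are all hypothetical. It only shows that a concurrent reader keeps seeing the old value until the writer commits.

```python
import os
import sqlite3
import tempfile

# Assumption: SQLite stands in for PostgreSQL, and a plain text column
# stands in for a real search vector. All names here are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "docs.sqlite3")

# isolation_level=None lets us control transactions explicitly.
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE document (release TEXT, title TEXT, search TEXT)")
writer.execute("INSERT INTO document VALUES ('5.1', 'Models', 'old vector')")


def search_values(conn):
    """Return the search column as seen by the given connection."""
    rows = conn.execute("SELECT search FROM document ORDER BY title").fetchall()
    return [row[0] for row in rows]


# The index update runs inside a transaction.
writer.execute("BEGIN")
writer.execute("UPDATE document SET search = 'new vector' WHERE release = '5.1'")

# A search arriving mid-update still sees the previous value.
seen_during_update = search_values(reader)

writer.execute("COMMIT")

# Only after the commit does the new value become visible.
seen_after_commit = search_values(reader)

print(seen_during_update)  # ['old vector']
print(seen_after_commit)   # ['new vector']
```

If the real update behaves this way, a search running during the update would return the old results rather than none, which suggests looking for the cause somewhere other than the transaction itself.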

Regenerating the index can be invoked together with regenerating the documents if the appropriate argument --update-index is specified. The code is here.

There used to be a command that did the documentation regeneration and the index update together, but it was removed a few months ago. It may have been considered redundant, but it ensured that users could regenerate documentation and indexes without fear of forgetting to pass the argument to the command as above.

To investigate the reason for the issue, you should identify the action that triggers the regeneration of the documentation (I don’t know what the flow is, but I think the ops team can help you), and verify that it passes the option to regenerate, at the same time, the vector indexes that are used by the search.

A couple of weeks ago I opened an issue on the website with the aim of using generated fields to update the documentation search indexes. That would make the process truly atomic, because the update is executed by the database at the same time as the fields in the tables are modified. Unfortunately the issue is blocked by the website update to Python 3.12 and Django 5.x, which is still in progress.
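As a rough illustration of the generated-field idea, the sketch below uses a stored generated column in SQLite (available since SQLite 3.31) as a stand-in for a PostgreSQL search vector column. The table, the column names, and the `lower(...)` expression are assumptions made for the example, not what the website would actually use. The point is only that the derived column is recomputed by the database in the same statement that changes the source columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: "search" is a stored generated column standing in
# for a real search vector. The database derives it from title and body.
conn.execute(
    """
    CREATE TABLE document (
        title TEXT,
        body TEXT,
        search TEXT GENERATED ALWAYS AS (lower(title || ' ' || body)) STORED
    )
    """
)

# We never write to "search" directly; only the source columns are set.
conn.execute(
    "INSERT INTO document (title, body) VALUES ('Models', 'Fields and Managers')"
)
first = conn.execute("SELECT search FROM document").fetchone()[0]

# Changing a source column updates the derived column in the same statement.
conn.execute("UPDATE document SET body = 'QuerySets'")
second = conn.execute("SELECT search FROM document").fetchone()[0]

print(first)   # models fields and managers
print(second)  # models querysets
```

With this approach there is no separate index update step that can be forgotten or run out of sync with the document update.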

BTW the documentation generation mechanisms have remained unchanged for years. I only dealt with the migration of the indexes from Elastic (where the index update was much slower) to PostgreSQL, where it is almost instantaneous, at least in the tests I can do locally. It would be worth checking the remote database that hosts the website (to which I do not have access) for anomalies such as errors, slow connections, or slow queries.

I think I have given you all the information I have. I hope it was useful.

Thank you, @pauloxnet, for providing all these details. The Fellows have limited availability, so we typically don’t handle maintenance tasks for djangoproject.com.

Could anyone let me know if someone is available to look into this? We just received another report: #35861 (Documentation browser does not work) – Django

I created an issue on djangoproject.com for now :+1:
