Optionally Do Not Drop Content-Length for HEAD Requests

Currently, Django in runserver drops the Content-Length header for HEAD requests. django/django/core/servers/basehttp.py at main · django/django · GitHub

This was done because HEAD response headers should match GET response headers, and the Content-Length header does not match because a HEAD response body is always empty. That particular header is “optional” for HEAD responses, so the easiest implementation path at the time was to drop Content-Length headers.

https://github.com/django/django/pull/16502#issuecomment-1405617633

To implement certain streaming protocols like HTTP Range Requests (HTTP range requests - HTTP | MDN), the Content-Length header needs to be returned in HEAD responses.
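
To illustrate why a range client depends on the HEAD response, here is a minimal sketch of how a client might turn a Content-Length value into a series of Range header values. The function name `plan_ranges` and the chunk size are hypothetical and not part of Django or any library; the only assumption is that the total length was read from the Content-Length header of a HEAD response.

```python
def plan_ranges(content_length, chunk_size):
    """Split a resource of content_length bytes into Range header values.

    content_length is assumed to come from the Content-Length header of a
    HEAD response; without it the client cannot know where to stop.
    """
    if content_length < 0 or chunk_size <= 0:
        raise ValueError("content_length must be >= 0 and chunk_size > 0")
    ranges = []
    for start in range(0, content_length, chunk_size):
        # Byte ranges are inclusive on both ends, hence the "- 1".
        end = min(start + chunk_size, content_length) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

If the server drops Content-Length from the HEAD response, a client written this way has no value to pass as `content_length` and cannot plan its requests up front.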

My proposal is to only drop Content-Length if it is 0. If it is non-zero that means it was set purposely and should be allowed through. If Content-Length is purposely set to zero that is probably an invalid use case for a HEAD request because that means your GET request is essentially a HEAD request and you don’t need both requests then.

Also note, I tested other servers like uvicorn and the header is returned properly with that ASGI server when implementing HTTP Range Requests.

CC: @felixxm

Hi @pizzapanther, welcome! :wave:

We should probably ping @sarahboyce and @ngnpope, who were involved in the discussion for the implementation there.

I don’t know what’s at stake here, but will just say in passing that the development server doesn’t have to support every possible use-case. If something is quite niche, we can say “use a production-grade server” in that case. (Not saying that applies here either :sweat_smile:)

Basically developing a streaming response forces you to use a production-grade server. So definitely can work around it but not the ideal development experience.

I wouldn’t consider HTTP Range Requests “niche”. It’s a pretty standard HTTP protocol; however, Django hasn’t been well suited for it until it became more async. Now that Django is more async, it makes sense to do it in Django.

But for streaming responses (under ASGI, which is the point if you’re saying “async”) you’re not using Django’s development server anyway… :thinking:

The development server lets you run async views, although it is a sync to async conversion. So not ideal for performance but good enough for development purposes.

OK, that’s starting to seem pretty niche to me… The Streaming Responses docs are pretty clear that you need to adapt your response type to whether you’re using ASGI or not, and Daphne provides a drop-in runserver replacement for just this development need. So, “I’m going to ignore all that, and want the WSGI dev server to support my use-case” isn’t necessarily sounding that convincing… Maybe taking a step back and explaining why this needs to be the case would help.

But in any case, I think hearing from others, who are more informed on why the decision was made as it was, would serve better right now. (Sorry if my thoughts are a distraction; it was just an aside based on the fact that the conversation on the PR was so extensive :christmas_tree:)

Niche or not, the RFC is imo pretty clear:

The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request method had been GET. However, a server MAY omit header fields for which a value is determined only while generating the content. For example, some servers buffer a dynamic response to GET until a minimum amount of data is generated so that they can more efficiently delimit small responses or make late decisions with regard to content selection. Such a response to GET might contain Content-Length and Vary fields, for example, that are not generated within a HEAD response. These minor inconsistencies are considered preferable to generating and discarding the content for a HEAD request, since HEAD is usually requested for the sake of efficiency.

So while we are allowed to not generate a Content-Length header when it is harder than necessary, there is no good reason to delete one. IMO this should be fixed in Django.

Yes! But there was that long discussion on the PR, hence wanting to see what Sarah and Nick had to say…

You can implement a streaming response synchronously too, so this is broken there too. There are actually a few synchronous HTTP Range Request libs for Django that this breaks. I’m just saying Streaming Responses will become more popular with async.

As @apollo13 stated, dropping the Content-Length header all the time is an improper way of handling HEAD requests. If the developer explicitly sets Content-Length, it is unexpected behavior to drop it. Additionally, keeping your dev server and prod server behaving as similarly as possible is ideal.

The proposed fix would be one line.

def cleanup_headers(self):
    super().cleanup_headers()
    if (
        self.environ["REQUEST_METHOD"] == "HEAD"
        and "Content-Length" in self.headers
        and str(self.headers["Content-Length"]) == "0"  # this line added
    ):
        del self.headers["Content-Length"]
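
The effect of the proposed condition can be tried outside Django with a small stand-in. This is only a sketch: `cleanup_head_headers` is a hypothetical helper, and a plain dict replaces the handler's headers object (which, unlike a dict, matches header names case-insensitively).

```python
def cleanup_head_headers(method, headers):
    """Return a copy of headers with the proposed HEAD rule applied.

    Hypothetical stand-in for the handler's cleanup_headers(): a plain dict
    replaces the WSGI headers object so the sketch runs without Django.
    """
    cleaned = dict(headers)
    if (
        method == "HEAD"
        and "Content-Length" in cleaned
        and str(cleaned["Content-Length"]) == "0"
    ):
        # Only a zero value is treated as a placeholder and dropped.
        del cleaned["Content-Length"]
    return cleaned
```

Under this rule a HEAD response carrying `Content-Length: 1024` keeps the header, a HEAD response carrying `Content-Length: 0` loses it, and responses to other methods are left untouched.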

I found the discussion that Carlton referred to in https://github.com/django/django/pull/16502 , where @ngnpope wrote up a more detailed history of the RFCs then said:

So we have two options for HEAD: Send Content-Length with the correct value of the response content or omit the Content-Length header entirely. Frankly, the easiest and most consistent is to do the latter. And we should certainly fix that use of zero as the value…

The PR went with the second, easiest, most consistent option, whilst the proposal here is to swap to the first option.

I am leaning pro-change after reading this. I don’t foresee any huge problems, although I suppose in cases where the content is highly dynamic, the length would vary frequently. But I guess that is a problem with the “HEAD then GET with range” pattern in general.

But I guess that is a problem with the “HEAD then GET with range” pattern in general.

Probably, though range is rather special anyways and usually (?) only used for large responses (videos for video players?).

One option to make this “safer” is to see whether we can determine if the user manually set the Content-Length as opposed to our handler setting it, and only delete it in the latter case.

But in the end, this code affects runserver which shouldn’t be used in production, so that is why I envision basically zero issues with it. And especially since HEAD is basically a performance optimization in this case it would be nice to support it.

Adding the condition str(self.headers["Content-Length"]) == "0" basically does the check to see if the user set the header, because the header will be 0 unless the user changes it.

Range is also used in download managers to resume interrupted downloads. So yes, it is used in video, but resumable downloads are the ancient use case.
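
As a rough sketch of the resume case, a download manager could compare the bytes it already has with the Content-Length reported by a HEAD response and build the Range header for the remainder. The helper name `resume_range_header` is hypothetical, and real clients would also validate the resource with ETag or Last-Modified before resuming.

```python
def resume_range_header(bytes_on_disk, content_length):
    """Return the Range header value needed to finish a download, or None.

    content_length is assumed to come from a HEAD response; None means the
    file is already complete and no further request is needed.
    """
    if bytes_on_disk < 0 or content_length < 0:
        raise ValueError("byte counts must be non-negative")
    if bytes_on_disk >= content_length:
        return None
    # Open-ended range: from this offset to the end of the resource.
    return f"bytes={bytes_on_disk}-"
```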

Hello :wave: sorry it’s taken me a while to get back into the PR.
I think let’s make the change.
I had no test case for the user setting the Content-Length header, which usually means I hadn’t considered it, rather than this being a deliberate consequence.

Should I reopen the ticket? #35051 (HEAD Responses Drop Headers) – Django

I can make the contribution too. Seems like an easy enough first contribution.

I think so. Link this discussion.
Please do send a patch! Would also need a test and maybe a release note :thinking:
