Help wanted: Accept header interpretation

Hi all,

In Django 5.2, a feature was added to improve how Django parses the Accept header.

Since then, some questions have come up about how closely it sticks to the RFC (well, both of them), specifically regarding the precedence of the provided types; see tickets #36411 (resolved) and #36447. After many conversations with @nessita, we think what we have is likely correct: specificity takes precedence over the raw quality (q) value, as mentioned in the RFC:

Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence.

However, as with all RFCs, the devil is in the detail, and it may be a counter-intuitive implementation for some.

I’m interested in hearing what other people think of the implementation and how closely it sticks to the RFCs. If anyone knows a domain expert in these lovely RFCs, or knows of similar implementations, I’d be very interested!

“Precedence” in that sentence is not the same as preference. The paragraphs following that sentence clarify exactly what “precedence” is used for: precedence is used to assign the correct quality factors, which are then compared to determine the preference order. Here’s the complete language:

Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence. For example,

Accept: text/*, text/plain, text/plain;format=flowed, */*

have the following precedence:

  1. text/plain;format=flowed
  2. text/plain
  3. text/*
  4. */*

The media type quality factor associated with a given type is determined by finding the media range with the highest precedence that matches the type. For example,

Accept: text/*;q=0.3, text/plain;q=0.7, text/plain;format=flowed,
       text/plain;format=fixed;q=0.4, */*;q=0.5

would cause the following values to be associated:

| Media Type | Quality Value |
|---|---|
| text/plain;format=flowed | 1 |
| text/plain | 0.7 |
| text/html | 0.3 |
| image/jpeg | 0.5 |
| text/plain;format=fixed | 0.4 |
| text/html;level=3 | 0.7 |

Observe, for example, that the more specific precedence of text/plain;format=fixed causes it to receive a lower quality value than text/plain; this causes text/plain to be preferred over text/plain;format=fixed.
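For anyone who wants to poke at this step, here is a minimal pure-Python sketch of the quality assignment described in the quoted text: find every range that matches a type, take the most specific one, and use its q. It is not Django's implementation; the helper names (`parse_media`, `specificity`, `matches`, `quality_for`) are made up for illustration, and it assumes that any parameter other than `q` belongs to the media range.

```python
def parse_media(value):
    """Parse 'type/subtype;k=v;q=0.5' into (main, sub, params, q)."""
    media, *raw_params = [part.strip() for part in value.split(";")]
    main, _, sub = media.lower().partition("/")
    params, q = {}, 1.0
    for raw in raw_params:
        key, _, val = raw.partition("=")
        if key.strip().lower() == "q":
            q = float(val)
        else:
            params[key.strip().lower()] = val.strip()
    return main, sub, params, q


def specificity(main, sub, params):
    """0 for */*, 1 for type/*, 2 for type/subtype, 3 when parameters are given."""
    if main == "*":
        return 0
    if sub == "*":
        return 1
    return 3 if params else 2


def matches(media_range, media_type):
    """True if the range (main, sub, params) covers the concrete media type."""
    r_main, r_sub, r_params = media_range
    m_main, m_sub, m_params = media_type
    if r_main not in ("*", m_main) or r_sub not in ("*", m_sub):
        return False
    # Every parameter named by the range must be present with the same value.
    return all(m_params.get(k) == v for k, v in r_params.items())


def quality_for(accept, media_type):
    """Quality of the most specific range in `accept` that matches `media_type`."""
    target = parse_media(media_type)[:3]
    ranges = [parse_media(item) for item in accept.split(",")]
    candidates = [r for r in ranges if matches(r[:3], target)]
    if not candidates:
        return 0.0  # assumption: a type that matches nothing is not acceptable
    return max(candidates, key=lambda r: specificity(*r[:3]))[3]


ACCEPT = (
    "text/*;q=0.3, text/plain;q=0.7, text/plain;format=flowed, "
    "text/plain;format=fixed;q=0.4, */*;q=0.5"
)
for media_type in ("text/plain;format=flowed", "text/plain", "text/html",
                   "image/jpeg", "text/plain;format=fixed"):
    print(media_type, quality_for(ACCEPT, media_type))
```

Under the same rule, text/html;level=3 matches only text/* and */*, so this sketch gives it 0.3 rather than the 0.7 in the last row of the quoted table; that row looks like a leftover from the text/html-based example in the older RFC.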

After quite a bit of re-reading, this is my interpretation.

To me, a media type must be matched against the most specific media range or media type that is defined in the Accept header.

This means that given an accept header:

Accept: text/*, text/plain, text/plain;format=flowed, */*

and a media type text/plain;format=fixed, we should associate the quality of text/* rather than */*.

There is then an example to illustrate this in the RFC.


What I interpret from the RFC, although it is less explicit, is this: when given a choice of media types, the media type that is associated with the highest quality is chosen.

In the RFC, there is this example:

Accept: audio/*; q=0.2, audio/basic

is interpreted as “I prefer audio/basic, but send me any audio type if it is the best available after an 80% markdown in quality”.

I assume if I tweak the example, this should be true:

Accept: audio/basic; q=0.2, audio/*

is interpreted as “I prefer an audio type, but send audio/basic after an 80% markdown in quality”.

Meaning that given the choice of audio/basic and audio/mpeg, I prefer audio/mpeg.

HttpRequest.get_preferred_type() returns audio/basic as the preferred type here. This is because audio/basic is a more specific match within the Accept header's media ranges/types than audio/mpeg, but I think the specificity should only be taken into account when determining the quality.
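To make that reading concrete, here is a small pure-Python sketch of "specificity only decides the quality, quality alone decides the preference". It is not Django's code; `parse_accept`, `quality` and `preferred` are hypothetical helpers, and media type parameters other than `q` are ignored to keep it short.

```python
def parse_accept(header):
    """Return (main, sub, q) tuples; parameters other than q are ignored here."""
    ranges = []
    for item in header.split(","):
        media, *params = [part.strip() for part in item.split(";")]
        main, _, sub = media.lower().partition("/")
        q = 1.0
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        ranges.append((main, sub, q))
    return ranges


def quality(ranges, media_type):
    """q of the most specific range matching media_type, or 0.0 if none match."""
    main, _, sub = media_type.lower().partition("/")
    best_spec, best_q = -1, 0.0
    for r_main, r_sub, q in ranges:
        if r_main not in ("*", main) or r_sub not in ("*", sub):
            continue
        spec = (r_main != "*") + (r_sub != "*")  # 0: */*, 1: type/*, 2: type/subtype
        if spec > best_spec:
            best_spec, best_q = spec, q
    return best_q


def preferred(header, available):
    """Pick the available type with the highest resolved quality."""
    ranges = parse_accept(header)
    scored = [(quality(ranges, t), t) for t in available]
    acceptable = [pair for pair in scored if pair[0] > 0]
    if not acceptable:
        return None
    # max() keeps the first of equal candidates, so ties follow server order.
    return max(acceptable, key=lambda pair: pair[0])[1]


available = ["audio/basic", "audio/mpeg"]
print(preferred("audio/*; q=0.2, audio/basic", available))  # audio/basic
print(preferred("audio/basic; q=0.2, audio/*", available))  # audio/mpeg
```

With the tweaked header, audio/basic resolves to 0.2 and audio/mpeg to 1.0 (via audio/*), so audio/mpeg wins under this interpretation.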

Is there a test suite for the RFC somewhere?

If there is not, maybe a test suite from Apache httpd, nginx, etc could help?

One thing that confused Jake and me is that thinking about specificity and quality precedence is “simpler” within the “same” main type, but harder across different main types and a set of media types available on the server. For example, if a client sends:

Accept: audio/basic; q=0.2, */*; q=0.7

What would request.get_preferred_type(["audio/basic", "image/png"]) return? We think audio/basic since it is more specific, but the ticket report says image/png because it matches a media range with a higher q.

I think this might be the crux of the issue, or at least the confusion. The current implementation considers “precedence” as including the specificity - thus when resolving a media type, more specific entries in the Accept header take precedence. I think this is correct.

However, when determining which of a given set of types should be used (i.e. “preference”, via get_preferred_type), only the quality should be considered (after the above precedence calculation). Currently, Django uses the above rules, where specificity is both included and more important than quality. I think this is incorrect: once the media type is resolved, only the quality should be used to determine preference.

I think this should be image/png, because the specificity should only be considered for “what is the quality value for image/png” rather than “how preferable is image/png”.
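A rough way to see the two readings side by side is to resolve each available type to a (specificity, q) pair and then rank with two different keys. This is a sketch in plain Python, not the actual Django code: `parse_accept`, `resolve`, `pick`, `specificity_first` and `quality_only` are invented names, `specificity_first` only approximates the behaviour described in this thread, and parameters other than `q` are ignored.

```python
def parse_accept(header):
    """Return (main, sub, q) tuples; parameters other than q are ignored here."""
    ranges = []
    for item in header.split(","):
        media, *params = [part.strip() for part in item.split(";")]
        main, _, sub = media.lower().partition("/")
        q = 1.0
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        ranges.append((main, sub, q))
    return ranges


def resolve(ranges, media_type):
    """Return (specificity, q) of the most specific matching range, or None."""
    main, _, sub = media_type.lower().partition("/")
    best = None
    for r_main, r_sub, q in ranges:
        if r_main not in ("*", main) or r_sub not in ("*", sub):
            continue
        spec = (r_main != "*") + (r_sub != "*")  # 0: */*, 1: type/*, 2: type/subtype
        if best is None or spec > best[0]:
            best = (spec, q)
    return best


def specificity_first(match):
    """Rank by specificity, then quality (the reading questioned in the ticket)."""
    return match


def quality_only(match):
    """Rank by the resolved quality alone (the reading proposed above)."""
    return match[1]


def pick(header, available, key):
    """Choose the best acceptable type from `available` under the given ranking."""
    ranges = parse_accept(header)
    resolved = {t: resolve(ranges, t) for t in available}
    acceptable = [t for t in available if resolved[t] and resolved[t][1] > 0]
    return max(acceptable, key=lambda t: key(resolved[t]), default=None)


header = "audio/basic; q=0.2, */*; q=0.7"
available = ["audio/basic", "image/png"]
print(pick(header, available, specificity_first))  # audio/basic
print(pick(header, available, quality_only))       # image/png
```

Both rankings agree on the resolved qualities (0.2 for audio/basic, 0.7 for image/png); they differ only in whether specificity is allowed to outrank q when comparing the candidates.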

These might be the ramblings of a madman at the end of the working day, but I think @andersk is correct in their reading of the RFC, and by extension #36447. I implemented the above and only a single test failed, which explicitly asserted my previous understanding, rather than any examples from the RFC. Happy to put up a PR to resolve #36447 so we can run it through its paces.

@sarahboyce and @nessita, as two other people who have dived down this rabbit hole, what do you think?

I think the above makes sense and it feels a bit more intuitive. I’ll bump the ticket to be a release blocker, and ideally we would release a fix with 5.2.4.

I agree :+1: thank you all

I’ve opened a PR to fix our implementation, which should now match both of the RFCs as well as intuition.