While there are many implementations of and suggestions for interpreting HTTP quality values (q-factors) floating around on the internet, I was unable to find "correct" interpretations for several cases, or explanations of why they should be interpreted that way. That would be essential for building a "bulletproof" parsing mechanism.
For example, the MDN documentation lists the following example:
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
would be read as follows:
| value | priority |
|---|---|
| text/html, application/xhtml+xml | 1.0 |
| application/xml | 0.9 |
| */* | 0.8 |
It seems clear to me that, according to RFC 7231 (HTTP/1.1), the first element is qualified as q=1.0 due to the default ("no value is the same as q=1"); the same appears to hold for the second element, although MDN does not clearly state that this is a consequence of the defaulting.
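For reference, the naive split I am currently working with reproduces the MDN table above once the q=1 default is applied. This is just a sketch; the function name is my own and the whitespace handling is minimal.

```python
# Naive Accept header split with the "no value is the same as q=1" default applied.
def parse_simple(accept_header: str):
    elements = []
    for member in accept_header.split(","):
        parts = [p.strip() for p in member.split(";")]
        media_range = parts[0]
        q = 1.0  # default weight when no q parameter is present
        for p in parts[1:]:
            if p.startswith("q="):
                q = float(p[2:])
        elements.append((media_range, q))
    return elements

print(parse_simple("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# [('text/html', 1.0), ('application/xhtml+xml', 1.0), ('application/xml', 0.9), ('*/*', 0.8)]
```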
Furthermore, it is entirely unclear to me how the following constructed header should be parsed:
text/html,application/xhtml+xml,application/xml;q=0.9,text/plain,image/png;q=0.95,text/html;q=0.5,image/jpeg;q=0.99,image*;q=1.5
Setting aside the obvious pointlessness of this header, it raises several problems:
- Should you consider it completely invalid because of its various imperfections, some of which outright violate the standard (q > 1 and q < 0 are not allowed)?
- For example, both MDN's Accept-Language page and RFC 7231 section 5.3.5 (Accept-Language) state that it might be acceptable to reject such a request with HTTP 406 Not Acceptable, but advise against doing so for usability reasons.
- Should you treat text/plain as q=1.0 because its quality is not specified, even though it sits between two elements that are not q=1.0? Or "should" you process the header with some kind of state machine that carries the previous value forward, so it becomes q=0.9? (One lenient interpretation of these cases is sketched after this list.)
- How should you respond to conflicting information (text/html is given both q=1.0 and q=0.5)? Should the earlier value be overwritten, or is it just "another" entry in the list, resulting in a duplicate?
- How does the qualifier affect the order of preference when the server could fully satisfy every element of the request, but the elements are provided in non-descending or even random order? Based on the resources mentioned so far, I would assume I am expected to sort descending by q-value. However, the second example on the MDN page leaves this open to debate.
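To make these questions concrete: building on the parse_simple sketch above, the most "natural" handling I can come up with is to clamp out-of-range q-values, let the last declaration of a media range win, and sort descending by q while keeping declaration order for ties. This is purely an assumption on my part, not something I have found documented anywhere.

```python
# One possible lenient normalisation (assumption, not documented behaviour):
# - out-of-range q values are clamped into [0, 1] instead of rejecting the header
# - a repeated media range overwrites the earlier declaration ("last wins")
# - the result is sorted descending by q; Python's sort is stable, so ties keep
#   their declaration order
def normalise(pairs):
    weights, order = {}, []
    for media_range, q in pairs:
        q = min(max(q, 0.0), 1.0)
        if media_range not in weights:
            order.append(media_range)
        weights[media_range] = q
    return sorted(order, key=lambda m: -weights[m])

header = ("text/html,application/xhtml+xml,application/xml;q=0.9,text/plain,"
          "image/png;q=0.95,text/html;q=0.5,image/jpeg;q=0.99,image*;q=1.5")
print(normalise(parse_simple(header)))
# -> ['application/xhtml+xml', 'text/plain', 'image*',
#     'image/jpeg', 'image/png', 'application/xml', 'text/html']
```

With this reading, the invalid image* range survives and text/html ends up last, which may or may not be what the client intended.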
That second example reads as follows:
text/html;q=0.8,text/*;q=0.8,*/*;q=0.8
which would expand to
| value | priority |
|---|---|
| text/html | 0.8 |
| text/* | 0.8 |
| */* | 0.8 |
In this example, every value has the same q-factor, which raises the question of whether the application "should" sort such lists from most specific to least specific, or whether they "should" be kept in declaration order. If specificity matters, what purpose would or could the qualifier serve, especially since, according to the declaration, every content type would be "accepted" anyway? I assume that, being listed on MDN, the example makes sense in some way and is not purely hypothetical, but I am uncertain in this case. Most examples out there simply sort by the qualifier, which would definitely result in unexpected behaviour in this scenario.
And since MIME types consist of a type and a subtype (RFC 2046), any reordering by specificity would have to consider at least two dimensions; this could lead to unexpected behaviour for permutations like
text/html;q=0.8,text/*;q=0.8,*/html;q=0.8,*/*;q=0.8
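To make the two dimensions concrete, this is the kind of specificity ranking I would have to invent as a tie-breaker for equal q-values. Again, this is purely my own assumption, not a rule I have found written down, and */html is only included to illustrate the permutation problem.

```python
# A possible tie-breaker for equal q-values: prefer more specific media ranges.
# The ranking itself is my own invention.
def specificity(media_range: str) -> int:
    type_, _, subtype = media_range.partition("/")
    if type_ != "*" and subtype != "*":
        return 3  # concrete type and subtype, e.g. text/html
    if type_ != "*":
        return 2  # wildcard subtype, e.g. text/*
    if subtype != "*":
        return 1  # wildcard type, e.g. */html (dubious, but syntactically parseable)
    return 0      # */*

ranges = ["*/*", "*/html", "text/*", "text/html"]
print(sorted(ranges, key=specificity, reverse=True))
# -> ['text/html', 'text/*', '*/html', '*/*']
```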
Also, it is unclear to me whether I should expect additional parameters on the elements, for example
text/html;charset=utf-8;q=0.8,text/plain;charset=utf-8;q=0.8,text/html;q=0.5
since the MDN page on quality values states that some headers, such as Accept, handle additional specifiers, resulting in definitions like text/html;level=1, without going into further detail. Considering the RFC 7231 ABNF definitions, this seems at least possible, which suggests that an application developer shouldn't rely on simply matching some kind of q=<floating point number [\.0-9]+> representation and treating everything else as a potential media range.
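What I mean is something like the following split, where the first q parameter is treated as the boundary between the media range's own parameters and everything that follows. That boundary is my reading of the RFC 7231 grammar, and the helper name is mine.

```python
# Split an Accept element into media range, media type parameters, weight and
# whatever comes after q. Treating the first "q" parameter as a boundary is my
# interpretation of the RFC 7231 ABNF, not a library behaviour.
def split_element(element: str):
    parts = [p.strip() for p in element.split(";")]
    media_range = parts[0]
    params, q, extensions, seen_q = {}, 1.0, {}, False
    for p in parts[1:]:
        name, _, value = p.partition("=")
        if not seen_q and name.lower() == "q":
            q, seen_q = float(value), True
        elif seen_q:
            extensions[name] = value   # accept-ext parameters after the weight
        else:
            params[name] = value       # media type parameters before the weight
    return media_range, params, q, extensions

print(split_element("text/html;charset=utf-8;q=0.8"))
# -> ('text/html', {'charset': 'utf-8'}, 0.8, {})
```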
There may be solutions to all these hypothetical problems that feel "natural" in some way; however, I am unable to find any reliable sources confirming them.
All of this leads to the following question:
Is there even an intended "right" way, or is this left open to the application developer? If there is, how should an application server react to the scenarios above, and why / where is this documented?
As far as I can tell from my research so far, the topic is covered rather sparsely in the RFCs as well as in the browser documentation, which might suggest it is intended to be more of a tool than a strict ruleset. But I am unsure whether I am missing something "self-evident", as I am not aware of every RFC ever published.
As far as I can tell, the RFC and the MDN example are consistent, no?
Regarding your example:
This parses into
| value | priority |
|---|---|
| text/html | 1.0 |
| application/xhtml+xml | 1.0 |
| application/xml | 0.9 |
| text/plain | 1.0 |
| image/png | 0.95 |
| text/html | 0.5 |
| image/jpeg | 0.99 |
| image* | 1.5 |
where the last element is an invalid media range. It's up to you whether you want to ignore just that element or the complete header field.
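If you decide to ignore just the offending element, a minimal sketch of that approach could look like the following. The validity check is deliberately simplistic (roughly the RFC 7230 token characters around a mandatory slash), and all names are my own.

```python
import re

# Very rough media-range check: type "/" subtype (or wildcards), nothing more.
MEDIA_RANGE = re.compile(
    r"^(\*/\*|[!#$%&'*+.^_`|~0-9A-Za-z-]+/(\*|[!#$%&'*+.^_`|~0-9A-Za-z-]+))$"
)

def parse_dropping_invalid(accept_header: str):
    result = []
    for member in accept_header.split(","):
        parts = [p.strip() for p in member.split(";")]
        media_range = parts[0]
        q = 1.0
        for p in parts[1:]:
            if p.lower().startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = -1.0  # mark the element as invalid
        if MEDIA_RANGE.match(media_range) and 0.0 <= q <= 1.0:
            result.append((media_range, q))
        # otherwise: drop only this element and keep the rest of the header
    return result

header = ("text/html,application/xhtml+xml,application/xml;q=0.9,text/plain,"
          "image/png;q=0.95,text/html;q=0.5,image/jpeg;q=0.99,image*;q=1.5")
print(parse_dropping_invalid(header))  # the trailing image* element is dropped
```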