Stream: implementers
Topic: Charset specification
Josh Mandel (Apr 13 2018 at 20:55):
Our spec says that clients MUST include `; charset=UTF-8` as part of a content type header when posting data -- but reference servers are happy to accept `Content-Type: application/fhir+json` without the charset specifier. Should our servers be stricter, or should the spec be looser? Or do we like the status quo?
Grahame Grieve (Apr 13 2018 at 21:01):
validating servers should. Not so sure about reference servers
Alexander Henket (Apr 13 2018 at 21:03):
There is a tension between HTTP default behavior (iso-8859-1) and FHIR (everything utf-8). When you say application/fhir+... you sort of imply that content shall be utf-8 regardless of HTTP?
Luis Maas (Apr 13 2018 at 21:47):
The ISO-8859-1 fallback mentioned in RFC 2616 for HTTP 1.1 only applied to "text/..." MIME types, so defaulting to ISO-8859-1 shouldn't have been applicable to "application/..." types like fhir+xml. Also, that fallback has since been explicitly removed, ref: RFC 7231 Appendix B:
| The default charset of ISO-8859-1 for text media types has been
| removed; the default is now whatever the media type definition says.
| Likewise, special treatment of ISO-8859-1 has been removed from the
| Accept-Charset header field.
Luis Maas (Apr 13 2018 at 21:57):
re: "the default is whatever the media type definition says"...
The application/fhir+json and application/fhir+xml definitions both state that if the charset parameter is present, its value must be UTF-8.
This isn't quite the same as saying UTF-8 is the only allowable character encoding, so it might be worth updating the MIME-type registry to include something to that effect, e.g. UTF-8 is also the default if no charset is specified.
If interested, the definitions are here: https://www.iana.org/assignments/media-types/application/fhir+xml and https://www.iana.org/assignments/media-types/application/fhir+json
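To make the two options in this thread concrete, here is a rough sketch of how a server might check the header. It is not taken from any reference server; the function name `check_fhir_content_type` and the `strict` flag are hypothetical. With `strict=True` the charset parameter is required, as the spec text currently says; with `strict=False` a missing charset is treated as UTF-8, which is the default suggested above. In both modes a charset that is present must be UTF-8, matching the media type definitions.

```python
# Hypothetical helper, for illustration only.
FHIR_TYPES = {"application/fhir+json", "application/fhir+xml"}


def check_fhir_content_type(header: str, strict: bool = False) -> bool:
    """Return True if the Content-Type header value is acceptable."""
    media_type, _, rest = header.partition(";")
    if media_type.strip().lower() not in FHIR_TYPES:
        return False

    # Look for a charset parameter among the ';'-separated parameters.
    charset = None
    for param in rest.split(";"):
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            charset = value.strip().strip('"').lower()

    if charset is None:
        # Lenient mode assumes UTF-8 when no charset is given;
        # strict mode rejects the request.
        return not strict
    return charset == "utf-8"
```

For example, `check_fhir_content_type("application/fhir+json")` is accepted in lenient mode and rejected with `strict=True`, while `application/fhir+xml; charset=ISO-8859-1` is rejected in both modes.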
Last updated: Apr 12 2022 at 19:14 UTC