Stream: implementers
Topic: Character set in URLs
Grahame Grieve (May 03 2016 at 05:55):
several implementers have asked us about the use of non-ASCII characters in URLs when making FHIR requests
Grahame Grieve (May 03 2016 at 05:56):
as background, we reference RFC 3986 for the URI specification. It defines percent-encoding (% encoding), but it does not fix which character encoding the percent-encoded bytes represent. It says that specific protocols can fix it.
Grahame Grieve (May 03 2016 at 05:56):
RFC 3987 fixes it to UTF-8 for IRIs, but IRIs are not URIs.
Grahame Grieve (May 03 2016 at 05:57):
there is no official rule in the HTTP specification - *it is not fixed*
Grahame Grieve (May 03 2016 at 05:57):
browser behaviour varies; newer browsers use UTF-8, older ones (like IE11) use ISO-8859-1 (we think)
Grahame Grieve (May 03 2016 at 05:58):
browsers tend to follow the resource encoding from the server, if they have seen that
Grahame Grieve (May 03 2016 at 05:58):
for more information, see here: http://stackoverflow.com/questions/912811/what-is-the-proper-way-to-url-encode-unicode-characters
Grahame Grieve (May 03 2016 at 05:59):
given that all resources are UTF-8, we believe that for the purposes of FHIR, clients SHOULD use UTF-8 when percent-encoding, and servers SHOULD interpret percent-encoded bytes as UTF-8.
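A minimal sketch of what that recommendation means in practice, using Python's standard `urllib.parse`. The helper names `encode_value` and `decode_value` and the sample name "Müller" are made up for illustration; they are not part of any FHIR library.

```python
from urllib.parse import quote, unquote


def encode_value(value: str) -> str:
    """Client side (hypothetical helper): percent-encode using UTF-8 bytes."""
    return quote(value, safe="", encoding="utf-8")


def decode_value(raw: str) -> str:
    """Server side (hypothetical helper): interpret percent-encoded bytes as UTF-8.

    errors="strict" raises UnicodeDecodeError on bytes that are not valid UTF-8,
    instead of silently substituting replacement characters.
    """
    return unquote(raw, encoding="utf-8", errors="strict")


encoded = encode_value("Müller")   # ü becomes the two UTF-8 bytes C3 BC
decoded = decode_value(encoded)    # round-trips to the original string

# The same string percent-encoded as ISO-8859-1 is a single byte, FC.
latin1_encoded = quote("Müller", safe="", encoding="latin-1")

# Decoding UTF-8 bytes as ISO-8859-1 does not fail; it produces mojibake.
misread = unquote(encoded, encoding="latin-1")
```

When both sides agree on UTF-8 the value round-trips. When the server decodes as ISO-8859-1 it gets two wrong characters in place of ü rather than an error, which is why a mismatch is easy to miss.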
Grahame Grieve (May 03 2016 at 05:59):
I am updating my server now to fix the decoding to use UTF-8, not ISO-8859-1.
Grahame Grieve (May 03 2016 at 06:00):
please let us know if you have a client or server for which you cannot ensure that UTF-8 is in use (if you use extended characters)
Grahame Grieve (May 03 2016 at 06:07):
gForge task: http://gforge.hl7.org/gf/project/fhir/tracker/?action=TrackerItemEdit&tracker_item_id=9949
Grahame Grieve (May 03 2016 at 06:17):
btw: Unicode characters will only arise in the parameter values when searching
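As a rough illustration of where this shows up, here is a sketch of building and parsing a search URL with a non-ASCII parameter value, again with Python's `urllib.parse`. The base URL `http://example.org/fhir` and the helpers `search_url` and `read_params` are assumptions for the example only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical server base URL, used only for illustration.
BASE = "http://example.org/fhir"


def search_url(resource_type: str, params: dict) -> str:
    """Build a search URL; parameter values are percent-encoded as UTF-8."""
    return f"{BASE}/{resource_type}?{urlencode(params, encoding='utf-8')}"


def read_params(url: str) -> dict:
    """Parse the query string, interpreting percent-encoded bytes as UTF-8."""
    return parse_qs(urlsplit(url).query, encoding="utf-8", errors="strict")


url = search_url("Patient", {"family": "Müller"})
params = read_params(url)
```

Only the query value carries percent-encoded bytes here; the rest of the URL is plain ASCII.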
Last updated: Apr 12 2022 at 19:14 UTC