FHIR Chat · URL Encoding Question · implementers

Stream: implementers

Topic: URL Encoding Question


Justin Stauffer (Jun 28 2018 at 20:08):

I have a question about percent-encoding special characters in the URL of a FHIR request. For example, let's say I'm performing a search on the Condition resource and I want to include Conditions with a clinical-status of "active" or "resolved". Are these two URLs equivalent: "GET [base]/Condition?clinical-status=active,resolved" and "GET [base]/Condition?clinical-status=active%2Cresolved"? That is, should the %2C be treated exactly the same as an unencoded comma?

Looking at the "escaping" portion of the search spec, it isn't entirely clear to me (https://www.hl7.org/fhir/search.html#escaping). The examples there imply the two are equivalent but I am not sure if the examples are normative.
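
As a rough illustration of the question, here is a minimal Python sketch showing how one generic query-string parser (the standard library's `urllib.parse.parse_qs`) treats the two forms. It only shows what this particular parser does after percent-decoding; it is not a statement of what the FHIR spec or RFC 3986 requires, and the helper name `decoded_query` is made up for the example.

```python
from urllib.parse import parse_qs

# The two query strings from the question (everything after the "?").
plain = "clinical-status=active,resolved"
encoded = "clinical-status=active%2Cresolved"


def decoded_query(query: str) -> dict:
    """Parse a query string into {name: [values]}, percent-decoding as we go."""
    return parse_qs(query)


# After percent-decoding, this parser hands the application the same
# parameter value for both forms; the comma has not been interpreted yet.
print(decoded_query(plain))
print(decoded_query(encoded))
print(decoded_query(plain) == decoded_query(encoded))
```

Whether the server then treats that comma as an OR separator is a separate, FHIR-level decision made on the decoded value.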

Grahame Grieve (Jun 28 2018 at 22:11):

we don't say much about this because HTTP rules apply, and under those rules, the URLs are equivalent

Justin Stauffer (Jun 29 2018 at 14:24):

OK, sounds good. Thanks for clarifying.

Justin Stauffer (Jun 29 2018 at 15:24):

Actually, hang on a second... We looked into this further and we're not sure that HTTP rules say the URLs are equivalent. Specifically, RFC 3986 section 2.2 (https://tools.ietf.org/html/rfc3986#section-2.2) defines the set of reserved characters to include "," and then states:

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

That sounds like a URL containing a literal "," is not equivalent to the same URL with the comma percent-encoded as %2C.
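
To make the RFC's point concrete, here is a small Python sketch of percent-encoding normalization in the spirit of RFC 3986 section 6.2.2: hex digits are upper-cased and only percent-encoded unreserved characters are decoded, so a reserved character such as "," keeps whichever form it arrived in. The function name `normalize_percent_encoding` is made up for this example, and the sketch covers only this one normalization step, not full URI comparison.

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_percent_encoding(uri: str) -> str:
    """Decode %XX only when it encodes an unreserved character;
    otherwise keep the escape and upper-case its hex digits."""

    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", replace, uri)


plain = "/Condition?clinical-status=active,resolved"
encoded = "/Condition?clinical-status=active%2Cresolved"

# "%2C" encodes a reserved character, so normalization leaves it alone
# and the two strings still differ when compared as URIs.
print(normalize_percent_encoding(encoded))
print(normalize_percent_encoding(plain) == normalize_percent_encoding(encoded))
```

This is URI-level equivalence only; it says nothing about whether a server may return the same result for both requests.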

Grahame Grieve (Jun 29 2018 at 19:06):

I was too tired... we define that those URLs are equivalent in that section (see last example)

Justin Stauffer (Jun 29 2018 at 20:08):

Is that permissible for us (HL7/FHIR) to override the URI RFC?

Grahame Grieve (Jun 29 2018 at 20:32):

I don't believe that we override anything; just made some scheme specific use notes

Justin Stauffer (Jul 02 2018 at 13:58):

It just seems that the URI RFC (3986) is somewhat clear in saying that reserved characters that are percent-encoded are not equivalent to the reserved character itself. Then we're saying "Yes they are equivalent." How is that not overriding the rule in the RFC?

Grahame Grieve (Jul 02 2018 at 15:29):

um, we're not. We're concerned with the interpretation of escaped characters - they're escaped either way, but we have a second level of escaping

Justin Stauffer (Jul 09 2018 at 15:14):

Hi Grahame, sorry I'm not quite able to parse what you're saying in that last response. Do we agree that the URI RFC (3986) rule I quoted above means that these URLs are not equivalent (solely according to that RFC, not our FHIR specification)?: "GET [base]/Condition?clinical-status=active,resolved" and "GET [base]/Condition?clinical-status=active%2Cresolved"

Justin Stauffer (Jul 09 2018 at 15:23):

Also, can we be clear on the terminology we're using here and say "percent-encoded", as RFC 3986 does, when we're talking about changing a comma to %2C? "Escaping" as defined in the FHIR specification deals with prefixing special characters with the '\' character. I'm not really talking about "escaping" here -- there just happens to be a mention of percent-encoding in that portion of the FHIR spec, and I feel like that's causing some confusion.

Grahame Grieve (Jul 16 2018 at 20:51):

GF#17464 - @Ewout Kramer prompt for you

Grahame Grieve (Jul 16 2018 at 20:54):

it would help me understand the problem here better if RFC 3986 defined what 'equivalence' means in that context. I presume we should not read it as saying that two URIs that are not equivalent cannot return the same outcome

Ewout Kramer (Jul 17 2018 at 08:51):

I think we are confusing two types of escaping. One is the percent-encoding described in RFC 3986, used to turn FHIR search parameters and values into URLs; the other is defined within the FHIR spec to distinguish between two possible readings of an otherwise ambiguous value. Section https://www.hl7.org/fhir/search.html#escaping is about the latter:

  • GET [base]/Condition?clinical-status=active,resolved means get me conditions with the clinical-status "active" OR the clinical status "resolved"
  • GET [base]/Condition?clinical-status=active\,resolved means get me conditions with the clinical-status "active,resolved" (so, the status includes a comma as a character - uncommon but possible).

Of course, you may still need RFC 3986 percent-encoding to carry the characters of the FHIR search strings shown above in a URL, but that happens at the HTTP/interchange level.

To put it differently: even if we switched from HTTP (and URL encoding) to another transport mechanism, we would still need the distinction between the search strings "clinical-status=active,resolved" and "clinical-status=active\,resolved".
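
The two layers described above can be sketched in Python: first undo the transport-level percent-encoding, then apply the FHIR-level rule that an unescaped comma separates OR values while "\," is a literal comma. This is a minimal sketch, not a reference implementation; the function name `split_or_values` is made up, and treating a backslash before any character as "take the next character literally" is a simplifying assumption (the spec section discusses escaping for `$`, `,`, `|` and `\`).

```python
from typing import List
from urllib.parse import unquote


def split_or_values(value: str) -> List[str]:
    """Split an already percent-decoded search value on unescaped commas,
    removing the backslash from escaped characters."""
    parts: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(value):
        char = value[i]
        if char == "\\" and i + 1 < len(value):
            # Escaped character: keep it literally and skip the backslash.
            current.append(value[i + 1])
            i += 2
        elif char == ",":
            # Unescaped comma: end of one OR alternative.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(char)
            i += 1
    parts.append("".join(current))
    return parts


# Layer 1 (transport): percent-decoding. Layer 2 (FHIR): backslash escaping.
print(split_or_values(unquote("active,resolved")))         # two OR values
print(split_or_values(unquote("active%2Cresolved")))       # also two OR values
print(split_or_values(unquote("active%5C%2Cresolved")))    # one value with a comma
```

In this sketch the percent-encoded and literal commas behave identically because decoding happens before the FHIR-level split; only the backslash changes the meaning.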

Justin Stauffer (Jul 19 2018 at 20:45):

@Grahame Grieve I agree. :) @Ewout Kramer thanks for the clarification. It would be nice if the FHIR specification explained this more solidly. The way the https://www.hl7.org/fhir/search.html#escaping section is right now, this logic is implied instead of spelled out in a clear manner (at least for me so maybe that's my fault).

Grahame Grieve (Jul 19 2018 at 22:37):

ok I updated the task with a pointer to this thread and a note to add more explanation

Grahame Grieve (Aug 07 2018 at 22:54):

ok updated the section in question at build.fhir.org/search.html#escaping


Last updated: Apr 12 2022 at 19:14 UTC