FHIR Chat · UUID to lower-case() · implementers

Stream: implementers

Topic: UUID to lower-case()


Alexander Henket (Jan 19 2018 at 10:40):

In HL7 version 3 datatypes 1 and 2 there was a requirement to send upper-case UUIDs. In FHIR this requirement has shifted to lower-case UUIDs. The UUID 'standard' marks UUIDs as case insensitive.

The HL7 V3 and FHIR requirements to do case conversion both represent a problem. The requirement for a UUID should have been that once it is created, it shall be an immutable string and shall not be converted to upper or lower in any way.

I'm writing this from real-world experience, where all communication with our national "CDC"-like institute is broken because of the case conversion requirements for UUIDs.

What were the reasons behind specifying V3 and FHIR this way? @Grahame Grieve / @Lloyd McKenzie / @Marc de Graauw

Grahame Grieve (Jan 19 2018 at 10:44):

it's a running problem in v3 too. But we inherited lowercase from RFC 4122 (https://tools.ietf.org/html/rfc4122), which says: "The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input"

Alexander Henket (Jan 19 2018 at 10:47):

Just below there it also says:

The formal definition of the UUID string representation is
provided by the following ABNF [7]:

  UUID                   = time-low "-" time-mid "-"
                           time-high-and-version "-"
                           clock-seq-and-reserved
                           clock-seq-low "-" node
  time-low               = 4hexOctet
  time-mid               = 2hexOctet
  time-high-and-version  = 2hexOctet
  clock-seq-and-reserved = hexOctet
  clock-seq-low          = hexOctet
  node                   = 6hexOctet
  hexOctet               = hexDigit hexDigit
  hexDigit =
        "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
        "a" / "b" / "c" / "d" / "e" / "f" /
        "A" / "B" / "C" / "D" / "E" / "F"

Alexander Henket (Jan 19 2018 at 10:48):

That appears to include upper case A-F too right?
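
A minimal sketch of the quoted ABNF as a Python regular expression, to make the point concrete: the grammar groups the hex digits as 8-4-4-4-12 and its hexDigit rule lists both a-f and A-F. The names UUID_ABNF and matches_rfc4122_grammar are made up for this illustration and do not come from the RFC or from FHIR.

```python
import re

# Hex digit groups of 8-4-4-4-12, following the ABNF above:
# time-low (4 octets), time-mid (2), time-high-and-version (2),
# clock-seq-and-reserved + clock-seq-low (1 + 1), node (6).
# hexDigit lists both "a"-"f" and "A"-"F", so both cases are allowed here.
UUID_ABNF = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def matches_rfc4122_grammar(text: str) -> bool:
    """Return True if text matches the ABNF string form, in any letter case."""
    return UUID_ABNF.fullmatch(text) is not None
```

Under this reading of the grammar, an all-upper-case, all-lower-case, or mixed-case string is accepted, which is the mismatch with the "output as lower case" sentence being discussed.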

Grahame Grieve (Jan 19 2018 at 10:49):

it does. I don't know why since it clearly says to use lower case chars

Alexander Henket (Jan 19 2018 at 10:52):

I would propose not to impose any case conversion in FHIR (or V3, but that ship has sailed). If real-world databases generate A-F and a-f, then we break those, and that just seems wrong.

Grahame Grieve (Jan 19 2018 at 10:53):

I think the ship has sailed in fhir too. but what does "case insensitive on input" mean?

Grahame Grieve (Jan 19 2018 at 10:53):

are there real world databases that generate invalid urns?

Alexander Henket (Jan 19 2018 at 10:54):

No idea. I only use the result of the input, i.e. the UUID. The input (I think) is all the things you consider in generating the UUID?

Grahame Grieve (Jan 19 2018 at 10:54):

also: "The requirement for a UUID should have been that once it is created, it shall be an immutable string and shall not be converted to upper or lower in any way." - where do you get that from? The rules are more complex than that - in a given URI, some parts may or may not be case sensitive

Alexander Henket (Jan 19 2018 at 10:54):

There are real world databases that generate UUIDs. If you convert them to lower-case/upper-case, there is no way to get back to the original UUID

Alexander Henket (Jan 19 2018 at 10:55):

If you cannot get back to the original UUID, you get mismatches, or really inefficient matching

Grahame Grieve (Jan 19 2018 at 10:55):

I think the point is, they are case insensitive, and should be treated accordingly; that's how I understand the rfc: write lower case, accept either case
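
A rough sketch of that "write lower case, accept either case" reading, using Python's standard uuid module. The helper names read_uuid and write_uuid_urn are hypothetical, and this is only one way to apply the rule, not something the RFC or FHIR prescribes.

```python
import uuid


def read_uuid(text: str) -> uuid.UUID:
    """Accept either case on input; uuid.UUID parses upper, lower, or mixed hex."""
    return uuid.UUID(text)


def write_uuid_urn(value: uuid.UUID) -> str:
    """Write lower case on output; the urn attribute uses the lower-case hex form."""
    return value.urn
```

For example, write_uuid_urn(read_uuid("F81D4FAE-7DEC-11D0-A765-00A0C91E6BF6")) gives "urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6". Note that the original casing is not preserved, which is exactly the round-trip concern raised earlier in the thread.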

Alexander Henket (Jan 19 2018 at 10:56):

That sounds to me as if the urn is de facto case insensitive too and that there is no need for FHIR to require lower-case

Grahame Grieve (Jan 19 2018 at 10:57):

I don't follow that.

Alexander Henket (Jan 19 2018 at 11:01):

Oracle/Java supports both cases https://docs.oracle.com/javase/8/docs/api/java/util/UUID.html (see method toString)

Alexander Henket (Jan 19 2018 at 11:04):

I'm just saying that FHIR should not impose the lower-case (or upper-case). I do not read that requirement in the RFC either. Any UUID can have both upper and lower case letters and that does not change when you communicate it in a urn.

Alexander Henket (Jan 19 2018 at 11:07):

Breaking my argument somewhat I also found this link: https://stackoverflow.com/questions/34585957/postgresql-9-3-how-to-insert-upper-case-uuid-into-table#34586013

John Moehrke (Jan 19 2018 at 13:12):

forced convert to urn:oid.. joking...

John Moehrke (Jan 19 2018 at 13:15):

Given that everyone must process a UUID in case insensitive ways... therefore there is no advantage or need to force a conversion of case... and that forcing, as Alex points out well, is problematic. I would be against a requirement to force the case.

Ewout Kramer (Jan 19 2018 at 13:19):

"A universally unique identifier (UUID) is a 128-bit number (...)"

So, I think equality of UUIDs is based on their 128-bit value, and differences in the (canonical) serialization format are immaterial. This also means you are not "changing" the UUID in any way when you change lower to upper case. And we could still tell everyone that in FHIR, we want you to send it in lowercase.
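
A small illustration of that distinction, again assuming Python's standard uuid module (the variable names are made up, and the sample value is only an example): parsed UUIDs compare by their 128-bit value, while the raw strings compare character by character.

```python
import uuid

upper_text = "F81D4FAE-7DEC-11D0-A765-00A0C91E6BF6"
lower_text = "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"

# Parsed values: both strings denote the same 128-bit number.
same_value = uuid.UUID(upper_text).int == uuid.UUID(lower_text).int

# Raw strings: a plain string comparison treats them as different.
same_text = upper_text == lower_text

print(same_value, same_text)
```

A system that stores and matches the serialized string rather than the parsed value will see two different identifiers here, which is the failure mode described at the start of the thread.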

John Moehrke (Jan 19 2018 at 13:44):

I agree on the equality (hence why urn:oid conversion is also acceptable)... However by forcing a case change, you are forcing unnecessary work; and in some cases creating a failure-mode. What is the advantage in FHIR for forcing lowercase?

Grahame Grieve (Jan 20 2018 at 05:21):

my read of rfc 4122 is that it requires lower case: "The hexadecimal values "a" through "f" are output as lower case characters". That's why we specify lower case. It's true that the grammar allows uppercase, but the text doesn't.


Last updated: Apr 12 2022 at 19:14 UTC