Stream: terminology

Topic: Can you have | in the code?


view this post on Zulip Yunwei Wang (Jul 18 2017 at 20:18):

According to the code data type definition, the | character is allowed in a code. But the token search parameter uses | as the separator between the system and the code. So won't that make it difficult for a server to decide what a given | character means?

view this post on Zulip Grahame Grieve (Jul 19 2017 at 20:41):

yes that would cause difficulty, for sure.

view this post on Zulip Yunwei Wang (Jul 19 2017 at 20:42):

Should I create a tracker?

view this post on Zulip Grahame Grieve (Jul 19 2017 at 20:43):

it's not completely unresolvable, in that what's on the left of the first | must be a URI if the first | separates the system. It's pretty unlikely that a real code will have a | with a real URI on the left...
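A minimal Python sketch of the heuristic described above, for illustration only: split on the first |, and treat the left side as the system only if it is empty or starts with a URI scheme. The function name `split_token` and the scheme regex are assumptions made for this example, not something defined by the FHIR specification.

```python
import re

# A URI scheme followed by ':' (e.g. "http:", "urn:"). This is a heuristic:
# it only checks that the left side *looks* like an absolute URI.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def split_token(value):
    """Split a token search value into (system, code).

    system is None when the value is treated as a bare code, and ""
    when the value starts with "|" (code with no system).
    """
    left, sep, right = value.partition("|")
    # Only the FIRST "|" is a candidate separator; any later "|" stays in the code.
    if sep and (left == "" or _SCHEME.match(left)):
        return left, right
    # No "|" at all, or the left side is not URI-like: the whole value is the code.
    return None, value


print(split_token("http://snomed.info/sct|22298006"))
print(split_token("128045006|Cellulitis|"))
```

As noted in the message above, this is only "unlikely" to fail, not guaranteed: a code whose text before its first | happens to look like a URI would still be split in the wrong place.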

view this post on Zulip Grahame Grieve (Jul 19 2017 at 20:43):

not that we really like 'unlikely'

view this post on Zulip Grahame Grieve (Jul 19 2017 at 20:43):

you can create a tracker, but what would we do with it? I don't immediately see an obvious fix, other than to document the issue

view this post on Zulip Lloyd McKenzie (Jul 20 2017 at 02:10):

The problem is that HL7 doesn't have authority over the world's code systems.

view this post on Zulip Lloyd McKenzie (Jul 20 2017 at 02:11):

Though we have actually imposed other constraints on code - such as how much whitespace they contain, so in principle, we could make other constraints too.

view this post on Zulip Rob Hausam (Jul 20 2017 at 02:30):

A pair of '|' characters is used to delineate the term in SNOMED CT Compositional Grammar, which certainly can be used in 'code' - I doubt that we realistically can or would want to constrain that out. Likely we can normally recognize and separate the patterns within the string, as Grahame suggests, but I also agree that may not be sufficient. I think we need to find a more reliable solution, but I'll have to give that some further thought.

view this post on Zulip Grahame Grieve (Jul 21 2017 at 11:11):

we do not allow the term in the expression for snomed codes because we do not allow codes to contain spaces

view this post on Zulip Lloyd McKenzie (Jul 21 2017 at 20:07):

Actually, we do allow spaces in codes (just not consecutive ones). We prohibit terms in SNOMED codes because they break computational interoperability and because FHIR has a different place to put the human-readable component.

view this post on Zulip Peter Jordan (Jul 22 2017 at 00:34):

The table at Section 4.2.1.0.1 of the R3 Spec http://hl7.org/fhir/snomedct.html lists the SCT artifacts that can be placed in a code element - these are Concept IDs, Expressions, Compositional Grammar and Legacy Codes. It explicitly states that expressions should not contain Terms (which go in a Description ID extension). However, the example of a Query Expression used as a filter in Section 4.2.1.0.6.3 suggests that an expression using a term is permissible in that situation; although that may be just to make the example (more) readable and it's probably not good practice to bloat a request with terms.

view this post on Zulip Lloyd McKenzie (Jul 22 2017 at 21:03):

The issue is less with bloat and more with systems that don't parse the expressions and do simple string-matches.

view this post on Zulip Lloyd McKenzie (Jul 22 2017 at 21:04):

Probably the majority of systems that process SNOMED codes won't parse them and will treat them the way most systems treat UCUM codes.

view this post on Zulip Peter Jordan (Jul 23 2017 at 01:52):

Not sure that I follow your reasoning with regard to filtering implicit SCT value sets by using a query expression. This is likely to be a request issued to a Terminology Server which may well know how to parse such an expression, with or without terms. As for placing (post-coordinated) SCT expressions in codes, one would hope that both parties in the relevant exchange have indicated that they have the capability to process them!

view this post on Zulip Lloyd McKenzie (Jul 23 2017 at 14:43):

How many systems use UCUM? And what percentage of those systems parse "g/mL" into its constituent parts vs. just looking up that it's a legal value in a table (and perhaps determining a conversion factor)? There will be a lot of systems (probably a majority of systems) that will work with SNOMED the same way. They'll treat SNOMED codes as a string and will treat code matching as a string matching exercise. Including display terms in the expression will cause that string matching to fail.
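The failure mode described above can be shown with a small Python sketch. The lookup table and the function `is_known` are hypothetical, standing in for a system that treats codes as opaque strings.

```python
# Hypothetical table keyed by the exact code string, as a system that
# does not parse SNOMED CT expressions might keep it.
KNOWN_CODES = {"22298006": "Myocardial infarction"}


def is_known(code):
    """Exact string match; no parsing of the code is attempted."""
    return code in KNOWN_CODES


plain = "22298006"
with_term = "22298006|Myocardial infarction|"

print(is_known(plain))      # True
print(is_known(with_term))  # False: same concept, but the string differs
```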

view this post on Zulip Peter Jordan (Jul 23 2017 at 23:29):

I'd still maintain that the real differentiator is whether systems can handle SCT expressions, per se, in code elements. That's certainly one of the major challenges for SCT implementations here in NZ - very few RDBMS are currently designed to persist lengthy expressions in code fields. No problem with not permitting terms in SCT expressions placed in FHIR coded elements - but the broader concern is whether a receiving system can handle anything other than an individual code.

view this post on Zulip Lloyd McKenzie (Jul 23 2017 at 23:48):

That's really about length limits. The more qualifiers you have, the more likely to hit the length limits systems allow for codes. (If you're over 255 characters, you're out of luck pretty much everywhere). But if you're approaching that length and you're not putting the display name in the code, then you've probably gone way over the top in terms of post-coordination.

view this post on Zulip Rob Hausam (Jul 24 2017 at 16:05):

The actual documentation is not as restrictive as what we've been saying here. The summary row for Code in table 4.2.1.0.1 that @Peter Jordan referred to has the link to the SNOMED CT Compositional Grammar Specification and states that it is a valid artifact for use in the 'code' element, and the CG specification document explicitly DOES include the |xxx| syntax for inclusion of terms within the expression (as I mentioned before). Plus, the last sentence in that row does say "Expressions SHOULD NOT contain terms" - but if we're interpreting that as formal conformance language (which by using all caps appears to be the intent) it says "SHOULD NOT" rather than "SHALL NOT", so that means that we would prefer that you don't include terms in SNOMED CT expressions that are used to populate the 'code' element, but you are able to do that if you want to (for whatever reason) and will still be conformant.

view this post on Zulip Lloyd McKenzie (Jul 24 2017 at 20:23):

I'd definitely prefer "SHALL NOT" - including the display value causes grief. With UCUM, we say that implementers SHALL use the case-sensitive variant, and I'd say that forcing the stripping of display names here is a similar sort of imposition - something that won't impact many and is easily automatable by sending systems.

view this post on Zulip Rob Hausam (Jul 24 2017 at 22:24):

I sympathize with Lloyd's concerns, but I'm still not sure if we want to dictate and restrict what would be otherwise perfectly valid content and behavior with a published SNOMED CT specification which we state that we support (and I think that would apply similarly with any other code system and specified usages). If you are attempting to do string matching with SNOMED CT compositional expressions (which is probably generally a rather bad idea), even without the display terms you may have to do some kind of normalization of the strings to at least deal with variations in white space, etc. (not to mention ordering, which probably makes the idea actually intractable). If you do that, you should also be able at the same time to strip out the '|' characters and the content between them. Interestingly, the "white space" rule that Lloyd is referring to only applies to the 'code' data type, as far as I can tell. There we state that "Technically, a code is restricted to a string which has at least one character and no leading or trailing whitespace, and where there is no whitespace other than single spaces in the contents". But for Coding (and therefore also CodeableConcept) it doesn't mention whitespace and says that "If present, the code SHALL be a syntactically correct symbol as defined by the system. In some code systems such as SNOMED CT, the symbol may be an expression composed of other predefined symbol (e.g. post-coordination). Note that codes are case sensitive unless specified otherwise by the code system." For the UCUM case, I'm not sure that there is any simple "filter" or transform between the case-sensitive and case-insensitive symbols, so that would be potentially even more of an interoperability issue and it does seem to make sense to explicitly specify the case-sensitive symbols. We do state that "Comparison between codes is always case sensitive", but even there we still add the caveat of "unless the codes are selected by reference (e.g. ValueSet.compose), and the referenced specification clearly states otherwise". So, getting back to SNOMED CT, I think it would probably be best to leave the specification as it currently is and allow the presence of terms delimited by '|' characters within valid SNOMED CT expressions to be used in Coding.code if that's what people want to send.
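The whitespace rule quoted in the message above can be checked with a short Python sketch. The regular expression here is written directly from the quoted sentence and is an assumption of this example; it is not copied from the specification. Note that the rule says nothing about |, so a code containing | passes.

```python
import re

# One or more non-whitespace runs separated by exactly one space:
# at least one character, no leading/trailing whitespace, and no
# whitespace other than single spaces.
_CODE = re.compile(r"\S+( \S+)*")


def is_valid_code(value):
    """Return True if value satisfies the quoted whitespace rule."""
    return _CODE.fullmatch(value) is not None


print(is_valid_code("a|b"))   # True: "|" is not restricted by this rule
print(is_valid_code("a  b"))  # False: two consecutive spaces
```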

view this post on Zulip Lloyd McKenzie (Jul 24 2017 at 23:08):

Coding.code has type code, so the same rules apply.

view this post on Zulip Lloyd McKenzie (Jul 24 2017 at 23:09):

Our purpose is to define rules for interoperability. We define rules for how social security numbers are to be transmitted. I don't see anything different about setting rules for the transmission of SNOMED codes. I agree that defining canonicalization for post-coordinated expressions would be useful, though that gets tricky as it could potentially change the way the display value renders.

view this post on Zulip Lloyd McKenzie (Jul 24 2017 at 23:10):

(All the more reason to send CodeableConcept.text...)

view this post on Zulip Rob Hausam (Jul 25 2017 at 00:43):

Yes, it's obviously true that Coding.code has type code and that the same whitespace rule would apply - I should have thought of that. But I don't think it actually affects the rest of the discussion, as that doesn't involve whitespace. I agree that we can set rules for transmission of SNOMED CT codes - we just need to decide if we need more or different rules than we already have. The other issue to consider is what is the 'display' for a SNOMED CT compositional expression? The spec says "SNOMED CT does not define displays for expressions; if no display has been associated with the expression through a value set or other mechanism, the full expression syntax with preferred terms embedded may be used". So in the latter case if we have made the "no terms" rule for 'code' a SHALL and we still want to provide the 'display' for a compositional expression that isn't associated with a display string through a value set or otherwise, then we will probably end up having the "full" version of the compositional expression with both the codes and terms in 'display' and then will also have the more stripped down version with only the codes in 'code'. That is certainly doable, but it seems rather redundant and probably unnecessary to me.

view this post on Zulip Lloyd McKenzie (Jul 25 2017 at 02:08):

The stripped down version is necessary for computation. Whether a display is set I'm not terribly fussed about - though including text is definitely a good idea.

view this post on Zulip Rob Hausam (Jul 25 2017 at 02:45):

I agree it's necessary for computation - the question is when and where it can or must be done. I'm not sure how much more (if any) actual interoperability we would gain by trying to enforce doing it here.

view this post on Zulip Lloyd McKenzie (Jul 25 2017 at 04:32):

I can't see that it makes sense to do it anywhere other than when the instance is created. The source system is the system most likely to know how to trim the code (or to not bother putting the display values in in the first place). Allowing them creates the same issues we prohibit when we prevent case-insensitive UCUM.

view this post on Zulip Grahame Grieve (Jul 25 2017 at 06:39):

we can say that you SHALL not send the term. But I think that would be going too far - what are people supposed to do when they are already storing the term?

view this post on Zulip Grahame Grieve (Jul 25 2017 at 06:40):

I think it's better not to, but it's not clear to me that we should stop people from doing it.

view this post on Zulip Grahame Grieve (Jul 25 2017 at 06:40):

I think this is something we should take up with Snomed Intl

view this post on Zulip Rob Hausam (Jul 25 2017 at 09:34):

That makes sense to me. It would be good to further discuss and work through the implications with SNOMED International.

view this post on Zulip Lloyd McKenzie (Jul 25 2017 at 13:40):

If you're already storing the term, you can still strip it out on transmission, no?

view this post on Zulip Lloyd McKenzie (Jul 25 2017 at 13:46):

I don't really understand why we'd treat this differently than we do case-sensitive vs. case-insensitive UCUM. (Actually, converting between case-sensitive and case-insensitive is more work, as you need a mapping. With expressions in SNOMED, you can strip the terms out without understanding the codes at all.)
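A rough Python sketch of that kind of stripping, done purely on the text without looking up any concept. The function name `strip_terms` is hypothetical, and the sketch assumes that every term is enclosed in a balanced pair of | characters, that a term never contains | itself, and that whitespace outside terms carries no meaning. It does not address ordering of refinements.

```python
import re

# Anything between a pair of "|" characters is treated as a display term.
_TERM = re.compile(r"\|[^|]*\|")


def strip_terms(expression):
    """Remove |term| segments, then drop the remaining whitespace."""
    without_terms = _TERM.sub("", expression)
    return re.sub(r"\s+", "", without_terms)


# Illustrative expression only.
full = "128045006 |Cellulitis| : 363698007 |Finding site| = 56459004 |Foot structure|"
print(strip_terms(full))  # 128045006:363698007=56459004
```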

view this post on Zulip Grahame Grieve (Jul 25 2017 at 21:06):

it's not the same - case sensitive and case insensitive are different codes. This would be more like insisting that the content in {} gets removed for purposes of comparability

view this post on Zulip Lloyd McKenzie (Jul 25 2017 at 23:29):

That's somewhat tempting there too, but at least it can be argued that the {} content may influence the interpretation of the code by some systems. The display values don't change the meaning of the code at all.

view this post on Zulip Robert McClure (Jul 26 2017 at 01:21):

Careful Lloyd, you are being too literal and computationally minded. What is inside the {} does provide useful meaning, it just doesn't change how a computer compares aligned values associated with the unit.

view this post on Zulip Yunwei Wang (Jul 26 2017 at 14:28):

The discussion is very interesting, but it is a little bit off topic now. For the original question, my understanding is that this is not a problem, so we don't need a tracker item. Server implementations should be able to handle the | character according to its meaning in context.

view this post on Zulip Yunwei Wang (Jul 26 2017 at 14:28):

Is that correct?

view this post on Zulip Rob Hausam (Jul 27 2017 at 11:57):

That's my take, Yunwei.


Last updated: Apr 12 2022 at 19:14 UTC