FHIR Chat · ValueSet defined by an intersection ? · terminology

Stream: terminology

Topic: ValueSet defined by an intersection ?


view this post on Zulip Mark Kramer (Mar 17 2022 at 12:18):

(I think the answer to this is no, but I'll ask anyway) Is there a way to define a value set as the intersection of two other value sets? Alternatively, and specifically for SNOMED, is there a way to use filter criteria (or any other clever tricks) to define a value set as those concepts that are common descendants (is-a hierarchy) from two or more given parent concepts?

view this post on Zulip Rob Hausam (Mar 17 2022 at 12:24):

Yes, there is. ValueSet.compose.include.valueSet is 0..*, and the comments on compose.include state "If one or more value sets are listed, the codes must be in all the value sets." - i.e. the include is the intersection of the specified value sets.
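
To make the reading Rob describes concrete, here is a minimal Python sketch of intersection semantics within a single include. The value set URLs, the codes, and the `expand_include_intersection` helper are all hypothetical, and the toy `EXPANSIONS` table stands in for a real terminology server.

```python
# Hypothetical, pre-computed expansions keyed by ValueSet.url (toy data).
EXPANSIONS = {
    "http://example.org/ValueSet/a": {"c1", "c2", "c3"},
    "http://example.org/ValueSet/b": {"c2", "c3", "c4"},
}

# A ValueSet with ONE compose.include that lists TWO value sets.
value_set = {
    "resourceType": "ValueSet",
    "url": "http://example.org/ValueSet/a-and-b",
    "compose": {
        "include": [
            {
                "valueSet": [
                    "http://example.org/ValueSet/a",
                    "http://example.org/ValueSet/b",
                ]
            }
        ]
    },
}


def expand_include_intersection(include, expansions):
    """Return the codes present in every value set listed in one include."""
    sets = [expansions[url] for url in include.get("valueSet", [])]
    return set.intersection(*sets) if sets else set()


codes = expand_include_intersection(value_set["compose"]["include"][0], EXPANSIONS)
print(sorted(codes))  # ['c2', 'c3']
```

Under the union reading discussed later in the thread, the same include would instead yield all four codes.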

view this post on Zulip Mark Kramer (Mar 17 2022 at 12:58):

Rob, we have never used ValueSet.compose.include like that, and I don't think that is how FHIR interprets that element. For example, mcode-radiotherapy-technique-vs is a value set composed of value sets, which in turn are made up of other value sets. The expansion is the UNION of all the concepts from all those value sets.

view this post on Zulip Mark Kramer (Mar 17 2022 at 13:00):

Looking on this page, the definition of that element says "Selects the concepts found in this value set (based on its value set definition). This is an absolute URI that is a reference to ValueSet.url. If multiple value sets are specified this includes the union of the contents of all of the referenced value sets."

view this post on Zulip Mark Kramer (Mar 17 2022 at 13:04):

Tracking this down, the definition changed between 4.0.1 and 4.6.0. "Union" changed to "intersection". That's a HUGE difference, and NOT backwards compatible AT ALL. For a normative resource, that won't fly.

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:07):

Right. Because that was a mistake in the R4 (and earlier) documentation and wasn't the intent, and it's being corrected in R5. I think the Jira task is a technical correction - I can check that.

view this post on Zulip Mark Kramer (Mar 17 2022 at 13:10):

Ouch, ouch, ouch, ouch. No matter what the intent, that is a huge breaking change. I don't think the intersection is the right intent anyway, but that's not even the issue. The issue is whether this is a breaking change on a NORMATIVE resource (YES IT IS).

view this post on Zulip Chris Moesel (Mar 17 2022 at 13:20):

If every existing non-ballot version of FHIR treats it as a union, and every existing FHIR tool treats it as a union, and every IG that has ever used it has it expanded as a union, then why change it? What possible upside could justify all the breakage and confusion? It seems to me that we should treat it as the de-facto (or common-law) intent now -- and add in a different mechanism that supports the intersection intent. I agree w/ @Mark Kramer -- this is for sure a breaking change.

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:20):

This was addressed in FHIR#25179 (and that issue is a CR, rather than a TC). I agree that the change is breaking - and that isn't expected to happen with a normative resource (at least unless the entire community has agreed to the change). The issue doesn't include much detail of how this change was justified, but the key part related to that I think is:

Update:10/27/19: Grahame has confirmed that the definition of compose.include.valueset appears to be erroneous, when it was moved under compose.include (previously it was under compose), and the definition was supposed to be updated

The key question, I guess, is what to do now?

view this post on Zulip Chris Moesel (Mar 17 2022 at 13:28):

Ah. I see. Other parts of the spec say it is an intersection -- so we have contradictory definitions in different normative parts of the spec. Fun!

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:29):

Correct.

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:36):

On balance, I think the FHIR#25179 resolution is likely still going to be the best course to take.

view this post on Zulip Mark Kramer (Mar 17 2022 at 13:38):

(deleted)

view this post on Zulip Chris Moesel (Mar 17 2022 at 13:39):

I think the kicker, though, is that FHIR IG Publisher has not treated it this way at all. For example, Mark's mcode-radiotherapy-technique-vs that he links above has a single include with multiple value sets. Not only does the expansion contain the union of those sets, but even the generated narrative describes it that way:
[image: image.png]

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:40):

@Grahame Grieve?

view this post on Zulip Chris Moesel (Mar 17 2022 at 13:41):

I think the intent of treating it as an intersection makes sense, but when the spec is at odds with itself, and the reference implementation tooling (which is required for all HL7 publications) has taken a side already (for many years now) -- it makes sense to normalize to the position that has been widely practiced and enforced by official tooling.

view this post on Zulip Rob Hausam (Mar 17 2022 at 13:46):

Possibly. But that still means that we would have to change the further normative guidance on 'include' that contradicts this and carve out 'valueSet' as a special case, which also eliminates the possibility (at least without further additions or changes) of defining value sets based on the intersection of other value sets - which is what @Mark Kramer is requesting. I don't think that's a very easy choice to make, either.

view this post on Zulip John Moehrke (Mar 17 2022 at 13:48):

non-breaking is to introduce a new word that does what you wanted intersection to do... thus leave intersection implemented as is 'commonly implemented today', which should be explained and warned against using.

view this post on Zulip Mark Kramer (Mar 17 2022 at 13:52):

One correction on the above -- the example I cited actually has 2 includes, with 1 value set in each. So UNION is the correct, expected behavior for that example. My mistake, mucho apologies for any confusion!
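
The distinction Mark draws here (several includes with one value set each, versus one include listing several value sets) can be sketched as follows. This is a toy model that assumes the intersection reading for value sets inside one include; `expand_compose`, the URLs, and the codes are hypothetical, and filters, concepts, and excludes are ignored.

```python
A = "http://example.org/ValueSet/a"  # hypothetical URL
B = "http://example.org/ValueSet/b"  # hypothetical URL

# Toy expansions standing in for a terminology server.
EXPANSIONS = {A: {"c1", "c2"}, B: {"c2", "c3"}}


def expand_compose(compose, expansions):
    """Union across includes; intersection of value sets inside one include."""
    result = set()
    for include in compose.get("include", []):
        sets = [expansions[url] for url in include.get("valueSet", [])]
        if sets:
            result |= set.intersection(*sets)
    return result


# Shape 1: two includes, one value set each -> union under either reading.
two_includes = {"include": [{"valueSet": [A]}, {"valueSet": [B]}]}

# Shape 2: one include, two value sets -> intersection under this reading.
one_include = {"include": [{"valueSet": [A, B]}]}

print(sorted(expand_compose(two_includes, EXPANSIONS)))  # ['c1', 'c2', 'c3']
print(sorted(expand_compose(one_include, EXPANSIONS)))   # ['c2']
```

Only the second shape is affected by the union-versus-intersection question debated in this thread.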

view this post on Zulip Rob Hausam (Mar 17 2022 at 14:03):

I have value sets defined in IPS which use 'valueSet' and 'filter' in a single include, and the Publisher and expansion behavior is intersection, as expected. I will check further and see if I have any that use intersection of two (or more) value sets (I don't recall for sure whether I have that or not).

view this post on Zulip Chris Moesel (Mar 17 2022 at 14:21):

I've just created a test project that has a ValueSet w/ one include and two value sets. The IG Publisher expands it as a UNION. The generated narrative is a bit more ambiguous, but the expansion is definitely a union (not an intersection). So it does seem that the IG Publisher is following union semantics even for multiple value sets in a single compose.include. Here's the test project if you want to try it yourself: ValueSetIncludesValueSets.zip

view this post on Zulip Michael Lawley (Mar 17 2022 at 14:26):

FWIW, Ontoserver has always (~7 years now) treated this as an intersection

view this post on Zulip Josh Mandel (Mar 17 2022 at 14:41):

Chris Moesel said:

  So it does seem that the IG Publisher is following union semantics even for multiple value sets in a single compose.include. Here's the test project

Good to know! Another relevant point would be whether this is actually being relied upon by any real implementation guides. Of course there could be others we are not aware of, but as Mark pointed out above there are more straightforward ways to get unions, so it's not clear that IG developers would have stumbled into this particular trap.

view this post on Zulip John Moehrke (Mar 17 2022 at 15:00):

It is not uncommon as an IG author to ... simply throw things at the wall and see if it looks like what you need... the number of people that fully understand the power of the whole ecosystem is small... we can't always wait for one of the few FHIR gods to bless what we want to do... so if the tooling produces what we want, we go with it.

view this post on Zulip Chris Moesel (Mar 17 2022 at 15:17):

To Josh's point, someone could probably build a script to run through all the IGs in the registry (and CI), and find how many use this construct (single include w/ multiple value sets). That would give us a better idea of the true impact such a change would have.
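
A script along the lines Chris suggests might look roughly like this. It is only a sketch: it assumes the resources have already been loaded as Python dicts (for example from IG package JSON files), and the function name and sample data are hypothetical.

```python
def find_multi_valueset_includes(resources):
    """Return URLs of ValueSets where any single include lists 2+ value sets."""
    hits = []
    for res in resources:
        if res.get("resourceType") != "ValueSet":
            continue
        includes = res.get("compose", {}).get("include", [])
        if any(len(inc.get("valueSet", [])) > 1 for inc in includes):
            hits.append(res.get("url"))
    return hits


# Hypothetical sample data covering both shapes plus a non-ValueSet resource.
sample = [
    {
        "resourceType": "ValueSet",
        "url": "http://example.org/ValueSet/single-include-two-sets",
        "compose": {"include": [{"valueSet": ["urn:x:a", "urn:x:b"]}]},
    },
    {
        "resourceType": "ValueSet",
        "url": "http://example.org/ValueSet/two-includes",
        "compose": {"include": [{"valueSet": ["urn:x:a"]}, {"valueSet": ["urn:x:b"]}]},
    },
    {"resourceType": "CodeSystem", "url": "http://example.org/CodeSystem/cs"},
]

print(find_multi_valueset_includes(sample))
# ['http://example.org/ValueSet/single-include-two-sets']
```

Only the first sample is flagged; a value set that spreads its references over separate includes is not, because that shape is a union under either reading.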

view this post on Zulip Lloyd McKenzie (Mar 17 2022 at 15:49):

My vote is that existing use weighs more than original intent

view this post on Zulip Josh Mandel (Mar 17 2022 at 16:04):

John Moehrke said:

  simply throw things at the wall and see if it looks like what you need

100%. My intuition is that people throwing things at the wall would be unlikely to reach this particular dark corner, but as Chris says it's a question that could be answered with real data. It'd be neat to have an openly published (i.e., pre-scraped) DB of every FHIR resource from every public IG, to facilitate this kind of analysis.

view this post on Zulip Mark Kramer (Mar 17 2022 at 18:08):

Josh Mandel said:

  It'd be neat to have an openly published (i.e., pre-scraped) DB of every FHIR resource from every public IG

@Josh Mandel Holy grail, you say? Oh, I have one of those. See https://www.medrxiv.org/content/10.1101/2022.03.09.22272163v1. Let me do a bit of analysis and I can tell you the answer to that question...

view this post on Zulip Mark Kramer (Mar 17 2022 at 18:31):

So the answer is 4 value sets use multiple value sets in a single include, out of 125 IGs surveyed that collectively contain 1124 value sets. They are:

  1. http://fhir.infoway-inforoute.ca/io/psca/ValueSet/substanceandpharmaceuticalbiologicproductcode
  2. http://hl7.org/fhir/us/dental-data-exchange/ValueSet/dental-anatomy
  3. http://hl7.org/fhir/us/qicore/ValueSet/qicore-negation-reason
  4. http://hl7.org/fhir/ca/baseline/ValueSet/vaccinecodes

Here is the spreadsheet containing the data:
VALUE_SETS.xlsx
I might be wrong, but it looks like all of these are intended to be unions.

view this post on Zulip Josh Mandel (Mar 17 2022 at 19:12):

Fabulous on so many levels, thanks @Mark Kramer for the analysis. Are the analysis scripts open source too? The ones leading to the figures/tables in your manuscript?

[image: image.png]

view this post on Zulip Josh Mandel (Mar 17 2022 at 19:14):

Yeah, agree that all of your identified examples seem to intend union semantics.

view this post on Zulip Mark Kramer (Mar 17 2022 at 19:15):

I haven't shared the notebooks (they are Python3). The code wasn't written for reuse, only for short-term tactical purposes. Certainly not in "library-ready" form.

view this post on Zulip Grahame Grieve (Mar 17 2022 at 20:17):

now that I'm awake... I don't see that the tools treat this as a union. So far as I can see, it's treated as an intersection

view this post on Zulip Mark Kramer (Mar 17 2022 at 20:23):

@Grahame Grieve Take a gander at this: http://hl7.org/fhir/us/qicore/STU4.1/ValueSet-qicore-negation-reason.html
[image: image.png]
The expansion is clearly a union.

view this post on Zulip Grahame Grieve (Mar 17 2022 at 20:46):

oh dear.

view this post on Zulip Grahame Grieve (Mar 17 2022 at 20:47):

just like the specification is contradictory, so is the code.

view this post on Zulip Grahame Grieve (Mar 17 2022 at 20:47):

when importing a value set, the code performs both a union and an intersection, in different places in the code.

view this post on Zulip Grahame Grieve (Mar 17 2022 at 20:48):

I missed the union bit happening when I looked at the code. And the intersection bit is quite redundant when it happens after the union happens

view this post on Zulip Grahame Grieve (Mar 17 2022 at 21:00):

but this is worse. tx.fhir.org does not do this, only the validator does for internal simple value sets.

view this post on Zulip Michael Lawley (Mar 17 2022 at 22:37):

@Mark Kramer does your analysis also look at (single) valueset + filter or (single) valueset + concepts ?
Four valuesets from 1124 needing a technical correction does not seem too much of a problem?

view this post on Zulip Mark Kramer (Mar 17 2022 at 23:46):

I could look at that, but I did not. I only looked at multiple value sets because that was the question on the table. However, the spreadsheet attached above has all the necessary data if you want to see for yourself.

view this post on Zulip Mark Kramer (Mar 17 2022 at 23:53):

Fortunately it looks like the demand for value set intersections, up until now, has been vanishingly small. Correcting those few cases where the intent was union but the implementation was faulty will not cause much damage. This whole thread started because I have a case where intersections are necessary, and as a result of today’s discussion, it looks like we can clarify the spec, fix the tooling, and get the job done.

view this post on Zulip Mark Kramer (Mar 17 2022 at 23:55):

My panic about massive breakage was overblown. I apologize about that.

view this post on Zulip Grahame Grieve (Mar 18 2022 at 01:37):

ok. So I've fixed the code. But given that the valueSets that @Mark Kramer has identified (and thanks very much for doing excellent legwork) are published and in production, and I found at least one more value set like that that's not public, it's not very simple to 'correct' them.

So what I've done is that when I'm processing value sets, I check the publication date. If the publication date is prior to the end of this month, the legacy rules apply. But for anything published from now on, the correct processing rules per the FHIR#25179 task apply.
(For other toolsmiths (@Ewout Kramer FYI), I use the package date in the npm package to decide this. Core packages don't have dates; I looked them up from here: http://hl7.org/fhir/directory.html)

In addition, anytime the validator sees a value set, it creates this warning:

[image: image.png]
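
The date-based switch Grahame describes above could be sketched like this. It is not the validator's actual code: the `LEGACY_CUTOFF` value is an assumption inferred from "the end of this month" (March 2022), the exact boundary handling is a guess, and `combine_value_sets` is a hypothetical helper operating on already-expanded sets of codes.

```python
from datetime import date

# Assumption: "the end of this month" in a message dated Mar 18 2022.
LEGACY_CUTOFF = date(2022, 3, 31)


def combine_value_sets(sets, package_date):
    """Combine the value sets listed in ONE include, based on package date.

    Packages dated before the cutoff get the legacy behavior (union);
    later packages get the behavior from FHIR#25179 (intersection).
    """
    if not sets:
        return set()
    if package_date < LEGACY_CUTOFF:
        return set.union(*sets)
    return set.intersection(*sets)


vs_a, vs_b = {"a", "b"}, {"b", "c"}
print(sorted(combine_value_sets([vs_a, vs_b], date(2021, 6, 1))))  # ['a', 'b', 'c']
print(sorted(combine_value_sets([vs_a, vs_b], date(2022, 6, 1))))  # ['b']
```

Keying the choice on the package date rather than on the FHIR version keeps already-published value sets expanding as they did before, which is the concern raised about the four value sets identified earlier.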

view this post on Zulip Michael Lawley (Mar 18 2022 at 01:52):

Do these "legacy rules" only apply in the context of the IG publisher? We have people relying on the intersection semantics today; we need to avoid breaking things for them.

view this post on Zulip Rob Hausam (Mar 18 2022 at 02:07):

I expect (and hope) that it will be limited to the IG Publisher context.

view this post on Zulip Elliot Silver (Mar 18 2022 at 02:13):

Mark Kramer said:

  1. http://fhir.infoway-inforoute.ca/io/psca/ValueSet/substanceandpharmaceuticalbiologicproductcode

Grahame Grieve said:

But given that the valueSets that Mark Kramer has identified (and thanks very much for doing excellent legwork) are published and in production.

That valueset is currently in public comment. We can update if needed.

(On a related note, I recall running into an issue where we were defining a valueset A that included some specific concepts directly, and excluded valueset B which contained several of them. I can't remember if we were hoping for the include to override the exclude or vice versa, but found the result was the opposite of our expectations. If we defined valueset C that included the concepts, and then defined valueset A to include C and exclude B, the result differed.)

(@Sheridan Cook )

view this post on Zulip Grahame Grieve (Mar 18 2022 at 02:14):

yes they apply in that context

view this post on Zulip Grahame Grieve (Mar 18 2022 at 02:14):

@Elliot Silver test cases are welcome for this part of the value set space

view this post on Zulip Elliot Silver (Mar 18 2022 at 02:16):

Sure. Let me see if I can find or recreate the issue.

view this post on Zulip Mark Kramer (Mar 18 2022 at 13:38):

@Grahame Grieve Thank you for the timely and clever fix. In the warning message, would it be possible to point out the difference between multiple value sets in one include, versus multiple includes with single value sets?

view this post on Zulip Grahame Grieve (Mar 30 2022 at 00:32):

ok


Last updated: Apr 12 2022 at 19:14 UTC