FHIR Chat · Using ValueSet.compose instead of ValueSet.expansion · hapi

Stream: hapi

Topic: Using ValueSet.compose instead of ValueSet.expansion


Robert McClure (Sep 09 2019 at 17:32):

@Grahame Grieve @James Agnew @Rob Hausam I was just told something that I hope I'm misunderstanding about HAPI. That HAPI uses the compose element as the list of value set members, i.e.: what I think of as the expansion. I suspect this is a convenience work-around but I find it very concerning if correct because for at least one situation I'm working with they only use this functionality to assess value set content within their FHIR service and that means they take every value set no matter how it is defined, and make a copy and actually use an enumerated compose "because that is how HAPI works." I have all sorts of guesses as to what HAPI actually can support - hopefully it can actually use valuesets correctly and looks at the expansion content, and perhaps that only is available if you do more with the terminology server, I'm not sure.

So can folks explain to me what HAPI can actually do wrt using the value set expansion as the actual list of value set members? But perhaps more importantly, I am very dismayed that it seems (correct me, and others, if this is not correct) that base HAPI uses the compose as the list of value set members. If that is true I'm going to ask that we improve HAPI to create a value set expansion for any value set resource and then ALWAYS USE THE EXPANSION when doing something that requires the value set membership. Therefore remove the hack that looks at the compose as if it's the expansion. Obviously if the server cannot generate a current accurate expansion, then it uses what is in the compose as best it can, but it would always create a resource with an expansion when the value set membership is needed.

Am I missing something?

James Agnew (Sep 09 2019 at 17:54):

Hi Rob,

This was the case in earlier versions of HAPI FHIR- Around the time of the CodeSystem/ValueSet split we initially designed the validator to just look at the list of includes and naively trust it.

At this point that code has mostly been replaced, although it still persists for a few hard-to-expand valuesets (BCP47 and UCUM in particular).
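For readers following along, the difference between trusting the listed includes and using an expansion can be sketched in a few lines of Python. This is only an illustration: the value set, its URLs and its codes are made up, and the two helper functions are hypothetical, not HAPI code.

```python
# Hypothetical ValueSet, trimmed to the two elements under discussion.
value_set = {
    "resourceType": "ValueSet",
    "url": "http://example.org/fhir/ValueSet/demo",  # made-up URL
    "compose": {
        "include": [
            {
                # Enumerated include: the codes are listed explicitly.
                "system": "http://example.org/cs/colours",
                "concept": [{"code": "red"}, {"code": "green"}],
            },
            {
                # Rule-based include: members are defined by a filter, not listed.
                "system": "http://example.org/cs/shapes",
                "filter": [{"property": "sides", "op": "=", "value": "4"}],
            },
        ]
    },
    "expansion": {
        "contains": [
            {"system": "http://example.org/cs/colours", "code": "red"},
            {"system": "http://example.org/cs/colours", "code": "green"},
            {"system": "http://example.org/cs/shapes", "code": "square"},
            {"system": "http://example.org/cs/shapes", "code": "rectangle"},
        ]
    },
}


def members_from_compose_concepts(vs):
    """Naive approach: trust only the codes listed in compose.include.concept."""
    members = set()
    for include in vs.get("compose", {}).get("include", []):
        for concept in include.get("concept", []):
            members.add((include["system"], concept["code"]))
    return members


def members_from_expansion(vs):
    """Collect (system, code) pairs from expansion.contains, including nested entries."""
    members = set()
    stack = list(vs.get("expansion", {}).get("contains", []))
    while stack:
        entry = stack.pop()
        if "code" in entry:
            members.add((entry["system"], entry["code"]))
        stack.extend(entry.get("contains", []))
    return members
```

With this data the naive reading finds only the two colour codes, while the expansion also carries the two codes selected by the filter.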

It's probably worth mentioning that there is a huge rewrite of the entire way we deal with ValueSets working its way through the system (as a result of some awesome work by @Diederik Muylwyk ). It's not yet enabled by default, but should be soon- Diederik's work involves having ValueSet expansions be pre-calculated at the time the VS is uploaded. This allows us to use them for validation without worrying that we're paying a performance penalty on each expansion.

Robert McClure (Sep 09 2019 at 18:27):

That sounds good. I'd like to get a better understanding of what this means, and how the "older HAPI" functioned while in ATL. Can we set up a time to do that - 30min or so?

As for dealing with what I take to mean "older" versions of HAPI - I'd like there to be something done to clarify and perhaps help users fix/improve/whatever what I take to be a very unfortunate hack. Again, perhaps I'm not understanding something here and if it really is not a "hack" then help me understand what is happening. But the result of this is exactly what I said in the OP - FHIR devs assume that proper function of FHIR servers is to use the compose and that using the expansion is only if it's easy and needed.

James Agnew (Sep 09 2019 at 18:49):

Sure- happy to walk through this as best as I can remember it. Honestly though, there isn't much I can give you in terms of rationale: We got it wrong the first time, and rewrote it later to get it (mostly) right. I'd love it if that never happened, but a codebase the size of HAPI's is always going to have compromises..

Robert McClure (Sep 09 2019 at 19:08):

Understood. But I'm surprised and dismayed at the misunderstanding that is creating in the dev/implementer world and I'd like to work together to fix this.

Grahame Grieve (Sep 09 2019 at 19:56):

a few comments:

  • I'm not sure about this - are you talking about the Java validator? it's always used expansions
  • there is UCUM code available in java - i wrote it - that can serve up the knowledge for a UCUM code system, though many UCUM value sets are not expandable no matter how much you know about UCUM. I wrote code for BCP-47 that is solid, but that's in pascal. But it's not much code - easy for someone to migrate to java
  • I don't understand about pre-calculating expansions. I've been ignoring the PR and related communications, but maybe I shouldn't have. there's an infinite number of expansions for each value set, depending on the applicable conditions of expansion

Grahame Grieve (Sep 09 2019 at 19:57):

"I'm surprised and dismayed at the misunderstanding that is creating"

hah. it's scary what kind of misunderstandings are out there in implementer land

Robert McClure (Sep 09 2019 at 21:41):

@Grahame Grieve I'm not sure what HAPI is actually doing, I just know that working with devs on a project, they were taking the expression-based compose I gave them and instead using the expansion list I also gave as an enumerated compose because "HAPI works with the compose concepts, and we don't need the expansion."

James Agnew (Sep 09 2019 at 22:39):

This isn't a validator thing (I think?) so much as an issue with the original way that HAPI's ValueSet expander worked. When you use the HAPI validator it doesn't go out to an external term svc, but rather performs an expansion locally and uses that to validate against. The original ValueSet expander made some bad assumptions about how expansion should work when codes were explicitly included.

UCUM and BCP47 are definitely slated to be included in HAPI's term svc soon. For UCUM I'm planning on using your Java implementation Grahame, and for BCP47 it looks like the one that's built into the JDK should suffice (honestly I doubt it's as robust as the pascal implementation, but it's probably good enough for most cases).

"I don't understand about pre-calculating expansions. I've been ignoring the PR and related communications,"

The general gist of this is: Say I upload a ValueSet to HAPI's terminology service that expands to a very large set of codes (for example, a LOINC ValueSet for all codes with a SCALE=Qn). Currently if you try to expand that or validate a code against it, it has to calculate the whole thing right then and there and that will either run out of memory and abort with an ETooCostlyException, or it'll succeed but take forever.

What pre-expansion does is kick off a background job that calculates the expansion and keeps it in a dedicated set of database tables, so that it can be instantly expanded by a client on demand, or validated against.
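A rough sketch of that idea, using Python's built-in `sqlite3` in place of HAPI's real database tables. The table name, the `expand_slowly` stand-in and the demo codes are all hypothetical, and the precalculation runs inline here rather than as a background job.

```python
import sqlite3


def expand_slowly(value_set_url):
    """Stand-in for an expensive expansion (hypothetical; returns fixed demo codes)."""
    return [("http://example.org/cs/demo", f"demo-{i}") for i in range(1000)]


def precalculate(conn, value_set_url):
    """Compute the expansion once and store it, one row per concept."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expansion_concept ("
        " value_set_url TEXT, system TEXT, code TEXT,"
        " PRIMARY KEY (value_set_url, system, code))"
    )
    conn.execute(
        "DELETE FROM expansion_concept WHERE value_set_url = ?", (value_set_url,)
    )
    conn.executemany(
        "INSERT INTO expansion_concept VALUES (?, ?, ?)",
        [(value_set_url, system, code) for system, code in expand_slowly(value_set_url)],
    )
    conn.commit()


def validate_code(conn, value_set_url, system, code):
    """Membership check: a single primary-key lookup, no expansion at request time."""
    row = conn.execute(
        "SELECT 1 FROM expansion_concept"
        " WHERE value_set_url = ? AND system = ? AND code = ?",
        (value_set_url, system, code),
    ).fetchone()
    return row is not None


def expand_page(conn, value_set_url, offset, count):
    """Serve one page of the stored expansion (maps onto offset/count)."""
    return conn.execute(
        "SELECT system, code FROM expansion_concept"
        " WHERE value_set_url = ? ORDER BY system, code LIMIT ? OFFSET ?",
        (value_set_url, count, offset),
    ).fetchall()
```

The expensive step happens once in `precalculate`; both validation and paged expansion then read from the stored rows.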

Grahame Grieve (Sep 10 2019 at 00:08):

so I understand the value of precalculating, but what I don't understand is how the challenges are dealt with. Each expansion is different, based on the parameters

James Agnew (Sep 10 2019 at 00:58):

Are expansions with parameters really that important compared to non-parameterized ones? Personally probably 99% of the time that I've seen expansions it's been for the purposes of validation, so no params needed.

Even then, a few of the params are supported (count, offset, includeDesignations). We're still working out how to incorporate filter into those results too, but that needs a bit more lucene magic to get right.

Grahame Grieve (Sep 10 2019 at 03:40):

important parameters:
- filter/offset/count/notForUI - for UI work
- language / designation control - for both UI and validation
- date - for working with old records

the others are more edge-case based. @Michael Lawley might have more to say on which parameters matter.

For me, the germane one here is date since it changes the record set completely, and there's all sorts of clinical use where it matters
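To make the list concrete, here is a minimal sketch of how a client might put such parameters on a `$expand` request. The base URL and value set URL are made up and `build_expand_url` is a hypothetical helper. The parameter names are my reading of the FHIR `$expand` operation (`excludeNotForUI` and `displayLanguage` standing in for what the thread calls "notForUI" and "language") and should be checked against the spec version in use.

```python
from urllib.parse import urlencode


def build_expand_url(base, value_set_url, **params):
    """Build a GET URL for ValueSet/$expand, skipping parameters left as None."""
    query = {"url": value_set_url}
    for name, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # FHIR booleans are written as lowercase true/false in a query string.
            value = "true" if value else "false"
        query[name] = value
    return f"{base.rstrip('/')}/ValueSet/$expand?{urlencode(query)}"


# UI-style request: type-ahead filter plus paging.
ui_url = build_expand_url(
    "http://example.org/fhir",
    "http://example.org/fhir/ValueSet/demo",
    filter="asth",
    offset=0,
    count=20,
    excludeNotForUI=True,
    displayLanguage="fr",
)

# Historical request: ask for the expansion as of a given date.
dated_url = build_expand_url(
    "http://example.org/fhir",
    "http://example.org/fhir/ValueSet/demo",
    date="2015-06-30",
)
```

Each distinct combination of these values can yield a different expansion, which is the difficulty being raised for precalculation.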

Michael Lawley (Sep 10 2019 at 04:50):

The HAPI behaviour is one reason why we went to the trouble of adding dedicated $validate support to Ontoserver :-)

For us (me), displayLanguage and includeDesignations are important, as is activeOnly, and ability to specify ValueSet version (ie url and valueSetVersion)

We don't support date because we believe version is the appropriate parameter to be using to talk about historical data (e.g., date doesn't tell you anything about what was actually used / available for use on that date).

Michael Lawley (Sep 10 2019 at 04:51):

In general, old records need careful handling depending on what it is you're trying to do (validate, query (search), etc)

Grahame Grieve (Sep 10 2019 at 04:53):

"We don't support date because we believe version is the appropriate parameter"

note that this is making resolution the client's problem, not the server's. Why is a client better suited to resolve the date question? (which I agree, it's hard)

Michael Lawley (Sep 10 2019 at 04:55):

because the client is the only one in a position to know what question it has

Michael Lawley (Sep 10 2019 at 04:56):

eg expand the valueset using the latest valid version of the valueset and codesystem on date X
cf expand the valueset using the version of the valueset and codesystem that were in the Tx on date X

Michael Lawley (Sep 10 2019 at 04:57):

As a server, how do I decide which of those semantics to apply? Whichever I choose is likely to be wrong for some portion of clients
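The two readings of "date X" can give different answers, as this small sketch shows. The version history is invented for illustration (the version labels, publication dates and load dates are all hypothetical), and both function names are made up.

```python
from datetime import date

# Hypothetical history: (version label, published on, loaded into the Tx server on).
VERSIONS = [
    ("2018-09", date(2018, 9, 1), date(2018, 11, 15)),
    ("2019-03", date(2019, 3, 1), date(2019, 6, 10)),
]


def latest_published_on(versions, when):
    """Reading 1: the latest version that had been published by the given date."""
    candidates = [v for v in versions if v[1] <= when]
    return max(candidates, key=lambda v: v[1])[0] if candidates else None


def in_use_on_server_on(versions, when):
    """Reading 2: the version the server actually had loaded on the given date."""
    candidates = [v for v in versions if v[2] <= when]
    return max(candidates, key=lambda v: v[2])[0] if candidates else None


when = date(2019, 4, 1)
by_publication = latest_published_on(VERSIONS, when)  # "2019-03"
by_server_state = in_use_on_server_on(VERSIONS, when)  # "2018-09"
```

For the same date the first reading picks the newer release and the second picks the older one, so a server that accepts only a date has to guess which the client meant.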

Bryn Rhodes (Sep 10 2019 at 05:54):

We take the same approach with measure evaluation, the client specifies the value set version to use, and the value set version definitions capture the code system versions to be used. It happens to line up with a date, but it's explicitly specified as part of a completely separate input to the process (called the binding parameters specification).

Grahame Grieve (Sep 10 2019 at 06:21):

well, that depends on how well informed the server is about the system configuration. Basically, in order to use date, you need to be able to know who the client is, and the past history of value set usage for the application the client is part of. Some terminology servers won't know that (yours) but others - the kind of server that can resolve the context parameter - these are the ones that can use date

Michael Lawley (Sep 10 2019 at 06:27):

I don't follow - Ontoserver can resolve the context parameter (pointing into a profile) but that doesn't really help with resolving a date to a codesystem version (business version cf technical version)

Grahame Grieve (Sep 10 2019 at 06:28):

that implies that you are managing profile versions, which implies.....

James Agnew (Sep 10 2019 at 10:16):

This all suggests that version is the next thing we need to get working- although that doesn't seem like an issue as far as precalculation is concerned. Precalculating a finite set of valueset versions won't be a big deal, once our codesystem support is version aware (which it is not yet..)

Do people really use filter as a part of validation? That seems risky unless you're designing for a specific term server and its implementation of filter...

Grahame Grieve (Sep 10 2019 at 14:26):

Filter is not for validation but for UI where it’s fundamental

James Agnew (Sep 10 2019 at 17:28):

Cool ok, that's what I had been assuming.

I'm less concerned with blazing fast performance where UIs are concerned, kind of paradoxically, so things like filter don't matter if they don't take advantage of the pre-expansion for now since a user can handle a 500ms pause.. it's the validator where a 500ms pause quickly spirals into unacceptable territory.

Grahame Grieve (Sep 10 2019 at 19:59):

so the performance advantage only kicks in for the right parameters? then let's see what parameters the validator uses on $validate-code, which is what is in context here:

Grahame Grieve (Sep 10 2019 at 20:00):

  • displayLanguage is critical; this is set in any non-english context, and probably in english contexts over time

Grahame Grieve (Sep 10 2019 at 20:03):

that's the only one set by the infrastructure. Otherwise, validator users can provide their own. the most likely parameter to provide is :

Grahame Grieve (Sep 10 2019 at 20:04):

  • system-version (or force-system-version) to specify which snomed edition to use (this is mandatory for the IG publisher context, where it's the only way to tell tx.fhir.org which snomed version to use)

Grahame Grieve (Sep 10 2019 at 20:05):

no - here's another from the infrastructure:

  • abstract - whether codes that are 'abstract' are allowed in this context

Grahame Grieve (Sep 10 2019 at 20:10):

that's the ones that matter for validation
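As an illustration, the three parameters named above could be carried in a `Parameters` resource posted to `$validate-code`, roughly as below. The helper is hypothetical, the value set URL is made up, and the value types (`valueCode`, `valueCanonical`, `valueBoolean`) are my assumption from the R4 operation definitions. Whether a given server accepts `system-version` on `$validate-code` is also an assumption taken from the discussion, not something verified here.

```python
def build_validate_code_parameters(
    value_set_url,
    system,
    code,
    display_language=None,
    system_version=None,
    allow_abstract=None,
):
    """Build a FHIR Parameters resource (as a dict) for ValueSet/$validate-code."""
    parameter = [
        {"name": "url", "valueUri": value_set_url},
        {"name": "system", "valueUri": system},
        {"name": "code", "valueCode": code},
    ]
    if display_language is not None:
        parameter.append({"name": "displayLanguage", "valueCode": display_language})
    if system_version is not None:
        # Expected form is "system|version" (assumed value type: canonical).
        parameter.append({"name": "system-version", "valueCanonical": system_version})
    if allow_abstract is not None:
        parameter.append({"name": "abstract", "valueBoolean": allow_abstract})
    return {"resourceType": "Parameters", "parameter": parameter}


request_body = build_validate_code_parameters(
    "http://example.org/fhir/ValueSet/demo",  # made-up value set URL
    "http://snomed.info/sct",
    "22298006",
    display_language="fr",
    system_version="http://snomed.info/sct|http://snomed.info/sct/900000000000207008",
    allow_abstract=False,
)
```

Optional parameters are simply left out when not supplied, so the plain case stays a three-parameter request.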

James Agnew (Sep 10 2019 at 20:43):

interesting. this is a cool summary.

So:

  • displayLanguage seems easy enough.. We store all of the designations in the precalculated expansion already. I'm assuming the expected implementation here is to just grab the appropriate designation and swap out the code.display value with the one from that designation? And... a specific use code maybe?

  • systemVersion is a roadmap thing for sure. HAPI's term svc doesn't currently understand the concept of multiple versions of the same codesystem existing in the first place, but that's an obvious next thing for us to address. I don't understand the description of force-system-version at all though.

  • abstract probably means we need two indexes on each precalculated expansion.. one with all codes and one with only non-abstract codes
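The displayLanguage idea in the first bullet could look something like this minimal sketch. The entry, its codes and its designations are made up, `localise_display` is a hypothetical helper, and it matches on exact language tag only, ignoring the designation `use` code that the bullet leaves as an open question.

```python
def localise_display(contains_entry, display_language):
    """Return a copy of an expansion entry with display taken from a matching designation."""
    for designation in contains_entry.get("designation", []):
        if designation.get("language") == display_language:
            # Swap in the designation's text; leave everything else untouched.
            return {**contains_entry, "display": designation["value"]}
    # No designation in that language: keep the default display.
    return dict(contains_entry)


entry = {
    "system": "http://example.org/cs/colours",  # made-up code system
    "code": "red",
    "display": "Red",
    "designation": [
        {"language": "de", "value": "Rot"},
        {"language": "fr", "value": "Rouge"},
    ],
}

french = localise_display(entry, "fr")
unknown = localise_display(entry, "ja")
```

Because the designations are already stored with the precalculated expansion, this swap needs no re-expansion. A fuller version would probably also handle regional tags such as falling back from "fr-CA" to "fr".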

Grahame Grieve (Sep 10 2019 at 22:09):

displayLanguage - yes.... for now. we haven't really explored the use code thing yet.

Grahame Grieve (Sep 10 2019 at 22:09):

force-system-version: you want to override whatever version is specified in Coding.version or the value set, and use your own specified version

Grahame Grieve (Sep 10 2019 at 22:10):

this is about decay and rot in the definitions while the underlying licensed terminology rolls forward (looking at you, snomed)
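The precedence being described can be written down as a tiny resolver. This is a sketch of my understanding of the `system-version` and `force-system-version` semantics, not HAPI or tx.fhir.org code. The function and argument names are hypothetical, and `pinned_version` stands for whatever version Coding.version or the value set specifies.

```python
def resolve_system_version(
    pinned_version=None, system_version=None, force_system_version=None
):
    """Pick the code system version to use for one system."""
    if force_system_version is not None:
        # Overrides anything stated in Coding.version or the value set.
        return force_system_version
    if pinned_version is not None:
        # The definition names a version and nothing forces a different one.
        return pinned_version
    # Nothing pinned: system-version acts as the default (may be None).
    return system_version
```

So `system-version` only fills a gap, while `force-system-version` replaces a stale pinned version as the underlying terminology moves on.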

Grahame Grieve (Sep 10 2019 at 22:11):

abstract... not sure whether it's worth it. differentiating isn't something we do in production yet, but it will be coming. Once it comes, we'll generally be specifying it to false - the bulk of records don't use abstract codes, while configuration / decision support records can (and often do) but they get validated rarely, so it might be just worth falling back to the slow method in that case
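That fallback could be as simple as the following sketch, which keeps a single precalculated set of non-abstract codes and uses the slow path only when abstract codes are allowed. All names are hypothetical, and `slow_validate` stands for whatever full, unoptimised check the server already has.

```python
def validate_with_fallback(code, non_abstract_codes, slow_validate, allow_abstract=False):
    """Return True if the code is acceptable in this context."""
    if not allow_abstract:
        # Common case: one lookup against the precalculated, non-abstract set.
        return code in non_abstract_codes
    # Rare case: abstract codes are allowed, so fall back to the full (slow) check.
    return slow_validate(code)
```

This avoids keeping two indexes per precalculated expansion, at the cost of slower validation for the rarely checked records that allow abstract codes.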


Last updated: Apr 12 2022 at 19:14 UTC