Stream: terminology
Topic: non-typical Codesystems and Valuesets
Bob Milius (May 04 2020 at 16:01):
I started this thread in the shorthand stream about valuesets and codesystems that are often used in genomics.
@Grahame Grieve commented:
looking back at what Bob is doing...
- sounds like it should be discussed on the terminology track to see what should be done. Typically, these gene terminologies are very large, and you can't simply define them
- more likely, I'll need to do implementation work on tx.fhir.org to support these kinds of terminologies
In addition to gene terminologies defining gene names, some of these systems are only grammars (HGVS or GLStrings) that we need to bind to.
So, I'm bringing it up here in this track to see what should be done.
Michael Lawley (May 04 2020 at 23:20):
@Alejandro Metke has done a bunch of work in the genomic terminologies space.
There's also work out of Nebraska in the pathology space where I believe they've mostly used string-typed concrete domains to build SNOMED extensions where the HGVS string, for example, is referenced as a property of the SNOMED code for a test.
Alejandro Metke (May 04 2020 at 23:39):
The main work we've done is around transforming OWL ontologies to FHIR code systems (it is an open source project available here: https://github.com/aehrc/fhir-owl). I know the Clinical Genomics working group has been using this for some of the ontologies that are used in genomics because a few are available in OWL format. The only issue here is to standardise the systems' urls.
I think the difficult part is dealing with grammar-like terminologies. FHIR already uses UCUM, which has similar characteristics, but I think it would be good to have more clarity around 1) if terminology servers should support this kind of terminology 2) if yes, what is the expected behaviour (which operations make sense). In my view the answer to 1) is yes and 2) some operations could be clearly supported, such as $validate-code, but some others are not as clear, for example, $expand.
Grahame Grieve (May 05 2020 at 00:23):
you could make a task if you think we should be clearer, but I thought we already were - a terminology server that implements a code system with a grammar should have the following behavior:
- it should be clear whether the grammar is supported or not (e.g. snomed). (terminology capabilities)
- we could say for some of them in their definition that servers are required to implement the grammar (e.g. media type, language) but we don't. UCUM would be a candidate for being required but I'm not sure
For servers that do support the grammar:
- $validate-code should understand the grammar correctly and validate the semantics of the grammar as completely as possible
- $expand is more complicated:
  - if the server is expanding an enumerated value set, it should only add valid expressions to the expansion
  - for some code systems that have a grammar, we have a filter to exclude the grammar (pre-coordinated terms only). For others that's not appropriate
  - servers should be able to do expansions without grammar if the filter says so
  - servers can't expand a value set that allows for post-coordinated terms, but they may choose to return a subset (including post-coordinated terms or not) and mark it as indicative, since the full expansion cannot be generated
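A minimal sketch of what a client-side call to $validate-code for a grammar-based code system might look like. The base URL is a placeholder, and the helper names `validate_code_url` and `is_valid` are invented for illustration. The `url` and `code` query parameters and the boolean `result` output parameter are the standard ones for the operation; whether a given server actually understands the UCUM grammar is up to that server.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration; substitute a real terminology server base URL.
BASE = "https://tx.example.org/fhir"


def validate_code_url(base: str, system: str, code: str) -> str:
    """Build a GET URL for the CodeSystem/$validate-code operation."""
    query = urlencode({"url": system, "code": code})
    return f"{base.rstrip('/')}/CodeSystem/$validate-code?{query}"


def is_valid(parameters: dict) -> bool:
    """Read the boolean 'result' parameter from a Parameters response body."""
    for param in parameters.get("parameter", []):
        if param.get("name") == "result":
            return bool(param.get("valueBoolean"))
    return False


# A UCUM expression is never enumerated anywhere; the server has to parse it.
request_url = validate_code_url(BASE, "http://unitsofmeasure.org", "mg/dL")
```

The client does not need to know whether the server backs the code system with a CodeSystem resource or with a hand-written parser; the request and the response shape are the same in both cases.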
Robert McClure (May 05 2020 at 00:23):
I'd love to have FHIR binding support something that is in essence a syntax check, i.e. is the code one that falls within the code system's space because it conforms to the syntax rules? This would be used for bindings where the code system is a grammar and the element intends to allow any valid code. This actually turns out to be very common for grammar code system bindings (UCUM, BCP47) and the alternative of creating some subset invariably results in a missed code that is valid and needed. @Grahame Grieve @Rob Hausam - thoughts?
Grahame Grieve (May 05 2020 at 00:24):
that's what we already have
Robert McClure (May 05 2020 at 00:26):
We don't have a binding that's special for this. Are you saying you implement it that way in your server?
Grahame Grieve (May 05 2020 at 00:26):
that's what it means to support those code systems. There's nothing special about that
Michael Lawley (May 05 2020 at 00:28):
/CodeSystem/$validate-code
should check grammar (as Grahame said above)
Robert McClure (May 05 2020 at 00:28):
I'm not following what you mean by "that is what it means to support those code systems." The resource for this code system can not look like a SCT resource
Robert McClure (May 05 2020 at 00:28):
validate, yes. "support" no
Grahame Grieve (May 05 2020 at 00:29):
if you mean, you cannot express any of this in a code system resource, then sure, you're right. If you mean something else, I don't know what you mean
Robert McClure (May 05 2020 at 00:31):
Seems you must be doing something special to treat the two different types of code systems "the same" with validation when the base resources I'd assume they operate on are completely different. Hence not easy or understandable for normal folks
Grahame Grieve (May 05 2020 at 00:32):
I don't understand this response at all. What base resources do they operate on?
Michael Lawley (May 05 2020 at 00:33):
If a grammar were simple, for example characterised by a regex, then the CodeSystem could include this (in an extension element) and the terminology server would be able to provide some support generically.
Challenges arise when there's a normalisation step required (eg order of terms in a grammar is not important), or the grammar is complex, requiring ANTLR, ABNF, etc to express.
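A minimal sketch of the simple case, assuming a regex carried in an extension on the CodeSystem. The extension URL, the code system URL, and the toy pattern are all invented for this example; no such extension is defined in the FHIR specification.

```python
import re
from typing import Optional

# Hypothetical extension URL, invented for this sketch; not part of the FHIR spec.
GRAMMAR_REGEX_EXT = "http://example.org/fhir/StructureDefinition/codesystem-grammar-regex"

# A code system with no enumerated concepts, only a (toy) pattern for valid codes.
code_system = {
    "resourceType": "CodeSystem",
    "url": "http://example.org/fhir/CodeSystem/toy-grammar",
    "content": "not-present",
    "extension": [
        {"url": GRAMMAR_REGEX_EXT, "valueString": r"[A-Z]+\*\d{2}(:\d{2})*"}
    ],
}


def grammar_regex(cs: dict) -> Optional[str]:
    """Return the regex from the hypothetical extension, if present."""
    for ext in cs.get("extension", []):
        if ext.get("url") == GRAMMAR_REGEX_EXT:
            return ext.get("valueString")
    return None


def validate_code(cs: dict, code: str) -> bool:
    """Generic syntax check: the whole code must match the declared pattern."""
    pattern = grammar_regex(cs)
    if pattern is None:
        raise ValueError("code system declares no grammar regex")
    return re.fullmatch(pattern, code) is not None
```

This only covers syntax. It does nothing for grammars that need normalisation (two differently ordered strings meaning the same thing) or that cannot be captured by a regular expression at all.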
Grahame Grieve (May 05 2020 at 00:33):
right. we never tried to make the code system resource express any of those things. It only expresses simple non-grammar code systems
Robert McClure (May 05 2020 at 00:33):
I'm sure I'm the one confused. Do BCP47 and UCUM have a code system resource with concepts in it? I'd assume not. Yet for all the other code system resources, they do. I'd have assumed you need a code system resource to do validation operations on the code system. Not true?
Robert McClure (May 05 2020 at 00:35):
So you are saying that validate-code is special sauce done independent of the code system resource
Robert McClure (May 05 2020 at 00:35):
and it seems it is completely up to the TS creator to figure out?
Robert McClure (May 05 2020 at 00:36):
that does not sound very easy for others to implement
Michael Lawley (May 05 2020 at 00:36):
Well, the idea is that complexity is offloaded to the TS implementer
Michael Lawley (May 05 2020 at 00:37):
The expectation is that the number of code systems with grammars etc is small
Robert McClure (May 05 2020 at 00:38):
and the number of TS devs is also small
Robert McClure (May 05 2020 at 00:38):
and will forever be small ;-)
Robert McClure (May 05 2020 at 00:38):
given that UCUM, BCP47 are everywhere
Grahame Grieve (May 05 2020 at 00:40):
we have not tried to make a CodeSystem resource for BCP47, or UCUM etc. My server supports many code systems that I have not tried to make CodeSystem resources for.
"I'd have assumed you need a code system resource to do validation operations on the code system"
No. All the working terminology servers have some internal API that implements the code systems, and one internal implementation of the API takes a code system resource
"validate-code is special sauce done independent of the code system resource"
Not dependent on having one, no
"that does not sound very easy for others to implement"
no, but the question isn't 'how can it be easy to support such terminologies' since there'll never be an easy way. The question was, 'what's the least worst way to do it'. And having worked with some of our early attempts to define an uber-grammar that supports all the things people want to do, and looking ahead at the genomics ones, I knew that I wasn't going to even try for an uber-grammar in FHIR. Instead what we did was simple: define the CodeSystem resource to make sure the long tail is easily supported, and ensure that the terminology service API can handle the requirements of the complicated grammars, and then let servers decide which of those they were going to support
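A rough sketch of the "internal API" idea described above: one interface, with a generic implementation driven by a CodeSystem resource and separate hand-written implementations for grammars. All class names, URLs, and the toy unit grammar are assumptions made for illustration; this is not how tx.fhir.org or any particular server is actually structured, and the toy grammar is not UCUM.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator


class CodeSystemProvider(ABC):
    """Internal API that every supported code system implements."""

    @abstractmethod
    def validate(self, code: str) -> bool:
        ...


class ResourceBackedProvider(CodeSystemProvider):
    """Generic implementation driven by an enumerated CodeSystem resource."""

    def __init__(self, resource: dict) -> None:
        self._codes = set(self._walk(resource.get("concept", [])))

    @staticmethod
    def _walk(concepts: Iterable[dict]) -> Iterator[str]:
        # Concepts can nest, so collect codes recursively.
        for concept in concepts:
            yield concept["code"]
            yield from ResourceBackedProvider._walk(concept.get("concept", []))

    def validate(self, code: str) -> bool:
        return code in self._codes


class ToyUnitGrammarProvider(CodeSystemProvider):
    """Hand-written grammar: unit ('/' unit)*. A toy, NOT real UCUM."""

    UNITS = {"mg", "g", "kg", "mL", "dL", "L"}

    def validate(self, code: str) -> bool:
        return all(part in self.UNITS for part in code.split("/"))


colors = {
    "resourceType": "CodeSystem",
    "url": "http://example.org/colors",
    "concept": [{"code": "red", "concept": [{"code": "crimson"}]}],
}

# The server decides which systems it supports by what it registers here.
REGISTRY: Dict[str, CodeSystemProvider] = {
    colors["url"]: ResourceBackedProvider(colors),
    "http://example.org/toy-units": ToyUnitGrammarProvider(),
}


def validate_code(system: str, code: str) -> bool:
    provider = REGISTRY.get(system)
    if provider is None:
        raise LookupError(f"unsupported code system: {system}")
    return provider.validate(code)
```

The caller of `validate_code` cannot tell which kind of provider answered, which is the point: having a CodeSystem resource is one way to implement the API, not a precondition for it.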
Grahame Grieve (May 05 2020 at 00:41):
the internal mini-terminology server in the validator/publisher looks to see if there's a CodeSystem resource. If there is, it uses its internal implementation for $expand/$validate-code. Otherwise it hands it over to the terminology server
Robert McClure (May 05 2020 at 00:42):
My point is, perhaps we make it easy to bind to a known implementation of a validator service, versus hope there is an accessible TS that does it for you. I get the long-tail piece and that is exactly my point. Allow the binding to find a validator versus hope there is one inside your system.
Grahame Grieve (May 05 2020 at 00:42):
no I have no idea what you are talking about. What's the difference between a validator service and an accessible TS?
Michael Lawley (May 05 2020 at 00:44):
Perhaps the missing piece is TS federation?
Robert McClure (May 05 2020 at 00:44):
most implementations will use internal FHIR services. On occasion they will need this sort of external help. How do they link to the TS when needed? - I may call you
Rob Hausam (May 05 2020 at 00:47):
This is essentially what the idea of terminology services is all about. The complexity is handled by the "specialist software", making it much easier for applications generally to make use of terminology and do that correctly.
Grahame Grieve (May 05 2020 at 01:02):
What @Robert McClure is looking for is to use http://hl7.org/fhir/StructureDefinition/valueset-trusted-expansion on the binding, instead of just being allowed to do so on the value set. (we just talked). He's going to:
- make some test cases for the validator to get the ever-so tardy author of the validator to actually implement these things
- make a jira task to extend the extension for use on bindings as well as in value sets
Grahame Grieve (May 05 2020 at 01:12):
@Michael Lawley the problem with federated terminology for me is that the engine (+ internal API) running tx.fhir.org was never defined with federation in mind. The architecture assumes extremely low latency (e.g. direct code bindings) between the different code system providers. That's probably not a capability that we actually exercise, but I'd either have to rewrite my internal API, or somehow turn the internal highly granular API back to something coarse.
E.g. the internal expansion approach is something like
- 'here's a filter - can you handle that?'
- ok set up to be able to filter with this code/op/value
- then a voting protocol for which filter is going to drive the iteration and which other ones are just going to veto the iteration steps (iterate on the smallest overall set)
- then iterating, letting each filter remove things that don't meet the filter
- then assemble the expansion, hierarchically if possible (the iteration keeps track of this)
I can't federate that
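The iteration scheme described above can be sketched very roughly as follows. This is a simplification under stated assumptions: `PreparedFilter` and `expand` are invented names, the "vote" is reduced to picking the filter with the fewest candidates, and the "can you handle that?" negotiation and the hierarchical assembly step are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreparedFilter:
    name: str
    candidates: List[str]           # codes this filter would yield if it drove the iteration
    accepts: Callable[[str], bool]  # veto check used when another filter drives


def expand(filters: List[PreparedFilter]) -> List[str]:
    """Iterate over the smallest candidate set and let the other filters veto."""
    if not filters:
        return []
    # "Voting": the filter with the smallest candidate set drives the iteration.
    driver = min(filters, key=lambda f: len(f.candidates))
    vetoers = [f for f in filters if f is not driver]
    return [code for code in driver.candidates
            if all(v.accepts(code) for v in vetoers)]


all_codes = ["A1", "A2", "A10", "B1", "B2"]

starts_with_a = PreparedFilter(
    name="starts-with-A",
    candidates=[c for c in all_codes if c.startswith("A")],
    accepts=lambda c: c.startswith("A"),
)
ends_with_1 = PreparedFilter(
    name="ends-with-1",
    candidates=[c for c in all_codes if c.endswith("1")],
    accepts=lambda c: c.endswith("1"),
)

expansion = expand([starts_with_a, ends_with_1])
```

Every veto check here is a per-code call between filters, which hints at why this granularity assumes very low latency between code system providers and does not map well onto remote servers.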
Grahame Grieve (May 05 2020 at 01:13):
I do have experience with federating tx services, because I already do - I maintain the mini-terminology server inside the validator. And I know just where it's broken, and the edge cases that no one else has found yet. One day I'll have to figure out how to resolve those edge cases
Michael Lawley (May 05 2020 at 01:16):
That seems pretty low level to want to federate? Why not at the $expand / $validate-code level?
I'm thinking something like a fail-over approach (try internal and on "unsupported", try next federated server), and/or allow specified CS/VS to be routed directly to a federated TS
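A minimal sketch of the two ideas above, operating at the $validate-code level: explicit routing of named code systems to a chosen server, and fail-over to the next server when one reports "unsupported". The servers are modelled as plain in-process callables rather than HTTP endpoints, and `Unsupported`, `make_server`, and `validate_code` are names invented for this example.

```python
from typing import Callable, Dict, List, Optional


class Unsupported(Exception):
    """Raised by a server that does not implement the requested code system."""


Validator = Callable[[str, str], bool]  # (system, code) -> is the code valid?


def make_server(supported: Dict[str, Callable[[str], bool]]) -> Validator:
    """Build a stand-in terminology server from per-system check functions."""
    def validate(system: str, code: str) -> bool:
        if system not in supported:
            raise Unsupported(system)
        return supported[system](code)
    return validate


def validate_code(system: str, code: str, servers: List[Validator],
                  routes: Optional[Dict[str, Validator]] = None) -> bool:
    # Explicit routing wins: the named system goes straight to its server.
    if routes and system in routes:
        return routes[system](system, code)
    # Otherwise try each server in order, moving on only when it is unsupported.
    for server in servers:
        try:
            return server(system, code)
        except Unsupported:
            continue
    raise Unsupported(system)


internal = make_server({"http://example.org/colors": lambda c: c in {"red", "blue"}})
remote = make_server({"http://example.org/toy-grammar": lambda c: c.isdigit()})
```

Note that a "not valid" answer is returned as-is and does not trigger fail-over; only an explicit "unsupported" does.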
Grahame Grieve (May 05 2020 at 01:19):
well, the second is the trusted expansion source. The first... doesn't really federate Tx services - it just provides some fall back functionality
Michael Lawley (May 05 2020 at 02:31):
So, trusted-expansion points at a FHIR TS endpoint? eg https://genomics.ontoserver.csiro.au/fhir ?
Grahame Grieve (May 05 2020 at 02:42):
y
Michael Lawley (May 05 2020 at 03:28):
FHIR#27028 ticket to clarify this in the spec
Last updated: Apr 12 2022 at 19:14 UTC