FHIR Chat · common API · ontology

Stream: ontology

Topic: common API


view this post on Zulip Erich Schulz (Jun 02 2016 at 11:33):

I'm starting to think about reusable software components (see also the other thread in the implementers stream) and it seems that most of the interesting common operations will require injection of some kind of ontology service. This has me wondering if any thought has gone into defining a standard API for such a service?

view this post on Zulip Grahame Grieve (Jun 02 2016 at 11:34):

what do you think it would do, and why would FHIR define such a thing? surely that's a core ontology W3C thing?

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:34):

(I'm thinking to identify simple, common operations with no external dependencies to explore initially - but it should also be possible to work with an injected service)

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:36):

well a common operation would be to apply a mapping, e.g. given a problem list with mainly SNOMED codes, generate a list in ICD-10 codes

view this post on Zulip Michael van der Zel (Jun 02 2016 at 11:36):

SPARQL-like?

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:36):

also classify the problems by body system...

view this post on Zulip Grahame Grieve (Jun 02 2016 at 11:37):

Erich, you should start by reading the terminology service

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:37):

maybe @Michael van der Zel ...

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:38):

is that a FHIR thing @Grahame Grieve ?

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:38):

(sorry, appreciate this is a bit of a noob question)

view this post on Zulip Grahame Grieve (Jun 02 2016 at 11:38):

yes http://hl7-fhir.github.io/terminology-service.html

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:39):

bingo! thanks @Grahame Grieve

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:41):

i'd looked quickly at the https://www.hl7.org/fhir/valueset.html and related resource but had missed this page

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:43):

is there a list of implementations ?

view this post on Zulip Grahame Grieve (Jun 02 2016 at 11:45):

my server. Ontoserver, Apelon. IMO. NLM. We're starting to prepare for certification of the services

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:49):

wow

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:53):

ok so then (thinking in javascript sorry) it should be possible to define an API as a lightweight wrapper around this (is it REST?) service and then this service could be an injectable dependency into a library that performs "common simple operations" on FHIR data... ?

view this post on Zulip Grahame Grieve (Jun 02 2016 at 11:53):

yep

view this post on Zulip Erich Schulz (Jun 02 2016 at 11:53):

sweeet!
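A minimal sketch of the injectable-dependency idea discussed above, written in Python rather than JavaScript. Everything here is an assumption for illustration: the `TerminologyService` interface, its `subsumes` method, the `filter_by_ancestor` library function and the toy codes are hypothetical names, not part of FHIR or of any server mentioned in this thread. The in-memory class stands in for a thin wrapper around a real REST terminology service.

```python
from typing import Iterable, List, Protocol, Set, Tuple


class TerminologyService(Protocol):
    """Hypothetical interface a library could depend on (not a FHIR-defined API)."""

    def subsumes(self, system: str, ancestor: str, descendant: str) -> bool: ...


class InMemoryTerminologyService:
    """Stand-in for a REST-backed client, built from explicit (child, parent) pairs."""

    def __init__(self, is_a: Set[Tuple[str, str]]) -> None:
        self._is_a = is_a

    def subsumes(self, system: str, ancestor: str, descendant: str) -> bool:
        # Walk up from the descendant until the ancestor is found or parents run out.
        frontier, seen = [descendant], set()
        while frontier:
            code = frontier.pop()
            if code == ancestor:
                return True
            if code in seen:
                continue
            seen.add(code)
            frontier.extend(p for c, p in self._is_a if c == code)
        return False


def filter_by_ancestor(
    codes: Iterable[str], ancestor: str, system: str, tx: TerminologyService
) -> List[str]:
    """Library function: keep the codes subsumed by `ancestor`, using the injected service."""
    return [c for c in codes if tx.subsumes(system, ancestor, c)]
```

The library function only sees the interface, so a test double, a local server or a remote service could be swapped in without changing the calling code.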

view this post on Zulip Peter Jordan (Jun 02 2016 at 21:23):

I look forward to seeing the certification criteria for FHIR-based Terminology Services. Will the Connectathon Tests form the basis of these requirements?

view this post on Zulip Grahame Grieve (Jun 02 2016 at 21:24):

well, that's the process that will lead towards the certification tests

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:33):

I had a read of http://hl7-fhir.github.io/terminology-service.html now

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:34):

it gets a bit wooly around the "closure" table

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:34):

(we used to call them "ancestor tables")

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:36):

I guess I should have a hard poke at the ontoserver - then I may be up for a few patches

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:36):

closure we haven't tested yet. Getting there

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:38):

idea is for client to incrementally build its own table holding all possible is-a links between a set of "codes of interest"?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:44):

yes, client knows the table, but not the grounds on which it is built - that's what the terminology server knows. My server supports closure, but it's the only one at this time

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:45):

ah

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:45):

so this depends on the server keeping track of which links it has let the client know about?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:46):

y
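A toy sketch of the incremental pattern described in the exchange above: the server remembers which codes a client has submitted for a named table and returns only the is-a pairs it has not reported before. This is an illustration under assumptions, not the wire format of the FHIR closure operation; the class `ToyClosureServer`, the method `add_codes` and the codes are all hypothetical.

```python
class ToyClosureServer:
    """Holds the is-a knowledge; clients only ever see (descendant, ancestor) pairs."""

    def __init__(self, parent_links):
        self._parents = {}
        for child, parent in parent_links:
            self._parents.setdefault(child, set()).add(parent)
        self._seen = {}  # closure table name -> codes submitted so far

    def _ancestors(self, code):
        # All transitive ancestors of a code, excluding the code itself.
        found, stack = set(), list(self._parents.get(code, ()))
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self._parents.get(current, ()))
        return found

    def add_codes(self, name, new_codes):
        """Register codes for a named table and return only pairs not reported before."""
        seen = self._seen.setdefault(name, set())
        new_pairs = set()
        for code in new_codes:
            if code in seen:
                continue
            ancestors = self._ancestors(code)
            for other in seen:
                if other in ancestors:
                    new_pairs.add((code, other))
                elif code in self._ancestors(other):
                    new_pairs.add((other, code))
            seen.add(code)
        return new_pairs
```

The per-table `_seen` state is exactly the bookkeeping cost raised in the next few messages: it grows with the number of clients and tables, and the server has to decide when to discard it.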

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:46):

that could get expensive with a lot of clients...

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:47):

well, up to the server to decide how to manage that. I'll let users fill up my database, and then I'll just wipe it ;-)

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:47):

heh

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:48):

of course if a user runs my server locally, they can have whatever policy they want

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:50):

btw are we sure "closure" is the correct term?

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:51):

https://en.wikipedia.org/wiki/Closure_(computer_programming)

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:51):

yes. see http://dirtsimple.org/2010/11/simplest-way-to-do-tree-based-queries.html

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:54):

yeah i read that link

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:54):

actually this seems the derivation

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:54):

https://en.wikipedia.org/wiki/Transitive_closure

view this post on Zulip Grahame Grieve (Jun 03 2016 at 10:54):

y

view this post on Zulip Peter Jordan (Jun 03 2016 at 10:58):

At the Montreal Connectathon, Caroline Macumber from Apelon suggested that they may have implemented the full closure operation, but was going to check with the relevant developer. I started a while back, but wasn't comfortable that I understood the complete use case - notably around notifying and updating clients when the server rebuilds a transitive closure table after a new version of the code system is implemented. A UML Sequence Diagram might be useful; I re-read the spec and Grahame's blog entry several times - but something seemed to be missing.

view this post on Zulip Erich Schulz (Jun 03 2016 at 10:59):

so it seems that "closure" actually just means the property of generating a finite set (in mathematics)

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:00):

the key element is that this is based on the transitive is-a links...

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:02):

to be honest I'm thinking that supporting incremental creation of these "is-a closure tables" (?? IsACT ??) via client-server operation is "in the 80%"

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:03):

but I can certainly see the utility in the base operation of "give me the subset of is-a links that involve members both in a given set"

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:03):

? anyone who wants to use a terminology server to handle all their terminology logic will be led kicking and screaming to maintaining a closure table

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:04):

take hierarchy table => explode

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:04):

sorry, anyone who maintains a relational database who....

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:06):

it's a simple enough operation...
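The "take hierarchy table => explode" step can be sketched as follows. This is a minimal in-memory version that assumes the hierarchy arrives as direct (child, parent) pairs; the function name and the input shape are illustrative, not taken from any tool mentioned in the thread.

```python
from collections import defaultdict


def transitive_closure(parent_links):
    """Expand direct (child, parent) is-a links into all (descendant, ancestor) pairs."""
    parents = defaultdict(set)
    for child, parent in parent_links:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue  # already handled; also guards against cycles
            seen.add(ancestor)
            closure.add((start, ancestor))
            stack.extend(parents.get(ancestor, ()))
    return closure
```

With the closure materialised, a subsumption test becomes a single membership check such as `("c", "a") in closure`, which is the trade of storage for query speed discussed in this thread.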

view this post on Zulip Peter Jordan (Jun 03 2016 at 11:09):

I persist SNOMED CT in SQL Server and have a transitive closure table; but, to date, profiling and execution plans show no discernible performance difference between the table-valued function that I used for subsumption queries using the core tables, and the one that uses the closure table. However, I suspect that might change when the server is under a heavy load.

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:12):

the key difference is that the closure table built this way is capable of dealing with post-coordination etc

view this post on Zulip Rob Hausam (Jun 03 2016 at 11:12):

@Peter Jordan can you describe your "table-valued function that I used for subsumption queries using the core tables" vs. the "closure table"?
I'm not quite sure what the former one is - it sort of sounds like a "closure table" under the hood (or bonnet):)

view this post on Zulip Peter Jordan (Jun 03 2016 at 11:13):

That makes sense, but I don't support post-co-ordination...yet.

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:13):

@Grahame Grieve - ok that isn't as simple :-)

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:14):

it is for the client, that's the key - it's just a code. It's not interested in the internal details. code, closure table, whatever...

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:14):

how common is post-coordination out there in the real world?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:15):

uncommon. mostly because of the closure table problem.

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:15):

because everyone pre-generates their closure tables, if they have them, and then they can't deal with post-coordination

view this post on Zulip Rob Hausam (Jun 03 2016 at 11:16):

we especially haven't explored the post-coordination aspects yet, as far as I know
and I was going to say, I think that presumes that the post-coordinated expression will be assigned an identifier (on the fly)? - which is the SNOMED CT idea of the "expression library"

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:16):

not necessary for the client. Or the API. Server might decide to do that for itself, but that's its business

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:17):

does the ontoserver do a basic closure (still think a better name is needed) operation?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:18):

don't know whether Michael and the team have done that

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:19):

i'm getting curious about how long it would take my PC (2 years old with 16G ram) to generate an ancestor table on snomed...

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:19):

and then to get postgres to load it...

view this post on Zulip Grahame Grieve (Jun 03 2016 at 11:20):

depends. generating the closure table for sct-us purely in ram takes me 45 seconds

view this post on Zulip Peter Jordan (Jun 03 2016 at 11:20):

@Rob Hausam, I'm not sure if you're familiar with table-valued functions in SQL Server - they're a special type of stored procedure that return a table. The relevant logic uses self-joins on the relationship table, I'd be quite happy to send it to you. Because it uses a larger table than the 2-column transitive closure one, I suspect that it (and/or the relevant indexes) might take up more memory so might be more likely to slow down when SQL Server is nearing its memory peak.

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:23):

gad I'm not getting any work done...

view this post on Zulip Rob Hausam (Jun 03 2016 at 11:23):

yes, I am somewhat familiar with them in general, Peter - but I've tended to do most of my db work in Oracle
I would like to have a look at it

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:24):

this is too interesting

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:24):

we used to use oracle in the 90's because it had connect by

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:25):

which was way too slow compared with a "transitive closure" table

view this post on Zulip Peter Jordan (Jun 03 2016 at 11:26):

Using the Perl script supplied by IHTSDO, it takes less than a minute to generate the closure table for the international edition of SCT - on my 64 bit server with lots of RAM. It takes at least 5 mins on my 32 bit test machine.

view this post on Zulip Rob Hausam (Jun 03 2016 at 11:26):

yes, I think "connect by" is a great feature - I've made extensive use of it for this sort of thing
like building a "closure table" (although I didn't call it that at first) on the fly ("just in time")

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:27):

but looking at this... http://hl7-fhir.github.io/terminology-service.html @Grahame Grieve I am thinking the "incremental build" component of the "closure" operation just seems erm non-core ?

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:28):

I certainly see the generation of the initial table as a core feature tho...

view this post on Zulip Rob Hausam (Jun 03 2016 at 11:30):

@Erich Schulz yes, the pre-computed full closure table is significantly faster than doing a hierarchical query each time - that's why we adopted the caching approach which builds the closure table as needed, rather than pre-computing all of it, most of which will never be used
just a different approach

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:30):

yes I can see the rationale for serving a subset

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:33):

it was the incremental building of subsets I was questioning...

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:35):

(at least as part of core API )

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:36):

mainly because it imposes a burden on the server to track its previous messages to clients in a pattern that may not scale awfully well and introduces a bunch of complexities around "time to live"

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:53):

I'm having a look at ontoserver @Grahame Grieve - it looks way more cut-down than http://hl7-fhir.github.io/terminology-service.html

view this post on Zulip Peter Jordan (Jun 03 2016 at 11:53):

The transitive closure table generated from the 20160131 snapshot version of the SCT International Edition has 5,470,090 rows!

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:54):

so if it was stored as a pair of 64-bit integers...

view this post on Zulip Erich Schulz (Jun 03 2016 at 11:54):

that is ~25M
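As a rough back-of-envelope check on the storage estimate, counting raw column data only and ignoring index and per-row overhead: 5,470,090 rows of two 8-byte identifiers come to 87,521,440 bytes (about 87.5 MB), and the same rows with two 4-byte values come to 43,760,720 bytes (about 43.8 MB). The helper name below is illustrative.

```python
ROWS = 5_470_090  # row count quoted above for the 20160131 International Edition


def table_bytes(rows, bytes_per_id):
    # Two identifier columns per row; no index or per-row overhead included.
    return rows * 2 * bytes_per_id


print(table_bytes(ROWS, 8) / 1e6)  # megabytes with two 64-bit integers per row
print(table_bytes(ROWS, 4) / 1e6)  # megabytes with two 32-bit values per row
```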

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:00):

I don't think ontoserver is much cut down. It does everything that we've worked through so far

view this post on Zulip Peter Jordan (Jun 03 2016 at 12:01):

Storage is cheap, it's all about memory. The table has 2 columns of SCTIDs - IHTSDO recommend that these are persisted as 64 bit integers, but as they aren't true numbers I store them as chars and I know others who do likewise.

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:01):

I think that the incremental closure table is worth having. it is more work for the server, but we've always said that's a good deal for the client

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:02):

I store the closure table as an array of pairs of 4 byte unsigned ints, where each 4 bytes is a lookup into an array of string values that represent the codes
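One way to picture the layout described above, as a sketch only: codes are interned once into a list of strings, and the closure is a flat array of unsigned integer indexes, two per pair. The class and method names are hypothetical and this is not the code of any server in this thread. Python's `array("I")` items are typically 4 bytes, but the size is platform-dependent.

```python
from array import array


class PackedClosure:
    """Closure pairs stored as integer indexes into a shared list of code strings."""

    def __init__(self):
        self._codes = []          # index -> code string
        self._index = {}          # code string -> index
        self._pairs = array("I")  # flat layout: descendant idx, ancestor idx, ...

    def _intern(self, code):
        # Assign each distinct code string one index, reused on later calls.
        if code not in self._index:
            self._index[code] = len(self._codes)
            self._codes.append(code)
        return self._index[code]

    def add(self, descendant, ancestor):
        self._pairs.extend((self._intern(descendant), self._intern(ancestor)))

    def pairs(self):
        p = self._pairs
        return [(self._codes[p[i]], self._codes[p[i + 1]]) for i in range(0, len(p), 2)]
```

Because the entries are plain strings, the same layout would hold a post-coordinated expression as easily as a pre-coordinated identifier, which fits the point that to the client "it's just a code".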

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:02):

are many systems doing this currently?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:02):

but I don't use a database for this.

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:03):

not many, but all the terminology services have had to do something about this problem in order to support integrated search across terminologies and other things

view this post on Zulip Peter Jordan (Jun 03 2016 at 12:03):

...but what's the trigger for clients to request updates when the server refreshes the closure table?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:04):

given it is a 45 second operation to rebuild the entire thing and releases come out once a month? (max) I'm struggling to see a high ROI...

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:05):

because you can't build it in advance

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:05):

(just to emphasise, I am talking about incremental builds... not subsetting... subsetting is gold)

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:05):

unless you prohibit post-coordination. which everyone does, but it cripples the tx

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:06):

do you have link for ontoserver subsumption test?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:07):

the $validate option... what link do you want?

view this post on Zulip Peter Jordan (Jun 03 2016 at 12:08):

Things will be trickier if/when IHTSDO move to more frequent SCT releases.

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:08):

mmm

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:09):

it's a solvable problem. I've got to write a client to help the ontoserver guys test their closure table

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:09):

i guess I'm just flagging 80:20 situation...

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:09):

perhaps an opportunity to have an initial simple API definition, then a second wave?

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:10):

then you don't follow what the 80:20 is about.

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:13):

?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:13):

http://www.clipular.com/c/4745096286699520.png?k=oAARjjxSNY7fDU3cgP6HeFBTC2E

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:14):

all the terminology servers implement some kind of feature for managing a closure table

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:14):

sure

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:14):

I can only repeat...

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:15):

transitive closure table = gold

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:16):

incremental client-server building of the TCT is on a different plane

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:16):

not saying its bad...

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:16):

then you don't need to consume it. nor worry about it.

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:17):

just saying that today it's only implemented by a single server...

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:18):

sorry not trying to be difficult

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:19):

the concern is I go looking for the resource on the ontoserver and it isn't there

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:19):

even the base resource as far as I can see

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:19):

which base resource?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:19):

for a transitive closure table

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:20):

you mean ConceptMap?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:21):

they have the transitive closure table expressed with conceptmap??

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:21):

concept map is part of it, but I wasn't sure what you meant. But they haven't implemented closure API yet

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:21):

k

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:22):

so what I'm thinking is if the base spec is simple then that implementation can occur more rapidly

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:22):

because I can write a script to make my own

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:22):

well, simple problems are simple. yes. And a client can just do subsumption testing directly, no problems

view this post on Zulip Peter Jordan (Jun 03 2016 at 12:26):

...and store the results each time it requests a new test...and (possibly) be aware (somehow) when the server loads a new version of the code system that might make its cached results redundant. Which begs the question, why does the server need to maintain a record of client closures?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:29):

i'm thinking it could be useful to have a table of the current servers and the services they provide...

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:29):

if the server doesn't know what subset the client is dealing with, it must return the closure for everything, but everything is not finite

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:30):

yes that is true

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:30):

so somehow the client needs to identify the subset it is interested in

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:30):

i agree that is core functionality

view this post on Zulip Grahame Grieve (Jun 03 2016 at 12:31):

well, it can build it gradually, or it can accelerate the process

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:31):

what I'm suggesting may not be core is incremental expansion of the TCT...

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:31):

the key words being "incremental expansion"

view this post on Zulip Peter Jordan (Jun 03 2016 at 12:40):

I'm still missing something here. Although I fully understand why a client needs to persist the results of individual subsumption queries, I can't see the need for a server to (effectively) maintain a record of those results - what value does that add, e.g. does it make it any easier for the client to be made aware when those results may be superseded by a code system update? What's the value proposition for this additional complexity?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:41):

it would save some bandwidth...

view this post on Zulip Rob Hausam (Jun 03 2016 at 12:49):

the key point is how the client becomes aware that it needs to rebuild its stored transitive closure subset, because the code system has been updated

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:49):

an expiry date? or a "last-updated" service?

view this post on Zulip Erich Schulz (Jun 03 2016 at 12:59):

there are also tools like rsync and git that eat incremental updating for breakfast

view this post on Zulip Erich Schulz (Jun 03 2016 at 13:00):

building this functionality into the core services seems to violate the "do one thing well" principle

view this post on Zulip Peter Jordan (Jun 03 2016 at 21:20):

@Rob Hausam that key point applies to all use cases where a client persists the results of SNOMED CT queries. @Erich Schulz - SNOMED CT updates can either add or inactivate concepts and relationships - versioning is based on the active/inactive status at the specified release date (effective time). Concepts can be activated, deactivated and reactivated - but never deleted.

From a client perspective, I'd persist the relevant CodeSystem version and periodically check it with the one returned by the Terminology Server. When there's a new version, I'd refresh all the persisted query results - subsumptions or otherwise. Therefore, from an EHR/EMR service perspective, I still don't see a use case for a separate, and distinct, process for closure subsets. However, it would be informative to understand the requirements of other categories of terminology service client.
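The client-side approach described above (persist the code system version, check it periodically, refresh every stored result when it changes) might look roughly like this. It is a sketch under assumptions: `fetch_version` and `run_query` are hypothetical callables standing in for whatever calls a client makes to its terminology server, and the class name is illustrative.

```python
class VersionedResultCache:
    """Caches query results and re-runs them all when the code system version changes."""

    def __init__(self, fetch_version, run_query):
        self._fetch_version = fetch_version  # returns the server's current version string
        self._run_query = run_query          # runs one query against the server
        self._version = None
        self._results = {}

    def get(self, query):
        if self._version is None:
            self._version = self._fetch_version()
        if query not in self._results:
            self._results[query] = self._run_query(query)
        return self._results[query]

    def refresh_if_stale(self):
        """Call periodically; returns True if a new version triggered a refresh."""
        current = self._fetch_version()
        if current == self._version:
            return False
        self._version = current
        self._results = {q: self._run_query(q) for q in list(self._results)}
        return True
```

In this design the server keeps no per-client state; the cost is that the client re-runs every stored query after each release rather than receiving only the changes.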

view this post on Zulip Rob Hausam (Jun 03 2016 at 22:03):

I agree, Peter. I think your suggestion is reasonable, and similar to what I've implemented before (although with an "internal" terminology server, rather than an external API). I think you may be right about the need, and the means of achieving it. Yes, we need to consider all (known) client perspectives. I've been intending to spend some time looking at $closure in greater depth, and this discussion is good impetus for doing that.


Last updated: Apr 12 2022 at 19:14 UTC